classification
Title: mimetypes.init() raises unhandled exception on Windows
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: mimetypes initialization fails on Windows because of non-Latin characters in registry (issue 9291)
Assigned To:
Nosy List: adamhj, r.david.murray
Priority: normal
Keywords:

Created on 2013-11-13 05:20 by adamhj, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (2)
msg202731 - (view) Author: adamhj (adamhj) Date: 2013-11-13 05:20
My system is Windows Server 2003 SP2 and the Python version is 2.7.6.

I found this bug when trying to install the newest setuptools:

------------------------------------------------------------------------
X:\xxx>ez_setup.py
Extracting in d:\docume~1\xxx\locals~1\temp\tmpcyxs8s
Now working in d:\docume~1\xxx\locals~1\temp\tmpcyxs8s\setuptools-1.3.2
Installing Setuptools
Traceback (most recent call last):
  File "setup.py", line 17, in <module>
    exec(init_file.read(), command_ns)
  File "<string>", line 8, in <module>
  File "d:\docume~1\xxx\locals~1\temp\tmpcyxs8s\setuptools-1.3.2\setuptools\__init__.py", line 11, in <module>
    from setuptools.extension import Extension
  File "d:\docume~1\xxx\locals~1\temp\tmpcyxs8s\setuptools-1.3.2\setuptools\extension.py", line 5, in <module>
    from setuptools.dist import _get_unpatched
  File "d:\docume~1\xxx\locals~1\temp\tmpcyxs8s\setuptools-1.3.2\setuptools\dist.py", line 15, in <module>
    from setuptools.compat import numeric_types, basestring
  File "d:\docume~1\xxx\locals~1\temp\tmpcyxs8s\setuptools-1.3.2\setuptools\compat.py", line 19, in <module>
    from SimpleHTTPServer import SimpleHTTPRequestHandler
  File "D:\Python27\lib\SimpleHTTPServer.py", line 27, in <module>
    class SimpleHTTPRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  File "D:\Python27\lib\SimpleHTTPServer.py", line 208, in SimpleHTTPRequestHandler
    mimetypes.init() # try to read system mime.types
  File "D:\Python27\lib\mimetypes.py", line 358, in init
    db.read_windows_registry()
  File "D:\Python27\lib\mimetypes.py", line 258, in read_windows_registry
    for subkeyname in enum_types(hkcr):
  File "D:\Python27\lib\mimetypes.py", line 249, in enum_types
    ctype = ctype.encode(default_encoding) # omit in 3.x!
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 33: ordinal not in range(128)
Something went wrong during the installation.
See the error message above.
------------------------------------------------------------------------

Looking into the code, I see that the exception is raised in this function:

------------------------------------------------------------------------
        def enum_types(mimedb):
            i = 0
            while True:
                try:
                    ctype = _winreg.EnumKey(mimedb, i)
                except EnvironmentError:
                    break
                try:
                    ctype = ctype.encode(default_encoding) # omit in 3.x!
                except UnicodeEncodeError:
                    pass
                else:
                    yield ctype
                i += 1
------------------------------------------------------------------------

I checked my registry: there is a key in HKCR whose name is in Chinese (GBK-encoded), which causes this problem. (By the way, default_encoding is 'ascii', assigned from sys.getdefaultencoding().)

I don't know whether it is legal to use a non-ASCII string as the name of a registry key in Windows, but I think there is a problem in this piece of code. Why does the variable ctype need to be encoded here? I checked the _winreg.EnumKey() function, and it returns a byte string:

>>> _winreg.EnumKey(key,8911)
'\xbb\xad\xb0\xe5\xa1\xa3\xce\xc4\xb5\xb5'
# this is the problem key which causes the exception

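So Python tries to encode it with the ascii codec. Because ctype is already a byte string, Python 2 first decodes it implicitly with the default encoding, and that step raises a UnicodeDecodeError, which the except UnicodeEncodeError clause does not handle.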
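
The failure mode described above can be reproduced roughly as follows. This is only a sketch written for Python 3, where bytes objects have no encode() method, so the helper py2_str_encode is a hypothetical stand-in that emulates the implicit decode-then-encode Python 2 performs on a byte string; try_encode is likewise an illustrative name, not code from mimetypes.py.

```python
# Key name copied from the interactive session in the report.
raw = b'\xbb\xad\xb0\xe5\xa1\xa3\xce\xc4\xb5\xb5'


def py2_str_encode(byte_string, encoding="ascii", default_encoding="ascii"):
    """Hypothetical emulation of Python 2 str.encode() on a byte string:
    an implicit decode with the default encoding, then the requested encode."""
    return byte_string.decode(default_encoding).encode(encoding)


def try_encode(byte_string):
    """Mirror the try/except shape used in enum_types."""
    try:
        return py2_str_encode(byte_string)
    except UnicodeEncodeError:
        # The only error the quoted code anticipates.
        return None


try:
    try_encode(raw)
    escaped = None
except UnicodeDecodeError as exc:
    # The implicit decode fails first, and this error is not caught above.
    escaped = exc

print(type(escaped).__name__)
# UnicodeDecodeError is a sibling of UnicodeEncodeError, not a subclass.
print(issubclass(UnicodeDecodeError, UnicodeEncodeError))
# Both share UnicodeError as their common base class.
print(issubclass(UnicodeDecodeError, UnicodeError))
# The same bytes decode cleanly when read as GBK.
print(len(raw.decode("gbk")))
```

An ASCII-only name such as b"text/plain" passes through try_encode unchanged, which is why the problem only shows up on systems with non-ASCII key names.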

Shouldn't we just remove the try..encode..except block?
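
As an alternative to removing the block, one possible approach is to catch UnicodeError, the common base class of both encode and decode errors, and skip names that cannot be represented. The sketch below is not the fix adopted in CPython; enum_ascii_names and the sample list are hypothetical, and a plain list stands in for _winreg.EnumKey() so the example runs on any platform under Python 3.

```python
def enum_ascii_names(names, default_encoding="ascii"):
    """Yield only the names representable in default_encoding.

    Hypothetical helper: `names` stands in for the sequence of subkey
    names that repeated _winreg.EnumKey() calls would produce.
    """
    for name in names:
        try:
            if isinstance(name, bytes):
                # Byte strings must decode cleanly.
                name.decode(default_encoding)
            else:
                # Text strings must encode cleanly.
                name.encode(default_encoding)
        except UnicodeError:
            # Covers both UnicodeDecodeError and UnicodeEncodeError.
            continue
        yield name


sample = [
    b".txt",
    b'\xbb\xad\xb0\xe5\xa1\xa3\xce\xc4\xb5\xb5',  # key name from the report
    u".py",
    u"\u6587\u6863",  # non-ASCII text name
]
usable = list(enum_ascii_names(sample))
print(usable)
```

Skipping such names silently drops those registry entries from the MIME database, so whether this trade-off is acceptable is a design question for the superseding issue.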
msg202747 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-11-13 13:46
This is a duplicate of issue 9291.  Can you answer Victor's question over there for us?
History
Date                 User            Action  Args
2022-04-11 14:57:53  admin           set     github: 63766
2013-11-13 13:46:26  r.david.murray  set     status: open -> closed
                                             superseder: mimetypes initialization fails on Windows because of non-Latin characters in registry
                                             nosy: + r.david.murray
                                             messages: + msg202747
                                             resolution: duplicate
                                             stage: resolved
2013-11-13 05:20:58  adamhj          create