Author vstinner
Recipients Dmitry.Jemerov, Roman.Evstifeev, Vladimir Iofik, aclover, brian.curtin, eric.araujo, frankoid, kaizhu, r.david.murray, vldmit, vstinner
Date 2012-12-06.14:53:31
Message-id <1354805611.38.0.155250764983.issue9291@psf.upfronthosting.co.za>
Content
>   File "c:\Python27\lib\mimetypes.py", line 250, in enum_types
>    ctype = ctype.encode(default_encoding) # omit in 3.x!
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

The encoding is wrong. We should read the registry using Unicode, or at least use the correct encoding. The correct encoding is the ANSI code page: sys.getfilesystemencoding().

Can you please try with default_encoding = sys.getfilesystemencoding()?
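
A minimal sketch of the decoding step behind the traceback above, written for Python 3 so it can be run anywhere. The byte string is a made-up registry value, decode_registry_name() is a hypothetical helper, and "cp1252" only stands in for the ANSI code page (on Windows with Python 2.7, sys.getfilesystemencoding() reports "mbcs"):

```python
# Hypothetical registry key name as raw ANSI bytes; 0xe0 is outside ASCII.
raw = b"\xe0pplication/x-example"


def decode_registry_name(raw_bytes, encoding):
    """Decode raw_bytes with the given encoding; return None if it fails."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


# ASCII cannot decode byte 0xe0, which is the failure shown in the traceback.
ascii_result = decode_registry_name(raw, "ascii")

# "cp1252" is an assumption standing in for the machine's ANSI code page.
ansi_result = decode_registry_name(raw, "cp1252")
```

The point is only that the bytes decode cleanly once the encoding matches the one the registry data was written in.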

> python 3.1.2 mimetypes initialization also fails in redhat linux: (...)

In Python 3.3, MimeTypes.read() opens files in UTF-8. Issue #13025 explains why UTF-8 is used instead of the locale encoding, or another encoding.

I see that read_mime_types() uses the locale encoding. That looks like a bug: it should also use UTF-8.
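
A rough sketch of what reading a mime.types file with an explicit UTF-8 encoding could look like. read_mime_types_utf8() is a hypothetical helper, not the stdlib function, and the file content is made up for the demonstration:

```python
import mimetypes
import os
import tempfile


def read_mime_types_utf8(path):
    """Return an {extension: type} dict read from path, decoded as UTF-8."""
    db = mimetypes.MimeTypes()
    # Pass the encoding explicitly instead of relying on the locale encoding.
    with open(path, encoding="utf-8") as fp:
        db.readfp(fp, strict=True)
    # types_map is a (non-standard, standard) pair; index True selects the
    # standard types, which is where strict=True entries are stored.
    return db.types_map[True]


# Demonstration with a temporary file that contains a non-ASCII comment.
fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write("application/x-example ex1  # d\u00e9mo\n")
try:
    result = read_mime_types_utf8(path)
finally:
    os.remove(path)
```

With the encoding fixed to UTF-8, the result no longer depends on the locale of the machine that reads the file.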