Title: mimetypes should read the rule file using UTF-8, not the locale encoding
Author: STINNER Victor (vstinner) Date: 2011-09-20 23:07
On Debian and Ubuntu, the /etc/mime.types file is pure ASCII, but on Fedora 15 it contains a non-ASCII character, ³ (U+00B3), in the line:
"application/vnd.geocube+xml                     g3 g³"

And the file is encoded in UTF-8.

That's why Python should read this file as UTF-8 instead of using the locale encoding, because the locale encoding can be ASCII. The attached patch implements this idea.

I think that it is a bug and so it should also be fixed in Python 3.2.

(Python 2.7 reads the file in binary mode, so it doesn't care about the encoding.)
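
A minimal sketch of the failure mode, not the attached patch: it writes the Fedora line to a temporary file as UTF-8, then reads it back once as ASCII (standing in for a locale whose encoding is ASCII) and once as UTF-8. The helper name read_rules and the temporary file are assumptions made for illustration only.

```python
import os
import tempfile

# The line from Fedora 15's /etc/mime.types quoted above.
LINE = "application/vnd.geocube+xml                     g3 g³\n"


def read_rules(path, encoding):
    """Hypothetical helper: read a rule file with an explicit encoding."""
    with open(path, encoding=encoding) as fp:
        return fp.read()


# Write the sample line as UTF-8, the way Fedora ships the file.
fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w", encoding="utf-8") as fp:
    fp.write(LINE)

# "ascii" stands in for an ASCII locale encoding (for example LANG=C).
try:
    read_rules(path, "ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Reading as UTF-8 does not depend on the locale.
utf8_text = read_rules(path, "utf-8")
os.remove(path)
```

With an explicit encoding the result is the same on every machine, which is the behaviour this issue asks for.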
Author: Éric Araujo (eric.araujo) Date: 2011-09-23 16:42
+1.  I’ve finally understood that open using the locale is Evil™.  Please use the file from Fedora in a test.
Author: Roundup Robot (python-dev) Date: 2011-10-14 01:04
New changeset 8d8ab3e04363 by Victor Stinner in branch '3.2':
Issue #13025: mimetypes is now reading MIME types using the UTF-8 encoding,

New changeset 2c223d686feb by Victor Stinner in branch 'default':
(Merge 3.2) Issue #13025: mimetypes is now reading MIME types using the UTF-8
Author: STINNER Victor (vstinner) Date: 2011-10-14 01:04
> Please use the file from Fedora in a test.

Author: Serhiy Storchaka (serhiy.storchaka) Date: 2020-06-20 09:11
However, read_mime_types() still uses the locale encoding. See issue41048.
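
Until that is addressed, one possible workaround (a sketch, not something proposed in this thread) is to open the file yourself with encoding="utf-8" and pass the file object to MimeTypes.readfp(). The function name read_mime_types_utf8 and the temporary sample file are hypothetical.

```python
import mimetypes
import os
import tempfile


def read_mime_types_utf8(path):
    """Hypothetical stand-in for read_mime_types() that always decodes UTF-8.

    Returns a dict mapping extensions (with the leading dot) to MIME types.
    """
    db = mimetypes.MimeTypes()
    with open(path, encoding="utf-8") as fp:
        db.readfp(fp, True)  # True: record entries as "strict" types
    return db.types_map[True]


# Sample file containing the non-ASCII extension from Fedora's mime.types.
fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w", encoding="utf-8") as fp:
    fp.write("application/vnd.geocube+xml                     g3 g³\n")

types = read_mime_types_utf8(path)
os.remove(path)

geocube_type = types[".g³"]
```

The returned dict also contains the module's default types, since MimeTypes() starts from the built-in table before reading the file.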
