
Title: mimetypes should read the rule file using UTF-8, not the locale encoding
Type: behavior Stage: test needed
Components: Library (Lib), Unicode Versions: Python 3.2, Python 3.3
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eric.araujo, python-dev, sandro.tosi, serhiy.storchaka, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2011-09-20 23:07 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

File name                 Uploaded
mimetypes_encoding.patch  vstinner, 2011-09-20 23:07
Messages (5)
msg144357 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-09-20 23:07
On Debian and Ubuntu, /etc/mime.types file is pure ASCII, but on Fedora 15 it contains a non-ASCII character, ³ (U+00B3), in the line:
"application/vnd.geocube+xml                     g3 g³"

And the file is encoded in UTF-8.

That's why Python should read this file as UTF-8 instead of using the locale encoding, because the locale encoding can be ASCII. The attached patch implements this idea.

I think that it is a bug and so it should also be fixed in Python 3.2.

(Python 2.7 reads the file in binary mode, so it doesn't care about the encoding.)
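
To illustrate the failure mode described above, here is a minimal sketch. It is not the attached patch: it writes the quoted Fedora line to a temporary file (a hypothetical stand-in for /etc/mime.types) and passes encoding="ascii" explicitly to simulate a locale whose encoding is ASCII.

```python
import os
import tempfile

# The line quoted above from Fedora 15's /etc/mime.types.
LINE = "application/vnd.geocube+xml                     g3 g\u00b3\n"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for /etc/mime.types, stored as UTF-8.
    path = os.path.join(tmp, "mime.types")
    with open(path, "wb") as f:
        f.write(LINE.encode("utf-8"))

    # Assumption: encoding="ascii" stands in for a locale encoding of ASCII.
    try:
        with open(path, encoding="ascii") as f:
            f.read()
        ascii_ok = True
    except UnicodeDecodeError:
        ascii_ok = False

    # Reading the same file as UTF-8 succeeds regardless of the locale.
    with open(path, encoding="utf-8") as f:
        suffixes = f.read().split()[1:]

print(ascii_ok, suffixes)
```

With an explicit encoding="utf-8" the result no longer depends on the environment the interpreter happens to run in.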
msg144455 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-09-23 16:42
+1.  I’ve finally understood that open using the locale is Evil™.  Please use the file from Fedora in a test.
msg145493 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-14 01:04
New changeset 8d8ab3e04363 by Victor Stinner in branch '3.2':
Issue #13025: mimetypes is now reading MIME types using the UTF-8 encoding,

New changeset 2c223d686feb by Victor Stinner in branch 'default':
(Merge 3.2) Issue #13025: mimetypes is now reading MIME types using the UTF-8
msg145494 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-14 01:04
> Please use the file from Fedora in a test.

msg371926 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-06-20 09:11
However read_mime_types() still uses the locale encoding. See issue41048.
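
Until that is addressed, one possible workaround is to open the file explicitly as UTF-8 and feed it to MimeTypes.readfp(). The sketch below assumes that approach; read_mime_types_utf8 is a hypothetical helper name, not part of the mimetypes module, and the temporary file stands in for a real mime.types file.

```python
import mimetypes
import os
import tempfile


def read_mime_types_utf8(path):
    """Hypothetical helper: load a mime.types-style file, always as UTF-8."""
    db = mimetypes.MimeTypes()
    with open(path, encoding="utf-8") as fp:
        db.readfp(fp, True)
    # Mapping of extension -> type for the strict (standard) types.
    return db.types_map[True]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in file containing the Fedora line from this issue.
    path = os.path.join(tmp, "mime.types")
    with open(path, "w", encoding="utf-8") as f:
        f.write("application/vnd.geocube+xml                     g3 g\u00b3\n")
    types_map = read_mime_types_utf8(path)

print(types_map[".g\u00b3"])
```

Because the encoding is passed explicitly, the helper does not depend on the locale of the calling process.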
Date                 User              Action  Args
2022-04-11 14:57:21  admin             set     github: 57234
2020-06-20 09:11:44  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg371926
2011-10-14 01:04:46  vstinner          set     status: open -> closed; resolution: fixed; messages: + msg145494
2011-10-14 01:04:25  python-dev        set     nosy: + python-dev; messages: + msg145493
2011-09-23 16:59:50  ezio.melotti      set     type: behavior; stage: test needed
2011-09-23 16:42:22  eric.araujo       set     nosy: + eric.araujo; messages: + msg144455
2011-09-20 23:07:16  vstinner          set     components: + Library (Lib), Unicode
2011-09-20 23:07:07  vstinner          create