classification
Title: mimetypes should read the rule file using UTF-8, not the locale encoding
Type: behavior
Stage: test needed
Components: Library (Lib), Unicode
Versions: Python 3.2, Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.araujo, haypo, python-dev, sandro.tosi, terry.reedy
Priority: normal
Keywords: patch

Created on 2011-09-20 23:07 by haypo, last changed 2011-10-14 01:04 by haypo. This issue is now closed.

Files
File name                 Uploaded                 Description
mimetypes_encoding.patch  haypo, 2011-09-20 23:07
Messages (4)
msg144357 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-09-20 23:07
On Debian and Ubuntu, the /etc/mime.types file is pure ASCII, but on Fedora 15 it contains a non-ASCII character, ³ (U+00B3), in the line:
"application/vnd.geocube+xml                     g3 g³"

And the file is encoded in UTF-8.

That's why Python should decode this file as UTF-8 instead of using the locale encoding, because the locale encoding can be ASCII. The attached patch implements this idea.

I think that it is a bug and so it should also be fixed in Python 3.2.

(Python 2.7 reads the file in binary mode, so it does not care about the encoding.)
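
To illustrate the failure mode, here is a minimal sketch, not the attached patch. It writes the Fedora line quoted above to a temporary file as UTF-8, then decodes it once as UTF-8 and once as ASCII (standing in for a C/POSIX locale encoding). The helper name read_rules and the temporary path are hypothetical.

```python
import os
import tempfile

# The Fedora 15 line quoted above; \u00b3 is SUPERSCRIPT THREE.
LINE = "application/vnd.geocube+xml                     g3 g\u00b3\n"


def read_rules(path, encoding):
    """Return the lines of a rule file decoded with the given encoding.

    Hypothetical helper for illustration; it is not part of mimetypes.
    """
    with open(path, encoding=encoding) as fp:
        return fp.readlines()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mime.types")
    # Store the line as UTF-8 bytes, as the Fedora file does.
    with open(path, "wb") as fp:
        fp.write(LINE.encode("utf-8"))

    # An explicit UTF-8 decode works whatever the locale is.
    utf8_lines = read_rules(path, "utf-8")

    # An ASCII decode fails on the bytes that encode U+00B3.
    try:
        read_rules(path, "ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
```

Passing encoding explicitly to open() is what removes the dependency on the locale.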
msg144455 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-09-23 16:42
+1.  I’ve finally understood that open using the locale is Evil™.  Please use the file from Fedora in a test.
msg145493 - (view) Author: Roundup Robot (python-dev) Date: 2011-10-14 01:04
New changeset 8d8ab3e04363 by Victor Stinner in branch '3.2':
Issue #13025: mimetypes is now reading MIME types using the UTF-8 encoding,
http://hg.python.org/cpython/rev/8d8ab3e04363

New changeset 2c223d686feb by Victor Stinner in branch 'default':
(Merge 3.2) Issue #13025: mimetypes is now reading MIME types using the UTF-8
http://hg.python.org/cpython/rev/2c223d686feb
msg145494 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-10-14 01:04
> Please use the file from Fedora in a test.

Done.
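
For readers who want to reproduce the check, a test along the requested lines could look roughly like the sketch below. It is not the test committed in the changesets above: the class name, the temporary file, and the single-line sample are assumptions, and only the quoted Fedora line is used rather than the whole file. It relies on mimetypes.MimeTypes.readfp() and the types_map attribute.

```python
import mimetypes
import os
import tempfile
import unittest

# The non-ASCII line from Fedora's /etc/mime.types quoted in msg144357.
FEDORA_LINE = "application/vnd.geocube+xml                     g3 g\u00b3\n"


class Utf8RuleFileTest(unittest.TestCase):
    """Hypothetical test case; the name is chosen for illustration."""

    def test_non_ascii_extension(self):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "mime.types")
            with open(path, "w", encoding="utf-8") as fp:
                fp.write(FEDORA_LINE)

            db = mimetypes.MimeTypes()
            # Hand readfp() a file object that is explicitly decoded as
            # UTF-8, so the result does not depend on the locale.
            with open(path, encoding="utf-8") as fp:
                db.readfp(fp)

        # types_map[True] holds the strict (standard) mappings.
        self.assertEqual(
            db.types_map[True][".g\u00b3"],
            "application/vnd.geocube+xml",
        )
```

Saved as a module, this can be run with python -m unittest.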
History
Date                 User          Action  Args
2011-10-14 01:04:46  haypo         set     status: open -> closed; resolution: fixed; messages: + msg145494
2011-10-14 01:04:25  python-dev    set     nosy: + python-dev; messages: + msg145493
2011-09-23 16:59:50  ezio.melotti  set     type: behavior; stage: test needed
2011-09-23 16:42:22  eric.araujo   set     nosy: + eric.araujo; messages: + msg144455
2011-09-20 23:07:16  haypo         set     components: + Library (Lib), Unicode
2011-09-20 23:07:07  haypo         create