Title: Inconsistency with uppercase file extensions in MimeTypes.guess_type
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: r.david.murray, rodrigo.parra, tim.golden
Priority: normal Keywords: patch

Created on 2014-01-25 17:17 by rodrigo.parra, last changed 2014-01-26 15:55 by r.david.murray.

File name Uploaded Description
case_guess_type.patch rodrigo.parra, 2014-01-25 17:17
Messages (2)
msg209218 - Author: Rodrigo Parra (rodrigo.parra) Date: 2014-01-25 17:17
The function looks up the file extension in three maps: types_map, suffix_map and encodings_map.

Lookup in types_map is case insensitive (by calling lower() first).
Lookup in both suffix_map and encodings_map is case sensitive.
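
To make the lookup order concrete, here is a simplified model of it. This is only a sketch: the three dictionaries are tiny illustrative subsets, and guess_type_model is a hypothetical name, not the stdlib implementation.

```python
import posixpath

# Tiny illustrative subsets of the three maps (assumption: not the full tables).
SUFFIX_MAP = {".tgz": ".tar.gz"}
ENCODINGS_MAP = {".gz": "gzip", ".Z": "compress"}
TYPES_MAP = {".tar": "application/x-tar"}


def guess_type_model(name):
    """Simplified model of the three lookups described above."""
    base, ext = posixpath.splitext(name)
    # 1. suffix_map: case-sensitive expansion of shorthand suffixes.
    while ext in SUFFIX_MAP:
        base, ext = posixpath.splitext(base + SUFFIX_MAP[ext])
    # 2. encodings_map: case-sensitive detection of the encoding suffix.
    encoding = None
    if ext in ENCODINGS_MAP:
        encoding = ENCODINGS_MAP[ext]
        base, ext = posixpath.splitext(base)
    # 3. types_map: exact match first, then a lowercased fallback.
    if ext in TYPES_MAP:
        return TYPES_MAP[ext], encoding
    if ext.lower() in TYPES_MAP:
        return TYPES_MAP[ext.lower()], encoding
    return None, encoding
```

Only step 3 lowercases the extension, so an uppercase suffix that has to pass through step 1 or step 2 never reaches it in a recognizable form.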

This can lead to some seemingly counterintuitive results, such as:

(a)
guess_type("foo.tar") == ("application/x-tar", None)
guess_type("foo.TAR") == ("application/x-tar", None)

(b)
guess_type("foo.tgz") == ("application/x-tar", "gzip")
guess_type("foo.TGZ") == (None, None)

(c)
guess_type("foo.tar.gz") == ("application/x-tar", "gzip")
guess_type("foo.TAR.GZ") == (None, None)

Lookup should be case insensitive at least for the suffix_map, in which case (b) would be solved. The submitted patch implements this change.
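
A rough sketch of what a case-insensitive suffix_map lookup could look like follows. It is not the contents of the attached patch; the small dictionaries and the name guess_type_patched are assumptions made for illustration.

```python
import posixpath

# Tiny illustrative subsets of the three maps (assumption: not the full tables).
SUFFIX_MAP = {".tgz": ".tar.gz"}
ENCODINGS_MAP = {".gz": "gzip", ".Z": "compress"}
TYPES_MAP = {".tar": "application/x-tar"}


def guess_type_patched(name):
    """Model of the lookup with a case-insensitive suffix_map step."""
    base, ext = posixpath.splitext(name)
    # Changed step: lowercase the extension before consulting suffix_map.
    while ext.lower() in SUFFIX_MAP:
        base, ext = posixpath.splitext(base + SUFFIX_MAP[ext.lower()])
    # Unchanged: encodings_map is still case sensitive.
    encoding = None
    if ext in ENCODINGS_MAP:
        encoding = ENCODINGS_MAP[ext]
        base, ext = posixpath.splitext(base)
    # Unchanged: types_map tries an exact match, then a lowercased one.
    if ext in TYPES_MAP:
        return TYPES_MAP[ext], encoding
    if ext.lower() in TYPES_MAP:
        return TYPES_MAP[ext.lower()], encoding
    return None, encoding
```

In this model "foo.TGZ" is expanded to "foo.tar.gz" and then resolved like the lowercase name, while "foo.TAR.GZ" is still not recognized because ".GZ" is absent from the case-sensitive encodings_map.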

As for the encodings_map, I am not so sure, in particular because of the tar.Z extension. I found that the compress command expects the uppercase 'Z'. If someone is relying on the results of guess_type to call compress, errors could occur.
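
To illustrate the concern, the sketch below compares the stock encodings_map with a hypothetical case-folded copy. The names folded_encodings and encoding_for are made up for this example and are not part of mimetypes.

```python
import mimetypes

# The stock map keys the compress encoding on an uppercase ".Z".
print(mimetypes.encodings_map.get(".Z"))
print(mimetypes.encodings_map.get(".z"))

# Hypothetical case-insensitive variant: fold the keys and the extension.
folded_encodings = {key.lower(): value
                    for key, value in mimetypes.encodings_map.items()}


def encoding_for(ext):
    """Return the encoding for ext, ignoring case (illustrative only)."""
    return folded_encodings.get(ext.lower())


# With folding, a lowercase ".z" is reported as compress as well.
print(encoding_for(".z"))
```

A caller that passes such a result on to compress would then hand it a file whose suffix the tool may not accept, which is the backward compatibility risk described above.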
msg209332 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-01-26 15:55
I'm tagging this for 3.5 instead, since there are backward compatibility concerns and the 3.4 RC will probably be a couple weeks from now.
Date User Action Args
2014-01-26 15:55:33 r.david.murray set nosy: + r.david.murray; messages: + msg209332; versions: + Python 3.5, - Python 3.4
2014-01-25 17:17:48 rodrigo.parra create