classification
Title: mimetypes doesn't recognize .csv
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Paul.Cauchon, Werner Van Geit, berker.peksag, eric.araujo, gmwils, iwd32900, petri.lehtinen, pitrou, python-dev, r.david.murray, sandro.tosi, terry.reedy
Priority: normal
Keywords: patch

Created on 2012-02-06 17:43 by iwd32900, last changed 2016-04-09 05:20 by berker.peksag. This issue is now closed.

Files
File name | Uploaded | Description
issue13952.patch | gmwils, 2013-02-23 17:25 |
Messages (16)
msg152751 - (view) Author: Ian Davis (iwd32900) Date: 2012-02-06 17:43
The mimetypes module does not respond with "text/csv" for files that end in ".csv", and I think it should :) For goodness' sake, "text/tab-separated-values" is in there as ".tsv", and that seems much less used (to me).
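A minimal sketch of a workaround for interpreters that lack the mapping, assuming only the standard `mimetypes.add_type` and `mimetypes.guess_type` calls; the file name `report.csv` is a made-up example. Registering the type explicitly makes the lookup independent of both the built-in table and any system mime.types file:

```python
import mimetypes

# Register the IANA-registered type for .csv explicitly, so the result
# does not depend on the built-in table or on system mime.types files.
mimetypes.add_type("text/csv", ".csv")

# "report.csv" is a hypothetical file name; the file does not need to exist,
# because guess_type() only inspects the name.
print(mimetypes.guess_type("report.csv"))
```

On versions that already include the fix from this issue, the `add_type` call is redundant but harmless.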
msg152752 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-06 18:04
Yes, but text/tab-separated-values (.tsv) is older. .tsv dates from the days of Gopher, but text/csv was formalized only in October of 2005. Presumably nobody has asked for it before, for some odd reason.

Now we get to debate again whether updating mimetypes with a registered type can be considered a bug fix.  We've gone both ways in the past, as far as I can tell.  This one has the advantage of actually having a formal IANA registration, unlike the last couple.
msg152757 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-06 19:05
I would argue the embedded mime-types dictionary should at least mirror current IANA assignments, which are already present in up-to-date Unix systems:

>>> import mimetypes
>>> mimetypes.guess_type("foo.csv")
('text/csv', None)

So not having text/csv is IMHO a bug.
Also it would be nice if there were an easy way to keep the mime-types dictionary up-to-date wrt. a system's mime-types file.
msg152759 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-06 19:11
As far as I know having it mirror the IANA registry is the intent (there's a comment in the module that can be read as implying that).  So I'd be inclined to treat this one as a bug and fix it in 2.7 and 3.2 as well as 3.3.

I'm not sure what you mean by your final comment, since by default the system mime types are read on both Unix and Windows and merged with the built in table.
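A rough illustration of the merging described here, assuming only the documented `mimetypes.knownfiles` list and the `MimeTypes.readfp` method. The type `application/x-example` and the extension `.example` are made up for the demonstration, and the in-memory text stands in for a system mime.types file:

```python
import io
import mimetypes
import os

# Candidate system files the module consults by default; which of them
# actually exist depends on the machine.
existing = [path for path in mimetypes.knownfiles if os.path.isfile(path)]
print("system mime.types files found:", existing)

# Merge one extra mime.types-style entry into a private table.
# "application/x-example" and ".example" are hypothetical names.
db = mimetypes.MimeTypes()
db.readfp(io.StringIO("application/x-example example\n"))
print(db.guess_type("report.example"))
```

Entries read this way are added on top of whatever the instance already knows, which is why a stale or missing system file leaves the built-in table as the only source for a given extension.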
msg152761 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-06 19:13
> I'm not sure what you mean by your final comment, since by default the
> system mime types are read on both Unix and Windows and merged with
> the built in table.

I mean to have our built-in table mirror a recent Unix system's
mime-types table. There could be a special switch to mimetypes.py, which
would output the Python code of a dict mirroring /etc/mime.types (or
"/etc/mime.types" + the current built-in table) when run. Then it would
be easy to integrate the changes back into the code.
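The proposed switch could look roughly like the sketch below. This is not code from mimetypes.py: the helper names `parse_mime_types` and `render_types_map` are hypothetical, and the sample text is illustrative data rather than the contents of a real /etc/mime.types. It parses mime.types-formatted text and prints a dict literal that could be pasted back into the source:

```python
import pprint


def parse_mime_types(text):
    """Return {'.ext': 'type/subtype'} from mime.types-formatted text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, comment, or a type with no extensions
        mime_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            table["." + ext] = mime_type
    return table


def render_types_map(table):
    """Render the table as Python source for an embedded dict."""
    return "types_map = " + pprint.pformat(table)


# Sample input for illustration only, not a real system file.
sample = """\
# comment line
text/csv csv
text/tab-separated-values tsv
image/jpeg jpeg jpg jpe
application/x-no-extension
"""

generated = render_types_map(parse_mime_types(sample))
print(generated)
```

To mirror a real system, the same functions could be fed the text of /etc/mime.types, optionally merged with the current built-in table before rendering.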
msg152762 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-06 19:19
Ah, analogous to the way keyword.py regenerates its embedded table based on the actual Python grammar? Yes, that would be nice.
msg153106 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-11 05:55
I’ve been one to argue that additions to the mimetypes registry are clearly new features.  Now if two senior devs like you think otherwise, I’m reconsidering.  These additions can’t possibly break code, can they?  So I can agree with a viewpoint that mimetypes should match what the IANA publishes and that adding missing types is a bugfix.  (It’s less disturbing than updating HTMLParser for example, and I agree with that.)  Georg’s inclusion of a registry addition for IIRC 3.2.2 would also indicate RM support for this viewpoint.

About Antoine’s remark: mimetypes already reads mime.types files, so even if our internal registry is not up-to-date the module should know about all types present in /etc/mime.types.
msg153129 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-11 15:06
> About Antoine’s remark: mimetypes already reads mime.types files, so
> even if our internal registry is not up-to-date the module should know
> about all types present in /etc/mime.types.

The point was about systems which don't have a /etc/mime.types
(Windows).
msg153137 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-11 21:04
On Windows we do (now) read from the registry as well. My guess is there are a lot more Windows systems out there with outdated registries than there are Unix systems with outdated /etc/mime.types files, though.
msg153163 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-02-12 04:31
One solution would be to update our mimetypes file just before a new version, and then leave it until the next, just as we update unicodedata to the current Unicode version and then leave it alone for bugfix releases. Rather than the entire IANA file, which has a lot of useless stuff, we might update from the most recent *nix files (assuming that they have less than 'everything').
msg182774 - (view) Author: Geoff Wilson (gmwils) * Date: 2013-02-23 17:25
Patch against 2.7 to add csv to the internal list. It is popular enough as a format that it should work even if the system mime files are stale.
msg262730 - (view) Author: Werner Van Geit (Werner Van Geit) Date: 2016-04-01 09:34
Will this patch ever make it into the main Python version? I just ran into exactly this issue (mimetypes returns None as the MIME type of a csv file on Windows).
msg262789 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-04-02 06:17
I will commit issue13952.patch this weekend to 2.7, 3.5 and default.
msg263063 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-04-09 04:52
New changeset 711672506b40 by Berker Peksag in branch '3.5':
Issue #13952: Add .csv to mimetypes.types_map
https://hg.python.org/cpython/rev/711672506b40

New changeset 5143f86ffe57 by Berker Peksag in branch 'default':
Issue #13952: Add .csv to mimetypes.types_map
https://hg.python.org/cpython/rev/5143f86ffe57
msg263066 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-04-09 05:06
New changeset e704e0786332 by Berker Peksag in branch '2.7':
Issue #13952: Add .csv to mimetypes.types_map
https://hg.python.org/cpython/rev/e704e0786332
msg263070 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-04-09 05:20
Thanks for the patch, Geoff.
History
Date | User | Action | Args
2016-04-09 05:20:24 | berker.peksag | set | status: open -> closed; resolution: fixed; messages: + msg263070; stage: patch review -> resolved
2016-04-09 05:06:20 | python-dev | set | messages: + msg263066
2016-04-09 04:52:28 | python-dev | set | nosy: + python-dev; messages: + msg263063
2016-04-02 06:17:34 | berker.peksag | set | versions: - Python 3.2, Python 3.3, Python 3.4; nosy: + berker.peksag; messages: + msg262789; stage: needs patch -> patch review
2016-04-01 09:34:10 | Werner Van Geit | set | nosy: + Werner Van Geit; messages: + msg262730; versions: + Python 3.4, Python 3.5, Python 3.6
2013-02-24 08:48:52 | petri.lehtinen | set | nosy: + petri.lehtinen
2013-02-23 17:25:15 | gmwils | set | files: + issue13952.patch; nosy: + gmwils; messages: + msg182774; keywords: + patch
2012-05-15 12:16:53 | Paul.Cauchon | set | nosy: + Paul.Cauchon
2012-02-12 04:31:50 | terry.reedy | set | nosy: + terry.reedy; messages: + msg153163
2012-02-11 21:04:27 | r.david.murray | set | messages: + msg153137
2012-02-11 15:06:15 | pitrou | set | messages: + msg153129
2012-02-11 05:55:13 | eric.araujo | set | nosy: + sandro.tosi, eric.araujo; messages: + msg153106
2012-02-06 19:19:23 | r.david.murray | set | messages: + msg152762
2012-02-06 19:13:56 | pitrou | set | messages: + msg152761
2012-02-06 19:11:04 | r.david.murray | set | messages: + msg152759
2012-02-06 19:05:02 | pitrou | set | nosy: + pitrou; messages: + msg152757
2012-02-06 18:04:26 | r.david.murray | set | versions: + Python 3.2, Python 3.3, - Python 2.6; nosy: + r.david.murray; messages: + msg152752; stage: needs patch
2012-02-06 17:43:55 | iwd32900 | create |