Author benhoyt
Recipients barry, benhoyt, brian.curtin, dlchambers, ishimoto, r.david.murray, sayap, terry.reedy, tim.golden
Date 2012-12-06.07:57:40
Message-id <1354780661.42.0.455104301647.issue15207@psf.upfronthosting.co.za>
In-reply-to
Content
Ah, thanks for making this an issue of its own! As I commented over at Issue10551, it's a serious problem, and makes mimetypes.guess_type() unusable out of the box on Windows.

Yes, the fix in Issue4969 uses "MIME\Database\Content Type", which is a mime type -> file extension mapping, *not the other way around*.
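
To illustrate why the direction of the mapping matters, here is a minimal sketch. The entries in MIME_TO_EXT are an assumed sample, not values read from a real registry; the point is only that several mime types can share one extension, so naively inverting a mime type -> extension table keeps whichever entry happens to come last.

```python
# Illustrative sample only: these entries are assumed, not read from a real registry.
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/pjpeg": ".jpg",
    "image/png": ".png",
    "image/x-png": ".png",
}


def invert(mime_to_ext):
    """Naively build an extension -> mime type table.

    When two mime types share an extension, the later entry silently
    overwrites the earlier one.
    """
    ext_to_mime = {}
    for mime_type, ext in mime_to_ext.items():
        ext_to_mime[ext] = mime_type
    return ext_to_mime


EXT_TO_MIME = invert(MIME_TO_EXT)
print(EXT_TO_MIME)
```

With this sample, ".jpg" ends up mapped to "image/pjpeg" and ".png" to "image/x-png", which is the kind of wrong answer an inverted table can give.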

So while this patch is definitely an improvement (for the most part it doesn't produce wrong values!), I'm not sure it's the way to go, for a few reasons:

1) Many of the most important keys aren't in the Windows registry (in HKEY_CLASSES_ROOT, where this patch looks). This includes .png, .jpg, and .gif. All of these important types fall back to the hard-coded "types_map" in mimetypes.py anyway.

2) Some that do exist are wrong in the registry (or at the least, different from the built-in "types_map"). This includes .zip, which is "application/x-zip-compressed" (at least in my registry) but should be "application/zip".

3) It's slowish, as it has to load about 6000 registry keys (and more as you install more stuff on your system), but only about 200 of those have a "Content Type" value. On my machine (Windows 7, 64-bit CPython) this adds over 100ms to the startup time even on subsequent runs when cached -- and I think 100ms is pretty significant. Issue4969's version takes about 25ms, and reverting this completely would of course take 0ms.

4) Users and other programs can (and sometimes do!) change the Content Type values in the registry -- whereas one wants mime type mappings to be consistent across systems. This last point is debatable for various reasons, and I think the above three points should carry the day, but I include it here for completeness. ;-)
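
For anyone who wants to check the cost in point 3 on their own machine, a rough sketch follows. The helper name time_mimetypes_init is hypothetical, and this only times one explicit mimetypes.init() call inside a running interpreter; it is not the exact measurement behind the numbers above, and results will vary by system.

```python
import mimetypes
import time


def time_mimetypes_init():
    """Return the wall-clock time of one mimetypes.init() call, in milliseconds."""
    start = time.perf_counter()
    # init() with no arguments reloads the type database from the system sources.
    mimetypes.init()
    return (time.perf_counter() - start) * 1000.0


elapsed_ms = time_mimetypes_init()
print(f"mimetypes.init() took {elapsed_ms:.1f} ms")
```

Running this in a fresh interpreter a few times gives a better feel for the startup cost than a single run.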

For these reasons, I think we should revert the fix for Issue4969 and leave Windows users to get the default types_map as before, which is at least consistent -- and for mimetypes.guess_type(), you want consistency.
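
A quick way to see whether a given machine is affected is to compare guess_type() against the values one would expect for the extensions mentioned above. This is only a sketch: EXPECTED and find_mismatches are hypothetical names, and the table covers just the four extensions discussed in this message.

```python
import mimetypes

# Expected values for the extensions discussed above (standard mime types).
EXPECTED = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".gif": "image/gif",
    ".zip": "application/zip",
}


def find_mismatches(expected=EXPECTED, guess=mimetypes.guess_type):
    """Return {extension: guessed type} for every extension whose guess differs."""
    mismatches = {}
    for ext, want in expected.items():
        got, _encoding = guess("example" + ext)
        if got != want:
            mismatches[ext] = got
    return mismatches


print(find_mismatches())
```

An empty dict means guess_type() agrees with the expected table on that system; any entry shows an extension where the system's data changed the result.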
History
Date User Action Args
2012-12-06 07:57:41 benhoyt set recipients: + benhoyt, barry, terry.reedy, ishimoto, tim.golden, sayap, r.david.murray, brian.curtin, dlchambers
2012-12-06 07:57:41 benhoyt set messageid: <1354780661.42.0.455104301647.issue15207@psf.upfronthosting.co.za>
2012-12-06 07:57:41 benhoyt link issue15207 messages
2012-12-06 07:57:40 benhoyt create