classification
Title: mimetypes.guess_extension() doesn’t get JPG right
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8, Python 3.7
process
Status: open Resolution: remind
Dependencies: Superseder:
Assigned To: Nosy List: _savage, corona10, fbidu, xtreak
Priority: normal Keywords:

Created on 2019-08-24 23:06 by _savage, last changed 2020-08-23 21:55 by _savage.

Messages (7)
msg350408 - (view) Author: Jens Troeger (_savage) * Date: 2019-08-24 23:06
I think this one’s quite easy to reproduce:

  Python 3.7.4 (default, Jul 11 2019, 01:08:00) 
  [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import mimetypes
  >>> mimetypes.guess_extension("image/jpg")  # Expected ".jpg"
  >>> mimetypes.guess_extension("image/jpeg")  # Expected ".jpg"
  '.jpe'

According to MDN

  https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Complete_list_of_MIME_types

only "image/jpeg" is a valid MIME type; however, I’ve seen quite a bit of "image/jpg" out in the wild and I think that ought to be accounted for too.

Before I look into submitting a PR I wanted to confirm that this is an issue that ought to be fixed. I think it is.
msg350409 - (view) Author: Jens Troeger (_savage) * Date: 2019-08-24 23:08
Oops, forgot…

  >>> mimetypes.guess_extension("image/jpeg")  # Expected ".jpg" or ".jpeg"

as per referenced MDN. I personally would go with ".jpg" because that's the more common file name extension.
msg350917 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2019-08-31 14:41
It works well on the master branch and also on the latest 3.7 branch.
I think we can close this issue as far as the `.jpe` result is concerned, if we don't have to support the "image/jpg" case.


Python 3.9.0a0 (heads/master:daa82d019c, Aug 31 2019, 23:37:00)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_extension("image/jpg")
>>> mimetypes.guess_extension("image/jpeg")
'.jpg'

Python 3.7.4+ (heads/3.7:9a28400aac, Aug 31 2019, 23:34:02)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_extension("image/jpg")
>>> mimetypes.guess_extension("image/jpeg")
'.jpg'
msg350918 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-08-31 14:50
I think this is fixed by 2a99fd911ebeecedbb250a05667cd46eca4735b9, which will be included in 3.7.5 since it missed 3.7.4RC1. There is also a test for this at https://github.com/python/cpython/blob/daa82d019c52e95c3c57275307918078c1c0ac81/Lib/test/test_mimetypes.py#L103
msg375740 - (view) Author: Jens Troeger (_savage) * Date: 2020-08-21 00:57
This is still not working: tried it on Python 3.8.5 and Python 3.7.8.

>>> import mimetypes
>>> mimetypes.guess_extension('image/jpg')
>>> mimetypes.guess_extension('image/jpeg')
'.jpg'

Both should return the same value; I expected the mimetype 'image/jpg' to return extension '.jpg' because that mimetype is used a lot.
msg375794 - (view) Author: Felipe Rodrigues (fbidu) * Date: 2020-08-22 11:22
@_savage, in the commit @xtreak referred to, there's a note that "image/jpg" and some other non-standard MIME types are only supported if `strict=False` [1].

So, this:

>>> mimetypes.guess_extension("image/jpg")

Returns None (so the REPL prints nothing). But this works:

>>> mimetypes.guess_extension("image/jpg", strict=False)
'.jpg'
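
For code that has to cope with both spellings, a minimal sketch of a wrapper could look like the following. The helper name `guess_extension_lenient` is hypothetical (it is not part of `mimetypes`); it only combines the two calls shown above, trying the strict table first and the non-standard table second.

```python
import mimetypes


def guess_extension_lenient(mime_type):
    """Hypothetical helper: prefer the strict (IANA) table, then fall back.

    Returns None when neither table knows the type.
    """
    ext = mimetypes.guess_extension(mime_type, strict=True)
    if ext is None:
        # Not found among the IANA-registered types; also try the
        # non-standard but commonly used types.
        ext = mimetypes.guess_extension(mime_type, strict=False)
    return ext


for mime_type in ("image/jpeg", "image/jpg"):
    print(mime_type, "->", guess_extension_lenient(mime_type))
```

In current CPython a single `strict=False` call appears to consult the strict table first anyway, so the two-step form mostly serves to make the preference explicit.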


---------

I guess we could improve the current documentation [2]. It already describes the `strict` behavior correctly:


> The optional strict argument is a flag specifying whether the list of known MIME types is limited to
> only the official types registered with IANA. When strict is True (the default), only the IANA types
> are supported; when strict is False, some additional non-standard but commonly used MIME types are 
> also recognized.

But I think it would be nice to have a table listing those "non-standard but commonly used MIME types". Personally, I'd have a hard time guessing on a regular day which of 'image/jpeg' and 'image/jpg' is the standard one. We could even add a note pointing out that the `common_types` attribute [3] is a dictionary mapping filename extensions to those supported non-standard types.
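
Until such a table exists in the docs, the non-standard types can be listed at runtime. This is a rough sketch; the function name `non_standard_types` is made up for illustration, and the exact entries depend on the Python version in use.

```python
import mimetypes


def non_standard_types():
    """Return sorted (mime_type, extension) pairs from mimetypes.common_types."""
    # common_types maps a filename extension to a non-standard MIME type,
    # so the pairs are flipped here to sort by MIME type.
    return sorted((mime, ext) for ext, mime in mimetypes.common_types.items())


for mime, ext in non_standard_types():
    print(f"{mime:<20} {ext}")
```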

Given that the `strict` flag is used by several functions with the same behavior, maybe we could add a note at the top of the doc explaining the general meaning of that flag.
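
To illustrate how the flag is shared, here is a small sketch that calls `guess_extension`, `guess_all_extensions` and `guess_type` with both settings. The helper `strict_vs_lenient` and the file name `photo.jpg` are assumptions made for the example only.

```python
import mimetypes


def strict_vs_lenient(mime_type, filename):
    """Collect results of the three lookup functions for both strict settings."""
    results = {}
    for strict in (True, False):
        results[strict] = {
            "guess_extension": mimetypes.guess_extension(mime_type, strict=strict),
            "guess_all_extensions": mimetypes.guess_all_extensions(mime_type, strict=strict),
            "guess_type": mimetypes.guess_type(filename, strict=strict),
        }
    return results


report = strict_vs_lenient("image/jpg", "photo.jpg")
for strict, row in report.items():
    print(f"strict={strict}: {row}")
```

Note that the reverse lookup for `photo.jpg` should report "image/jpeg" with either setting, because the extension is already present in the strict table.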



[1]: https://github.com/python/cpython/commit/2a99fd911ebeecedbb250a05667cd46eca4735b9#diff-fc65388a9cdf41980b2c31de5de67758R547

[2]: https://docs.python.org/3.10/library/mimetypes.html#mimetypes.guess_type

[3]: https://docs.python.org/3.10/library/mimetypes.html#mimetypes.common_types
msg375828 - (view) Author: Jens Troeger (_savage) * Date: 2020-08-23 21:55
@fbidu, oh I missed that, thank you! Shall I close the issue again, or what’s the common procedure in this case?
History
Date                 User       Action  Args
2020-08-23 21:55:25  _savage    set     messages: + msg375828
2020-08-22 11:22:21  fbidu      set     nosy: + fbidu
                                        messages: + msg375794
2020-08-21 00:57:25  _savage    set     status: closed -> open
                                        resolution: out of date -> remind
                                        messages: + msg375740
2019-08-31 18:12:30  ned.deily  set     status: open -> closed
                                        stage: resolved
                                        resolution: out of date
                                        versions: + Python 3.8
2019-08-31 14:50:50  xtreak     set     nosy: + xtreak
                                        messages: + msg350918
2019-08-31 14:41:34  corona10   set     nosy: - vstinner
2019-08-31 14:41:02  corona10   set     nosy: + vstinner, corona10
                                        messages: + msg350917
2019-08-24 23:08:14  _savage    set     messages: + msg350409
2019-08-24 23:06:17  _savage    create