Author Dominik Czarnota
Recipients Dominik Czarnota
Date 2019-07-09.13:59:20
Message-id <1562680761.15.0.288138014947.issue37529@roundup.psfhosted.org>
In-reply-to
Content
The `mimetypes` standard library module lets users guess the extension for a given mimetype through the `mimetypes.guess_extension` function.

Default mimetypes are stored in the `types_map` and `_types_map_default` dictionaries, which map extensions to mimetypes. Those dictionaries are created by the `_default_mime_types` function in `cpython/Lib/mimetypes.py`.

If a given extension has more than one mimetype, this information is lost: a dictionary keeps only one value per key, so only one of the mimetypes survives.
This currently happens for the ".bmp" extension in CPython's codebase.

This can be seen in the linked code below:
https://github.com/python/cpython/blob/110a47c4f42cf4db88edc1876899fff8f05190fb/Lib/mimetypes.py#L490-L502
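For illustration, the overwrite can be reproduced with a small dictionary literal. This is only a sketch: the three-entry table and the `guess_extension` helper below are hypothetical stand-ins, not the real contents or implementation of `mimetypes.py`, and the order of the two ".bmp" entries is assumed.

```python
# Hypothetical miniature of an extension-to-mimetype table (not the real one).
types_map = {
    '.bmp': 'image/x-ms-bmp',
    '.bmp': 'image/bmp',  # same key again: silently replaces the entry above
    '.png': 'image/png',
}


def guess_extension(mime_type):
    """Reverse lookup: return an extension mapped to mime_type, else None."""
    for ext, mapped in types_map.items():
        if mapped == mime_type:
            return ext
    return None


print(len(types_map))                     # only two entries survive
print(guess_extension('image/bmp'))
print(guess_extension('image/x-ms-bmp'))  # the overwritten value is not found
```

Python raises no error or warning for the repeated key, so the lost entry only shows up when the reverse lookup fails.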

Here is an example in an interactive IPython session:
```
In [1]: import mimetypes

In [2]: mimetypes.guess_extension('image/bmp')
Out[2]: '.bmp'

In [3]: mimetypes.guess_extension('image/x-ms-bmp')

In [4]:
```

The issue was found using Semmle's LGTM: https://lgtm.com/projects/g/python/cpython/snapshot/d099f261c762ac81042e47b530d279f932d89e09/files/Lib/mimetypes.py?sort=name&dir=ASC&mode=heatmap


PS / off-topic / thinking out loud: maybe there should be a debug build of CPython that detects such key overwrites during dict initialisation and warns about them?
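Short of a debug build, a similar check could be done statically. The following is a rough sketch, not an existing CPython feature: `find_duplicate_dict_keys` is a hypothetical helper that walks the AST of a source string and reports constant keys that appear more than once in the same dict literal. It assumes Python 3.8+, where literal keys are parsed as `ast.Constant` nodes.

```python
import ast


def find_duplicate_dict_keys(source):
    """Return (line, key) pairs for constant keys repeated in a dict literal."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # Skip ** unpacking (key is None) and non-constant keys.
            if not isinstance(key, ast.Constant):
                continue
            if key.value in seen:
                duplicates.append((key.lineno, key.value))
            else:
                seen.add(key.value)
    return duplicates


sample = (
    "d = {\n"
    "    '.bmp': 'image/bmp',\n"
    "    '.png': 'image/png',\n"
    "    '.bmp': 'image/x-ms-bmp',\n"
    "}\n"
)
print(find_duplicate_dict_keys(sample))
```

The sketch only catches keys written as literals; keys computed at runtime would still need a runtime check like the one suggested above.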