This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Fix mimetype.init() to account for from import
Type: behavior Stage: patch review
Components: Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: AWhetter, YoSTEALTH, andrei.avk, barry, brandtbucher, cheryl.sabella, maxking, r.david.murray, steve.dower, xtreak
Priority: normal Keywords: patch

Created on 2018-10-08 22:13 by YoSTEALTH, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 16567 open AWhetter, 2019-10-03 19:37
Messages (7)
msg327375 - (view) Author: (YoSTEALTH) * Date: 2018-10-08 22:13
When a user uses from import, there is a flaw in how mimetype.init() updates its global references.

# Option-1 (flawed)
# -----------------
from mimetypes import init, types_map

print(types_map.get('.gz'))  # None
init()  # <- initialize
print(types_map.get('.gz'))  # None

# Option-2
# --------
import mimetypes

print(mimetypes.types_map.get('.gz'))  # None
mimetypes.init()  # <- initialize
print(mimetypes.types_map.get('.gz'))  # application/gzip

As you can see in https://github.com/python/cpython/blob/master/Lib/mimetypes.py#L344 line:358 global reference is reassigned and thus it prevents `from mimetype import types_map` from being updated and using old `types_map` reference.

Potential solution would be to `types_map.update(new dict content)` vs reassigning the variable.
msg353966 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-10-04 18:18
Adding email team here, since you were pinged on the PR.

PR 16567 doesn't fix the `inited` member, which is considerably more difficult to do, unfortunately, but it does mean that it's out of sync with the values you hold. Arguably this is a better state to be in though.

Unfortunately, the "internal" data structures referred to in the docs[1] are all documented public API, which means we can't just ignore this issue :( We ought to at least update the docs to mention that `inited` won't be updated if you've `from ... import`ed it (which would normally be obvious, except that now some other values _will_ be updated).

Also, the `init` docs say that it will only be completely rebuilt when `files` is `None` (and specifically calls out passing an empty list as a way to avoid this happening). I'm pretty sure we don't respect this now.

All up, I think we have more work to do here to get the behaviour right. Alternatively, we can make changes in 3.9 to specify better usage (such as explicitly discouraging direct use of the dictionaries and adding more helper functions if necessary).

[1]: https://docs.python.org/3/library/mimetypes.html#mimetypes.init
msg355581 - (view) Author: Ashley Whetter (AWhetter) * Date: 2019-10-28 18:00
Maybe updating the dictionaries isn't right at all.
I think if you were experienced enough you would have the intuition that mutable attributes like this are not going to change in your local scope if you import them into it. But relying on this knowledge makes it difficult for newer users. Plus we all make mistakes ;) So a note in the docs sounds like a good idea to me.

I think `init` does rebuild when `files` is `None` (https://github.com/python/cpython/blob/edb172a87296d9359593a23cd9a09f5867ea1f0e/Lib/mimetypes.py#L350) but it's not clear to me whether rebuilding should include any additional types registered with `add_type`. At the moment it says `the mapping will be added to the official MIME types`, but I think an additional note that this means that it will persist after subsequent calls to `init` would make that clear.

As for importing `init` into the local scope, I think ideally we wouldn't use `global` at all. We should be able to make `init` import `mimetypes` locally and explicitly access attributes on the module?
Another option is we make the module attributes accessible through a module level `__getattr__` which gets the attributes off of the global `Mimetypes` object stored in `mimetypes._db`. But that would still require usage of a `global _db`.
msg358718 - (view) Author: (YoSTEALTH) * Date: 2019-12-20 16:29
I didn't receive any notification of replay on this topic.

`_default_mime_types()` should never have been a function. This function should be safe to remove as its an internal function. This would avoid unneeded function call and globals used.

Stuff like `_suffix_map_default` could be constants like `SUFFIX_MAP_DEFAULT` with `suffix_map = SUFFIX_MAP_DEFAULT.copy()`

Ashley if you feel like making these changes (optional)
msg359070 - (view) Author: Ashley Whetter (AWhetter) * Date: 2019-12-30 23:37
Yes I'm happy to make those changes as part of this.

So clarify what needs to change in PR 16567:

1) Include a note in the docs for `inited` that outlines that if it is imported into the local scope, it will not be updated by calls to `init()`. Only `mimetypes.inited` will.

2) Add a clarification to the `add_type()` docs that changes made with `strict=True` means that they will persist after subsequent calls to `init()`.

3) Remove `_default_mime_types()` in favour of public global constants. The documentation currently refers to these globals as the "official MIME types". Should that wording change to use "default" instead of "official", or maybe should the names of the constants change to `OFFICIAL_SUFFIX_MAP` for example?
msg369746 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2020-05-23 20:03
@steve.dower, Mariatta had a question for you on the PR.
msg399490 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-08-13 04:01
How about returning the updated dicts from init() and showing an example in the docs of such usage?
History
Date User Action Args
2022-04-11 14:59:06adminsetgithub: 79119
2021-08-13 04:01:58andrei.avksetnosy: + andrei.avk
messages: + msg399490
2020-07-06 08:47:40terry.reedysetversions: + Python 3.10, - Python 2.7, Python 3.7, Python 3.8, Python 3.9
2020-05-23 20:03:39cheryl.sabellasetnosy: + cheryl.sabella
messages: + msg369746
2019-12-30 23:37:40AWhettersetmessages: + msg359070
2019-12-20 16:29:05YoSTEALTHsetmessages: + msg358718
2019-10-28 18:00:37AWhettersetnosy: + AWhetter
messages: + msg355581
2019-10-04 18:25:27brandtbuchersetnosy: + brandtbucher
2019-10-04 18:18:00steve.dowersetnosy: + maxking, barry, r.david.murray, steve.dower

messages: + msg353966
versions: + Python 3.9, - Python 3.4, Python 3.5, Python 3.6
2019-10-03 19:37:10AWhettersetkeywords: + patch
stage: patch review
pull_requests: + pull_request16158
2018-10-09 06:13:01xtreaksetnosy: + xtreak
2018-10-08 22:13:09YoSTEALTHcreate