classification
Title: list(sys.modules.items()) can throw RuntimeError: dictionary changed size during iteration
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Markus Mohrhard, miss-islington, nirs, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-04-19 08:24 by Markus Mohrhard, last changed 2020-07-09 12:40 by nirs. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 19628 merged rhettinger, 2020-04-20 20:28
PR 19640 merged miss-islington, 2020-04-21 23:21
Messages (7)
msg366762 - (view) Author: Markus Mohrhard (Markus Mohrhard) * Date: 2020-04-19 08:24
We have hit an issue in the pickle module where the code throws an exception in a threaded environment:

The interesting piece of the backtrace is:

  File "/xxx/1004060/lib/python3.7/site-packages/numpy/core/__init__.py", line 130, in _ufunc_reduce
    return _ufunc_reconstruct, (whichmodule(func, name), name)
  File "/xxx/lib/python3.7/pickle.py", line 309, in whichmodule
    for module_name, module in list(sys.modules.items()):
RuntimeError: dictionary changed size during iteration

I tried to find a code path that would explain how the dict could be changed while the list is created but have not been able to find a code path that releases the GIL.

The executable is using many threads with imports happening in random threads and a custom class loader but we already make sure that the class loader is always holding the GIL.

The issue happens quite rarely (maybe once every one thousand's execution) so I don't have a reproducer right now.
msg366798 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-19 18:56
Suggest making this change:

-    for module_name, module in list(sys.modules.items()):
+    for module_name, module in sys.modules.copy().items():
msg366804 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-19 21:22
We should consider making this change just about everywhere in the standard library.  Besides avoiding a race condition, running a for loop over `d.copy().items()` is more memory efficient and possibly faster than a for loop over `list(d.items())`.
msg366854 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-20 16:33
I afraid that this is a part of the larger issue (see also issue31165). Seems the cause is that the GIL can be released when allocate or reallocate the memory. Does your project use some memory managing hooks?

Using sys.modules.copy().items() can help a lot, because you allocate memory for new dict only once. In case of list(sys.modules.items()) you allocate memory for every key-value pair, and may allocate memory several times to resize a list.

This reduces the probability of occurring the problem in many times, but not to zero. There is still a chance that the size of the original dict increase when you allocate a memory for a new dict, so copying the data to a new dict can fail or even crash or spoil the memory.
msg366952 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-21 23:20
New changeset 75bedbe2ed4119ff18a2ea86c544b3cf08a92e75 by Raymond Hettinger in branch 'master':
bpo-40327: Improve atomicity, speed, and memory efficiency of the items() loop (GH-19628)
https://github.com/python/cpython/commit/75bedbe2ed4119ff18a2ea86c544b3cf08a92e75
msg369660 - (view) Author: miss-islington (miss-islington) Date: 2020-05-22 22:22
New changeset 16d07812dd3833295cc001d19eea42eecbdb6ea5 by Miss Islington (bot) in branch '3.8':
bpo-40327: Improve atomicity, speed, and memory efficiency of the items() loop (GH-19628)
https://github.com/python/cpython/commit/16d07812dd3833295cc001d19eea42eecbdb6ea5
msg373399 - (view) Author: Nir Soffer (nirs) * Date: 2020-07-09 12:40
Does this really affect only python 3.7?

We see this in RHEL 8.2, using python 3.6.8:
https://bugzilla.redhat.com/show_bug.cgi?id=1837199#c69

Likely caused by:

    lvs = dict(self._lvs)

Without locking. self._lvs is a dict that may contain 1000's of items.

I'm not sure if this is relvant now for upstream, but backport to 3.6 would be useful.
History
Date User Action Args
2020-07-09 12:40:24nirssetnosy: + nirs
messages: + msg373399
2020-05-22 23:15:10rhettingersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-05-22 22:22:54miss-islingtonsetmessages: + msg369660
2020-04-21 23:21:04miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request18964
2020-04-21 23:20:55rhettingersetmessages: + msg366952
2020-04-20 20:28:28rhettingersetkeywords: + patch
stage: patch review
pull_requests: + pull_request18956
2020-04-20 16:33:11serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg366854
2020-04-19 21:22:26rhettingersetmessages: + msg366804
2020-04-19 18:56:46rhettingersetnosy: + rhettinger
messages: + msg366798
2020-04-19 08:24:06Markus Mohrhardcreate