This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: importlib lock race issue in deadlock handling code
Type: Stage:
Components: Interpreter Core Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: rpurdie
Priority: normal Keywords:

Created on 2022-04-01 20:23 by rpurdie, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
testit2.py rpurdie, 2022-04-01 20:23
Messages (2)
msg416517 - (view) Author: Richard Purdie (rpurdie) Date: 2022-04-01 20:23
We've seen tracebacks in production like:

  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load(name='oe.gpg_sign', import_=<built-in function __import__>)
  File "<frozen importlib._bootstrap>", line 158, in _ModuleLockManager.__enter__()
  File "<frozen importlib._bootstrap>", line 110, in _ModuleLock.acquire()
 KeyError: 139622474778432

and

  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load(name='oe.path', import_=<built-in function __import__>)
  File "<frozen importlib._bootstrap>", line 158, in _ModuleLockManager.__enter__()
  File "<frozen importlib._bootstrap>", line 110, in _ModuleLock.acquire()
 KeyError: 140438942700992

I've attached a reproduction script which shows that if an import XXX is in progress and waiting at the wrong point when an interrupt arrives (in this case a signal) and triggers it's own import YYY, _blocking_on[tid] in importlib/_bootstrap.py gets overwritten and lost, triggering the traceback we see above upon exit from the second import.

I'm using a signal handler here as the interrupt, I don't know what our production source is as yet but this reproducer proves it is possible.
msg416539 - (view) Author: Richard Purdie (rpurdie) Date: 2022-04-02 06:56
This is a production backtrace after I inserted code to traceback if tid was already in _blocking_on. It is being triggered by a warning about an unclosed asyncio event loop and confirms my theory about nested imports, in the production case I'd guess being triggered by gc given the __del__.

  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/meta/classes/base.bbclass", line 26, in oe_import
    import oe.data
  File "<frozen importlib._bootstrap>", line 1024, in _find_and_load
  File "<frozen importlib._bootstrap>", line 171, in __enter__
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/cooker.py", line 168, in acquire
    return orig_acquire(self)
  File "<frozen importlib._bootstrap>", line 110, in acquire
  File "/usr/lib64/python3.10/asyncio/base_events.py", line 685, in __del__
    _warn(f"unclosed event loop {self!r}", ResourceWarning, source=self)
  File "/usr/lib64/python3.10/warnings.py", line 112, in _showwarnmsg
    _showwarnmsg_impl(msg)
  File "/usr/lib64/python3.10/warnings.py", line 28, in _showwarnmsg_impl
    text = _formatwarnmsg(msg)
  File "/usr/lib64/python3.10/warnings.py", line 128, in _formatwarnmsg
    return _formatwarnmsg_impl(msg)
  File "/usr/lib64/python3.10/warnings.py", line 56, in _formatwarnmsg_impl
    import tracemalloc
  File "<frozen importlib._bootstrap>", line 1024, in _find_and_load
  File "<frozen importlib._bootstrap>", line 171, in __enter__
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/cooker.py", line 167, in acquire
    bb.warn("\n".join(traceback.format_stack()))
History
Date User Action Args
2022-04-11 14:59:58adminsetgithub: 91351
2022-04-02 06:56:46rpurdiesetmessages: + msg416539
2022-04-01 20:23:24rpurdiecreate