Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import deadlock detection causes deadlock #82272

Open
rlamy mannequin opened this issue Sep 10, 2019 · 4 comments
Open

Import deadlock detection causes deadlock #82272

rlamy mannequin opened this issue Sep 10, 2019 · 4 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@rlamy
Copy link
Mannequin

rlamy mannequin commented Sep 10, 2019

BPO 38091
Nosy @brettcannon, @pitrou, @ericsnowcurrently, @rlamy, @phmc, @miss-islington, @vzhestkov
PRs
  • bpo-38091: Import deadlock detection causes deadlock #17518
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-09-10.13:26:40.706>
    labels = ['3.7', '3.8', 'type-bug', 'library', '3.9']
    title = 'Import deadlock detection causes deadlock'
    updated_at = <Date 2021-07-20.16:00:14.797>
    user = 'https://github.com/rlamy'

    bugs.python.org fields:

    activity = <Date 2021-07-20.16:00:14.797>
    actor = 'vzhestkov'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-09-10.13:26:40.706>
    creator = 'Ronan.Lamy'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38091
    keywords = ['patch']
    message_count = 3.0
    messages = ['351647', '363232', '397884']
    nosy_count = 7.0
    nosy_names = ['brett.cannon', 'pitrou', 'eric.snow', 'Ronan.Lamy', 'pconnell', 'miss-islington', 'vzhestkov']
    pr_nums = ['17518']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue38091'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented Sep 10, 2019

    There seems to be a race condition in importlib._bootstrap._ModuleLock that can cause a deadlock. The sequence of operations is as follows:

    • Thread 1 calls lock.acquire()
    • Thread 1 sets itself as lock.owner and begins importing the module
    • Thread 2 calls lock.acquire() and waits for lock.lock
    • Thread 2 gets lock.lock
    • Thread 1 calls lock.acquire() again, due to a nested import
    • Thread 1 sets itself as blocking on lock: _blocking_on[tid1] = lock
    • Thread 2 enters lock.has_deadlock()
    • Thread 2 busy-waits forever in has_deadlock() because lock.owner == tid1 and _blocking_on[tid1] == lock
    • Thread 1 waits forever for lock.lock since thread 2 owns it

    The issue was found in pypy3 but it also affects all the recent CPython versions I tried.
    I can reliably reproduce the issue by adding an artificial delay to _ModuleLock.has_deadlock(), e.g. with this patch:

    diff --git a/Lib/test/test_import/__init__.py b/Lib/test/test_import/__init__.py
    index f167c84..7f7188e 100644
    --- a/Lib/test/test_import/__init__.py
    +++ b/Lib/test/test_import/__init__.py
    @@ -435,10 +435,15 @@ class ImportTests(unittest.TestCase):
                     os.does_not_exist
     
         def test_concurrency(self):
    +        def delay_has_deadlock(frame, event, arg):
    +            if event == 'call' and frame.f_code.co_name == 'has_deadlock':
    +                time.sleep(0.2)
    +
             sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'data'))
             try:
                 exc = None
                 def run():
    +                sys.settrace(delay_has_deadlock)
                     event.wait()
                     try:
                         import package

    @rlamy rlamy mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Sep 10, 2019
    @miss-islington
    Copy link
    Contributor

    New changeset 6daa37f by Armin Rigo in branch 'master':
    bpo-38091: Import deadlock detection causes deadlock (GH-17518)
    6daa37f

    @vzhestkov
    Copy link
    Mannequin

    vzhestkov mannequin commented Jul 20, 2021

    I ported the fix from 6daa37f for 3.6 and 3.8 shipped with SLE 15SP2 and openSUSE Tumbleweed, but it seems that this fix doesn't help.
    I have a deadlocks on running salt-api process managing salt-ssh systems with high workload. The service can get the deadlock in first 5 minutes or after 3-60 minutes of running the service with the same workload with almost equal chances.

    Here is the part of py-bt I see each time:

    (gdb) py-bt
    Traceback (most recent call first):
      File "<frozen importlib._bootstrap>", line 107, in acquire
      File "<frozen importlib._bootstrap>", line 158, in __enter__
      File "<frozen importlib._bootstrap>", line 595, in _exec
      File "<frozen importlib._bootstrap>", line 271, in _load_module_shim
      File "<frozen importlib._bootstrap_external>", line 852, in load_module
      File "<frozen importlib._bootstrap_external>", line 1027, in load_module
      File "<frozen importlib._bootstrap_external>", line 1034, in _check_name_wrapper
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 4779, in _load_module
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1926, in _inner_load
        if self._load_module(name) and key in self._dict:
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 2193, in _load
      File "/usr/lib/python3.8/site-packages/salt/utils/lazy.py", line 99, in __getitem__
        if self._load(key):
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1283, in __getitem__
        func = super().__getitem__(item)
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1139, in __getitem__
        return self._dict[key + self.suffix]
      File "/usr/lib/python3.8/site-packages/salt/template.py", line 495, in check_render_pipe_str
      File "/usr/lib/python3.8/site-packages/salt/loader.py", line 1428, in render
        f_noext,
      File "/usr/lib/python3.8/site-packages/salt/pillar/__init__.py", line 781, in __init__
    ...

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @encukou
    Copy link
    Member

    encukou commented Aug 1, 2022

    I just saw a buildbot failure that might be related to the test added in #17518:

    test_concurrency (test.test_import.ImportTests.test_concurrency) ... Warning -- Uncaught thread exception: RuntimeError
    Exception in thread Thread-3 (run):
    Traceback (most recent call last):
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 1036, in _bootstrap_inner
        self.run()
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/threading.py", line 973, in run
        self._target(*self._args, **self._kwargs)
      File "/home/dje/cpython-buildarea/3.x.edelsohn-sles-z/build/Lib/test/test_import/__init__.py", line 471, in run
        sys.settrace(None)
    RuntimeError: Cannot install a trace function while another trace function is being installed
    ok
    

    encukou referenced this issue Aug 1, 2022
    This removes the unused negative_dictoffset function:
    the type this function would create is available as
        _testcapi.HeapCTypeWithNegativeDict
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants