Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_threaded_import: KeyError ignored in _get_module_lock.<locals>.cb #75253

Closed
vstinner opened this issue Jul 28, 2017 · 13 comments
Closed

test_threaded_import: KeyError ignored in _get_module_lock.<locals>.cb #75253

vstinner opened this issue Jul 28, 2017 · 13 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

BPO 31070
Nosy @brettcannon, @ncoghlan, @vstinner, @serhiy-storchaka
PRs
  • [WIP] bpo-31070: Fix importlib _get_module_lock() race #3029
  • bpo-31070: Fix a race condition in importlib _get_module_lock(). #3033
  • [3.6] bpo-31070: Fix a race condition in importlib _get_module_lock(). (GH-3033). #3038
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-08-09.11:57:47.504>
    created_at = <Date 2017-07-28.15:03:41.001>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = 'test_threaded_import: KeyError ignored in _get_module_lock.<locals>.cb'
    updated_at = <Date 2017-10-28.04:46:37.586>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-10-28.04:46:37.586>
    actor = 'ncoghlan'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-08-09.11:57:47.504>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2017-07-28.15:03:41.001>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31070
    keywords = []
    message_count = 13.0
    messages = ['299393', '299951', '299952', '299953', '299971', '299980', '299989', '299992', '299993', '299999', '300007', '300009', '305147']
    nosy_count = 4.0
    nosy_names = ['brett.cannon', 'ncoghlan', 'vstinner', 'serhiy.storchaka']
    pr_nums = ['3029', '3033', '3038']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue31070'
    versions = ['Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    See bpo-bpo-30891 and bpo-30876 for recent changes in importlib.

    http://buildbot.python.org/all/builders/AMD64%20Windows8.1%20Refleaks%203.x/builds/57/steps/test/logs/stdio

    1:33:30 [312/406/2] test_threaded_import passed (30 sec) -- running: test_decimal (826 sec), test_set (84 sec), test_mmap (1151 sec)
    beginning 6 repetitions
    123456
    ...Exception ignored in: <function _get_module_lock.<locals>.cb at 0x000000813CA18EB8>
    Traceback (most recent call last):
      File "<frozen importlib._bootstrap>", line 176, in cb
    KeyError: ('random',)
    ...

    @vstinner vstinner added tests Tests in the Lib/test dir 3.7 (EOL) end of life labels Jul 28, 2017
    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 8, 2017

    Oh, it's easy to reproduce the bug on the master branch. Example on Linux:

    haypo@selma$ ./python -m test test_threaded_import
    Run tests sequentially
    0:00:00 load avg: 0.16 [1/1] test_threaded_import
    Exception ignored in: <function _get_module_lock.<locals>.cb at 0x7f53201e2f70>
    Traceback (most recent call last):
      File "<frozen importlib._bootstrap>", line 176, in cb
    KeyError: ('random',)
    1 test OK.

    Total duration: 2 sec
    Tests result: SUCCESS

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 8, 2017

    The problem is the weakref callback of _module_locks. It seems like the callback is called after a second thread replaced _module_locks[name]

    • Thread 1: _get_module_lock('random') sets _module_locks['random'] and registers a weakref callback on the newly created lock

    • Thread 1: the lock is destroyed, the weakref callback will be called as soon as possible

    • Thread 2: _get_module_lock('random') sees that the weakref is dead and so replaces _module_locks['random'] with a new lock, and registers a new weakref callback on the second lock

    • Thread 1: Call the weakref callback removing _module_locks['random'] -- BUG: the callback is called too late, after Thread 2 already sets a new lock

    • Thread 2: The lock is destroyed, the second weakref callback will be called as soon as possible

    • Thread 2: The second callback is called and fails with KeyError since _module_locks['random'] was already called in the Thread 1 in the meanwhile

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 8, 2017

    I wrote #3029 which seems to fix the issue.

    I don't know importlib well enough to understand why we need a weak reference to a lock.

    My PR adds a second lock per module lock (!) to be able to wait until _module_locks[name] is deleted when a thread detects that _module_locks[name] exists and the associated lock was destroyed.

    @serhiy-storchaka
    Copy link
    Member

    I think the solution can be simpler. PR 3033 uses the global import lock for guarding modification of the _module_locks dictionary.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error and removed tests Tests in the Lib/test dir labels Aug 9, 2017
    @vstinner vstinner changed the title test_threaded_import: KeyError ignored in _get_module_lock.<locals>.cb on AMD64 Windows8.1 Refleaks 3.x test_threaded_import: KeyError ignored in _get_module_lock.<locals>.cb Aug 9, 2017
    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 9, 2017

    Serhiy: "I think the solution can be simpler. PR 3033 uses the global import lock for guarding modification of the _module_locks dictionary."

    I tried exactly that, but it wasn't enough. But your PR is correct because it also checks the current value of _module_locks[name] using the callback parameter.

    I disliked my own PR, so I'm happy that someone else wrote a simpler fix! (I already abandonned my PR.)

    @serhiy-storchaka
    Copy link
    Member

    New changeset 9b0d1d6 by Serhiy Storchaka in branch 'master':
    bpo-31070: Fix a race condition in importlib _get_module_lock(). (bpo-3033)
    9b0d1d6

    @serhiy-storchaka
    Copy link
    Member

    New changeset f3b8917 by Serhiy Storchaka in branch '3.6':
    [3.6] bpo-31070: Fix a race condition in importlib _get_module_lock(). (GH-3033). (bpo-3038)
    f3b8917

    @serhiy-storchaka
    Copy link
    Member

    Thank you for review Victor and Nick.

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 9, 2017

    I'm curious to know if this issue is a regression caused by recent changes: bpo-30891 and bpo-30876. But well, it doesn't matter as soon as it's fixed :-)

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Aug 9, 2017

    I don't think the changes created the bug, but I do think the new tests to provoke the other race conditions made it easier for the test suite to hit this race as well.

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 9, 2017

    I don't think the changes created the bug, but I do think the new tests to provoke the other race conditions made it easier for the test suite to hit this race as well.

    Oh ok, that's a good news!

    @ncoghlan
    Copy link
    Contributor

    A recent numpy threaded import bug report that appears in 3.6.1 but is absent in 3.6.3: numpy/numpy#9931

    So that's additional evidence that this really was a pre-existing race condition that the new tests for the SystemError cases revealed.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants