Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_threaded_import fails with -j4 #59985

Closed
skrah mannequin opened this issue Aug 25, 2012 · 22 comments
Closed

test_threaded_import fails with -j4 #59985

skrah mannequin opened this issue Aug 25, 2012 · 22 comments
Labels
release-blocker tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@skrah
Copy link
Mannequin

skrah mannequin commented Aug 25, 2012

BPO 15781
Nosy @brettcannon, @birkenfeld, @ncoghlan, @pitrou, @vstinner, @tiran, @jkloth, @skrah, @ericsnowcurrently
Files
  • importrace.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2012-08-29.22:54:16.003>
    created_at = <Date 2012-08-25.08:33:39.779>
    labels = ['type-bug', 'tests', 'release-blocker']
    title = 'test_threaded_import fails with -j4'
    updated_at = <Date 2012-09-09.09:18:54.614>
    user = 'https://github.com/skrah'

    bugs.python.org fields:

    activity = <Date 2012-09-09.09:18:54.614>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-08-29.22:54:16.003>
    closer = 'pitrou'
    components = ['Tests']
    creation = <Date 2012-08-25.08:33:39.779>
    creator = 'skrah'
    dependencies = []
    files = ['27019']
    hgrepos = []
    issue_num = 15781
    keywords = ['patch']
    message_count = 22.0
    messages = ['169116', '169118', '169122', '169124', '169129', '169132', '169133', '169134', '169150', '169151', '169154', '169217', '169218', '169220', '169223', '169225', '169229', '169230', '169231', '169423', '170028', '170087']
    nosy_count = 10.0
    nosy_names = ['brett.cannon', 'georg.brandl', 'ncoghlan', 'pitrou', 'vstinner', 'christian.heimes', 'jkloth', 'skrah', 'python-dev', 'eric.snow']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue15781'
    versions = ['Python 3.3', 'Python 3.4']

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    test_parallel_module_init() fails on Windows-64:

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/553/steps/test/logs/stdio

    ======================================================================
    FAIL: test_parallel_module_init (test.test_threaded_import.ThreadedImportTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_threaded_import.py", line 125, in test_parallel_module_init
        self.check_parallel_module_init()
      File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_threaded_import.py", line 120, in check_parallel_module_init
        self.assertFalse(errors)
    AssertionError: [AttributeError("'module' object has no attribute 'randrange'",)] is not false

    @skrah skrah mannequin added tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error labels Aug 25, 2012
    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    The buildbot in question uses -j4, and I can reproduce this on Linux
    with:

    ./python -m test -uall -F -j4 -v test_threaded_import

    ======================================================================
    FAIL: test_parallel_meta_path (test.test_threaded_import.ThreadedImportTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/stefan/hg/cpython/Lib/test/test_threaded_import.py", line 131, in test_parallel_meta_path
        self.check_parallel_module_init()
      File "/home/stefan/hg/cpython/Lib/test/test_threaded_import.py", line 120, in check_parallel_module_init
        self.assertFalse(errors)
    AssertionError: [AttributeError("'module' object has no attribute 'randrange'",), AttributeError("'module' object has no attribute 'randrange'",)] is not false

    Ran 6 tests in 4.221s

    @skrah skrah mannequin changed the title test_threaded_import fails on Windows test_threaded_import fails with -j4 Aug 25, 2012
    @pitrou
    Copy link
    Member

    pitrou commented Aug 25, 2012

    With
    $ ./python -m test -uall -F -j16 -v test_threaded_import
    I couldn't reproduce after 1000 test passes.

    @ncoghlan
    Copy link
    Contributor

    No joy here either - currently at 704 successful runs and counting with Stefan's command line.

    $ uname -a
    Linux lancre 3.5.1-1.fc17.x86_64 #1 SMP Thu Aug 9 17:50:43 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
    
    $ cat /proc/cpuinfo | grep "model name"
    model name      : Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz
    model name      : Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz
    model name      : Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz
    model name      : Intel(R) Core(TM) i7-2677M CPU @ 1.80GHz

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    I can only reproduce this on a Core i7. Native OS is Debian Wheezy,
    also reproduced earlier on a FreeBSD VM hosted on that machine
    (see msg167989). The machine has 4 physical cores, perhaps we
    can blame hyper-threading. ;)

    $ cat /proc/cpuinfo | grep "model name"
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz
    model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz

    I can't reproduce it on a Core 2 Duo running Ubuntu natively.

    @ncoghlan
    Copy link
    Contributor

    Up to 2279 runs without a failure here - yay for elusive hardware specific threading bugs :P

    I'll leave it running overnight and see what happens.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    I can also reproduce it on the Core 2 machine with a ridiculuously
    low switch interval:

     diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py
    --- a/Lib/test/regrtest.py
    +++ b/Lib/test/regrtest.py
    @@ -198,6 +198,8 @@
     except ImportError:
         multiprocessing = None
     
    +sys.setswitchinterval(0.00000001)
    +
     
     # Some times __path__ and __file__ are not absolute (e.g. while running from
     # Lib/) and, if we change the CWD to run the tests in a temporary dir, some

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    Oh, and as usual, my machines have all CPUs at 100% load.

    @ericsnowcurrently
    Copy link
    Member

    So this was a random one-off (hardly even intermittent) failed test that passed on the retry. Ah, threads. Is there an easy way to find all other failures in test_threaded_import in the past, for this and for all buildbots? Do we have information about load during tests? Is there an easy way to run the test suite under load on a build bot?

    With a large enough sample size we can make more sense of this. For instance, do failures predate the addition of per-module import locks or the importlib migration in 3.3? Are failures limited to particular platforms? Is the problem load related?

    BTW, test_threaded_import had another (different) failure a week ago on FreeBSD: bpo-15599.

    @ericsnowcurrently
    Copy link
    Member

    (sorry, only saw the first couple messages)

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 25, 2012

    On the i7 machine with hyper-threading the issue occurs since
    edb9ce3a6c2e.

    The tests also got slower:

    3430d7329a3b:

    $ time ./python -m test -uall -v  test_threaded_import

    real 0m3.195s
    user 0m0.656s
    sys 0m0.284s

    edb9ce3a6c2e:

    $ time ./python -m test -uall -v  test_threaded_import

    real 0m24.032s
    user 0m0.880s
    sys 0m0.744s

    Here most of the time is spent in test_parallel_meta_path.

    DISCLAIMER: This machine runs Debian Wheezy, which is not
    stable yet! On the other hand the Windows bot also reproduced
    the issue.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 27, 2012

    Ok, I can now reproduce with setswitchinterval(). Here is a patch. There was a race between putting the new module in sys.modules and setting its __initializing__ attribute, so now __initializing__ is set before putting the module in sys.modules.

    Also, there was another race when retrieving a module lock from the locks dict (the weakref could be destroyed between the __contains__ test and the actual fetch).

    @brettcannon
    Copy link
    Member

    Didn't patch it in to evaluate the surrounding code, but otherwise the patch LGTM.

    @birkenfeld
    Copy link
    Member

    Looks good to me. Please commit to default and I'll cherry-pick to rc2.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 27, 2012

    If possible I would like to wait for Stefan's confirmation that it fixes the failures for him.

    @birkenfeld
    Copy link
    Member

    Sure.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Aug 27, 2012

    Nice, on the finicky i7 machine test_threaded_import is fixed.

    I'm getting another error in test_importlib, both with and without
    the patch. I can open another issue for that if you think it's
    completely unrelated:

    $ ./python -m test -uall -F test_importlib      
    [  1] test_importlib
    test test_importlib failed -- Traceback (most recent call last):
      File "/home/stefan/hg/cpython/Lib/test/test_importlib/test_locks.py", line 80, in test_deadlock
        self.assertEqual(results.count((True, False)), 1)
    AssertionError: 2 != 1

    1 test failed:
    test_importlib

    @pitrou
    Copy link
    Member

    pitrou commented Aug 27, 2012

    I'm getting another error in test_importlib, both with and without
    the patch. I can open another issue for that if you think it's
    completely unrelated:

    Please do. Thanks!

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 27, 2012

    New changeset 7fa6336e9864 by Antoine Pitrou in branch 'default':
    Issue bpo-15781: Fix two small race conditions in import's module locking.
    http://hg.python.org/cpython/rev/7fa6336e9864

    @pitrou
    Copy link
    Member

    pitrou commented Aug 29, 2012

    Looks fixed!

    @pitrou pitrou closed this as completed Aug 29, 2012
    @birkenfeld
    Copy link
    Member

    Now picked into 3.3.0 release clone as 85070f284fd2.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 9, 2012

    New changeset 85070f284fd2 by Antoine Pitrou in branch 'default':
    Issue bpo-15781: Fix two small race conditions in import's module locking.
    http://hg.python.org/cpython/rev/85070f284fd2

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    release-blocker tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants