test_default_timeout() of test_threading.BarrierTests failure: BrokenBarrierError #56080

Closed
vstinner opened this issue Apr 18, 2011 · 16 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes tests Tests in the Lib/test dir

Comments

@vstinner
Member

BPO 11871
Nosy @db3l, @vstinner, @skrah, @native-api
Files
  • test_barrier_timeout.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-04-18.21:59:03.506>
    labels = ['3.8', '3.7', 'tests', '3.9']
    title = 'test_default_timeout() of test_threading.BarrierTests failure: BrokenBarrierError'
    updated_at = <Date 2022-02-04.10:23:01.964>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-02-04.10:23:01.964>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2011-04-18.21:59:03.506>
    creator = 'vstinner'
    dependencies = []
    files = ['22761']
    hgrepos = []
    issue_num = 11871
    keywords = ['patch', 'needs review']
    message_count = 15.0
    messages = ['133995', '134583', '141124', '141241', '141263', '141270', '141274', '200564', '340889', '340949', '344118', '347533', '361753', '394791', '412502']
    nosy_count = 6.0
    nosy_names = ['db3l', 'vstinner', 'skrah', 'neologix', 'python-dev', 'Ivan.Pozdeev']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue11871'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @vstinner
    Member Author

    While trying to reproduce issue bpo-11870 using "gdb -args ./python Lib/test/regrtest.py -F -v --timeout=600 test_threading", I had the following error on Linux:
    ----------------------

    test_default_timeout (test.test_threading.BarrierTests) ... [Thread 0x7ffff1acf700 (LWP 27178) exited]
    [New Thread 0x7ffff1acf700 (LWP 27181)]
    [New Thread 0x7ffff12ce700 (LWP 27182)]
    [New Thread 0x7ffff3c99700 (LWP 27183)]
    [New Thread 0x7ffff325a700 (LWP 27184)]
    Unhandled exception in thread started by <function task at 0x1302340>
    Unhandled exception in thread started by <function task at 0x1302340>
    Unhandled exception in thread started by <function task at 0x1302340>
    Traceback (most recent call last):
    Unhandled exception in thread started by <function task at 0x1302340>
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 37, in task
    Traceback (most recent call last):
    Traceback (most recent call last):
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 37, in task
    Traceback (most recent call last):
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 37, in task
        f()
        f()
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 37, in task
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 838, in f
    ERROR
    ----------------------
    
    

    ERROR: test_default_timeout (test.test_threading.BarrierTests)

    Traceback (most recent call last):
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 843, in test_default_timeout
        self.run_threads(f)
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 672, in run_threads
        f()
      File "/home/haypo/prog/HG/cpython/Lib/test/lock_tests.py", line 838, in f
        i = barrier.wait()
      File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 472, in wait
        self._enter() # Block while the barrier drains.
      File "/home/haypo/prog/HG/cpython/Lib/threading.py", line 496, in _enter
        raise BrokenBarrierError
    threading.BrokenBarrierError

    The error occurred on:

    • Ubuntu 10.04
    • Python 3.3 (2127df2c972e)
    • Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
    • 4 GB of memory

    @vstinner vstinner added the tests Tests in the Lib/test dir label Apr 18, 2011
    @neologix
    Mannequin

    neologix mannequin commented Apr 27, 2011

    The most obvious explanation for that failure is that the barrier's timeout is too low.

       def test_default_timeout(self):
           """
           Test the barrier's default timeout
           """
           #create a barrier with a low default timeout
           barrier = self.barriertype(self.N, timeout=0.1)

    If the last thread reaches the barrier more than 0.1s after the first thread, the first thread's wait times out, the barrier breaks, and every thread gets a BrokenBarrierError.
    A 0.1s delay is not that much: 100ms was the default quantum with the Linux O(1) scheduler.
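
A minimal sketch of the failure mode described above, not the code from lock_tests.py. The helper name `count_broken_waiters` and the timeout and delay values are assumptions chosen for illustration. Two threads meet at a `threading.Barrier` whose default timeout is set in the constructor, and the second thread arrives late.

```python
import threading
import time


def count_broken_waiters(timeout, delay):
    """Hypothetical helper: return how many of two waiters saw BrokenBarrierError.

    The first thread waits immediately; the second one sleeps `delay`
    seconds first, simulating a thread that gets scheduled late.
    """
    barrier = threading.Barrier(2, timeout=timeout)
    lock = threading.Lock()
    broken = 0

    def task(sleep_first):
        nonlocal broken
        time.sleep(sleep_first)
        try:
            barrier.wait()  # uses the barrier's default timeout
        except threading.BrokenBarrierError:
            with lock:
                broken += 1

    threads = [
        threading.Thread(target=task, args=(0.0,)),
        threading.Thread(target=task, args=(delay,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return broken


if __name__ == "__main__":
    # Late thread arrives after the timeout: the barrier is already broken.
    print(count_broken_waiters(timeout=0.05, delay=0.5))
    # Generous timeout: both threads pass the barrier normally.
    print(count_broken_waiters(timeout=5.0, delay=0.05))
```

In the first call the early thread times out and breaks the barrier, so the late thread also fails as soon as it calls `wait()`. That would be consistent with the tracebacks in this issue, where some threads fail in `_wait` and others in `_enter`.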

    @neologix
    Mannequin

    neologix mannequin commented Jul 25, 2011

    The attached patch bumps the barrier's default timeout to 300ms: it should be more than enough (unless you got a really crappy scheduler, or a really heavily loaded machine), especially since this problem doesn't seem to occur often (AFAICT).

    @neologix
    Mannequin

    neologix mannequin commented Jul 27, 2011

    Victor, can I commit it?

    @vstinner
    Member Author

    YES YOU CAN

    @python-dev
    Mannequin

    python-dev mannequin commented Jul 27, 2011

    New changeset aa9c0fdf2143 by Charles-François Natali in branch '3.2':
    Issue bpo-11871: In test_threading.BarrierTests, bump the default barrier timeout
    http://hg.python.org/cpython/rev/aa9c0fdf2143

    New changeset e8da570d29a8 by Charles-François Natali in branch 'default':
    Issue bpo-11871: In test_threading.BarrierTests, bump the default barrier timeout
    http://hg.python.org/cpython/rev/e8da570d29a8

    @neologix
    Mannequin

    neologix mannequin commented Jul 27, 2011

    YES YOU CAN

    :-)

    @neologix neologix mannequin closed this as completed Jul 27, 2011
    @skrah
    Mannequin

    skrah mannequin commented Oct 20, 2013

    It looks like it happened again:

    http://buildbot.python.org/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/5223/steps/test/logs/stdio

    ======================================================================
    ERROR: test_default_timeout (test.test_threading.BarrierTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/lock_tests.py", line 876, in test_default_timeout
        self.run_threads(f)
      File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/lock_tests.py", line 705, in run_threads
        f()
      File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/lock_tests.py", line 871, in f
        i = barrier.wait()
      File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 616, in wait
        self._wait(timeout)
      File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 654, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError

    Ran 138 tests in 22.880s

    @vstinner
    Member Author

    The test still fails randomly:
    https://buildbot.python.org/all/#/builders/3/builds/2469

    @vstinner vstinner reopened this Apr 26, 2019
    @db3l
    Contributor

    db3l commented Apr 26, 2019

    I should mention that a high level of test parallelism on the part of my worker might have been a contributing factor in this most recent case.

    The worker was recently upgraded to a faster 4-core VM, but with limited I/O. In a test run the test processes invariably end up stuck on I/O heavy tests, idling the CPUs.

    So I've been running the tests under -j8, as I found it the most effective combination of supporting tests stuck on I/O while keeping the CPUs busy, but it does mean that in some cases there's a lot pending on the CPUs, and depending on the exact test ordering in a run presumably some more sensitive tests could be impacted.

    I have in fact seen more random tests generating warnings (fail, then pass) than the worker had previously. I suspect the benefit of the extra parallelism on total test time (-j8 is about 20% faster than -j4) probably isn't valuable enough, and I will most likely reduce it a bit.

    @native-api
    Mannequin

    native-api mannequin commented May 31, 2019

    Got this issue today in AppVeyor's PR check: https://ci.appveyor.com/project/python/cpython/builds/24945165, so it's not local to David's worker.
    (On rerun, the test succeeded, so the check status was not affected.)

    @native-api native-api mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels May 31, 2019
    @vstinner
    Member Author

    vstinner commented Jul 9, 2019

    test_threading failed on AMD64 Windows8.1 Refleaks 3.x. The unittest errors are missing, but the unraisable exceptions look similar:
    https://buildbot.python.org/all/#/builders/80/builds/646

    0:21:11 load avg: 7.30 [156/419/3] test_threading failed (2 min 56 sec) -- running: test_concurrent_futures (7 min 40 sec), test_email (1 min 34 sec)
    beginning 6 repetitions
    123456
    .....Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000C7CAC57AF0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 41, in task
        f()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 939, in f
        i = barrier.wait()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 620, in wait
        self._wait(timeout)
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 660, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000C7CAC57AF0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 41, in task
        f()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 939, in f
        i = barrier.wait()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 620, in wait
        self._wait(timeout)
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 660, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000C7CAC57AF0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 41, in task
        f()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 939, in f
        i = barrier.wait()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 611, in wait
        self._enter() # Block while the barrier drains.
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 635, in _enter
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000C7CAC57AF0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 41, in task
        f()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 939, in f
        i = barrier.wait()
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 611, in wait
        self._enter() # Block while the barrier drains.
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\threading.py", line 635, in _enter
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    test test_threading failed -- multiple errors occurred; run in verbose mode for details

    @vstinner
    Member Author

    I haven't seen this failure recently, so I am closing the issue.

    @vstinner
    Member Author

    10 years later, the issue is still not solved. Recent failure on AMD64 Windows8.1 Refleaks PR buildbot:
    https://buildbot.python.org/all/#/builders/470/builds/45

    The issue has been making the buildbot fail since I added a threading.excepthook to libregrtest.
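
As a rough illustration of why a hook turns these background errors into visible failures, here is a minimal sketch of `threading.excepthook` (available since Python 3.8). It is not the libregrtest implementation. The helper `collect_thread_errors` is a hypothetical name, and the sketch uses `threading.Thread`, whereas the tracebacks in this issue come from threads started by the `Bunch` helper in lock_tests.py and are reported as unraisable exceptions.

```python
import threading


def collect_thread_errors(target):
    """Hypothetical helper: run `target` in a thread and return the
    exception types that reached threading.excepthook."""
    caught = []
    old_hook = threading.excepthook

    def hook(args):
        # args carries exc_type, exc_value, exc_traceback and thread.
        caught.append(args.exc_type)

    threading.excepthook = hook
    try:
        t = threading.Thread(target=target)
        t.start()
        t.join()
    finally:
        # Always restore the previous hook.
        threading.excepthook = old_hook
    return caught


def failing_task():
    # Stand-in for a barrier.wait() that timed out.
    raise threading.BrokenBarrierError


if __name__ == "__main__":
    print(collect_thread_errors(failing_task))
```

Without such a hook the exception is only printed to stderr and the thread ends quietly. A test runner that records what the hook receives can mark the test as failed or altered instead.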

    0:58:59 load avg: 4.46 [107/427/2] test_threading failed (1 min 21 sec) -- running: test_bufio (9 min 53 sec), test_asyncio (5 min 29 sec), test_compileall (19 min 3 sec)
    beginning 6 repetitions
    123456
    ..Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000F31C974050>
    Traceback (most recent call last):
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 48, in task
        f()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 998, in f
        i = barrier.wait()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 661, in wait
        self._wait(timeout)
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 701, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000F31C974050>
    Traceback (most recent call last):
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 48, in task
        f()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 998, in f
        i = barrier.wait()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 661, in wait
        self._wait(timeout)
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 701, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000F31C974050>
    Traceback (most recent call last):
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 48, in task
        f()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 998, in f
        i = barrier.wait()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 652, in wait
        self._enter() # Block while the barrier drains.
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 676, in _enter
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000F31C974050>
    Traceback (most recent call last):
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 48, in task
        f()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 998, in f
        i = barrier.wait()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 652, in wait
        self._enter() # Block while the barrier drains.
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 676, in _enter
        raise BrokenBarrierError
    threading.BrokenBarrierError: 
    test test_threading failed -- Traceback (most recent call last):
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 1003, in test_default_timeout
        self.run_threads(f)
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 832, in run_threads
        f()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\test\lock_tests.py", line 998, in f
        i = barrier.wait()
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 661, in wait
        self._wait(timeout)
      File "D:\buildarea\pull_request.ware-win81-release.refleak\build\lib\threading.py", line 699, in _wait
        raise BrokenBarrierError
    threading.BrokenBarrierError

    @vstinner vstinner reopened this May 31, 2021
    @vstinner
    Member Author

    vstinner commented Feb 4, 2022

    The race condition still exists in tests. Recent failure on AMD64 Windows8.1 Refleaks 3.x:
    https://buildbot.python.org/all/#/builders/511/builds/249

    0:03:31 load avg: 2.93 [ 42/432/1] test_threading failed (1 error) (1 min 12 sec) -- running: test_runpy (1 min 8 sec), test_pydoc (55.4 sec), test_io (1 min 29 sec)
    beginning 6 repetitions
    123456
    ...Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000CA51433CD0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 49, in task
        f()
        ^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1021, in f
        i = barrier.wait()
            ^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 683, in wait
        self._wait(timeout)
        ^^^^^^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 723, in _wait
        raise BrokenBarrierError
        ^^^^^^^^^^^^^^^^^^^^^^^^
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000CA51433CD0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 49, in task
        f()
        ^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1021, in f
        i = barrier.wait()
            ^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 683, in wait
        self._wait(timeout)
        ^^^^^^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 723, in _wait
        raise BrokenBarrierError
        ^^^^^^^^^^^^^^^^^^^^^^^^
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000CA51433CD0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 49, in task
        f()
        ^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1021, in f
        i = barrier.wait()
            ^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 683, in wait
        self._wait(timeout)
        ^^^^^^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 723, in _wait
        raise BrokenBarrierError
        ^^^^^^^^^^^^^^^^^^^^^^^^
    threading.BrokenBarrierError: 
    Warning -- Unraisable exception
    Exception ignored in thread started by: <function Bunch.__init__.<locals>.task at 0x000000CA51433CD0>
    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 49, in task
        f()
        ^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1021, in f
        i = barrier.wait()
            ^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 674, in wait
        self._enter() # Block while the barrier drains.
        ^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 698, in _enter
        raise BrokenBarrierError
        ^^^^^^^^^^^^^^^^^^^^^^^^
    threading.BrokenBarrierError: 
    test test_threading failed -- Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1026, in test_default_timeout
        self.run_threads(f)
        ^^^^^^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 855, in run_threads
        f()
        ^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\test\lock_tests.py", line 1021, in f
        i = barrier.wait()
            ^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 683, in wait
        self._wait(timeout)
        ^^^^^^^^^^^^^^^^^^^
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\Lib\threading.py", line 721, in _wait
        raise BrokenBarrierError
        ^^^^^^^^^^^^^^^^^^^^^^^^
    threading.BrokenBarrierError

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner
    Member Author

    vstinner commented Nov 3, 2022

    Sadly, I don't have time to investigate or fix this old issue, so I am just closing it.

    @vstinner vstinner closed this as completed Nov 3, 2022