classification
Title: test_multiprocessing_forkserver hangs on the master branch
Type: Stage:
Components: macOS, Tests Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: haypo, mattbillenstein, ned.deily, pitrou, ronaldoussoren
Priority: normal Keywords: buildbot

Created on 2017-06-19 15:20 by haypo, last changed 2017-06-22 16:11 by pitrou.

Messages (6)
msg296348 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-06-19 15:20
regrtest hangs while running test_multiprocessing_forkserver in a worker process on the "x86-64 Sierra 3.x" buildbot.

Bug seen twice: build 343 and 337.

http://buildbot.python.org/all/builders/x86-64%20Sierra%203.x/builds/337/steps/test/logs/stdio

Run tests in parallel using 2 child processes
...
0:29:36 load avg: 5.24 [402/406] test_deque passed -- running: test_multiprocessing_forkserver (1346 sec)
0:29:37 load avg: 5.24 [403/406] test_descrtut passed -- running: test_multiprocessing_forkserver (1346 sec)
0:29:37 load avg: 4.90 [404/406] test_weakset passed -- running: test_multiprocessing_forkserver (1347 sec)
0:29:40 load avg: 4.90 [405/406] test_distutils passed -- running: test_multiprocessing_forkserver (1350 sec)

command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2983.908679
msg296376 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-06-19 21:17
Different issue, but same behaviour: test hangs and then killed by buildbot, whereas a single test was still running.

See also bpo-30351 which tracks similar bugs, but on Python 2.7.

I really hate such bugs :-(

http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/928/steps/test/logs/stdio

...
0:15:10 load avg: 0.52 [404/406] test_unicodedata passed -- running: test_pydoc (401 sec)
0:15:10 load avg: 0.52 [405/406] test_exception_hierarchy passed -- running: test_pydoc (401 sec)

command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2111.415970
msg296419 - (view) Author: Matt Billenstein (mattbillenstein) Date: 2017-06-20 12:49
I've been debugging this and I can repro on El Capitan on a different machine as well -- it's an infrequent hang, I've been running in a loop:

mattb@mattb-mbp:~/src/misc/cpython master$ for i in $(seq 1000); do echo "Run: $i -- $(date)"; time./python.exe -m test --timeout 180 test_multiprocessing_forkserver || break; sleep 10; done

And in this case after ~30 successful runs:

Run: 31 -- Tue Jun 20 05:12:19 PDT 2017
Run tests sequentially
0:00:00 load avg: 4.28 [1/1] test_multiprocessing_forkserver
Timeout (0:03:00)!
Thread 0x00007fffa1b3a3c0 (most recent call first):
  File "/Users/mattb/src/misc/cpython/Lib/selectors.py", line 415 in select
  File "/Users/mattb/src/misc/cpython/Lib/multiprocessing/connection.py", line 916 in wait
  File "/Users/mattb/src/misc/cpython/Lib/multiprocessing/popen_fork.py", line 45 in wait
  File "/Users/mattb/src/misc/cpython/Lib/multiprocessing/process.py", line 121 in join
  File "/Users/mattb/src/misc/cpython/Lib/test/_test_multiprocessing.py", line 431 in test_many_processes
  File "/Users/mattb/src/misc/cpython/Lib/unittest/case.py", line 605 in run
  File "/Users/mattb/src/misc/cpython/Lib/unittest/case.py", line 653 in __call__
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/mattb/src/misc/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/mattb/src/misc/cpython/Lib/test/support/__init__.py", line 1772 in run
  File "/Users/mattb/src/misc/cpython/Lib/test/support/__init__.py", line 1896 in _run_suite
  File "/Users/mattb/src/misc/cpython/Lib/test/support/__init__.py", line 1936 in run_unittest
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/runtest.py", line 168 in test_runner
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/runtest.py", line 169 in runtest_inner
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/runtest.py", line 137 in runtest
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/main.py", line 374 in run_tests_sequential
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/main.py", line 454 in run_tests
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/main.py", line 530 in _main
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/main.py", line 504 in main
  File "/Users/mattb/src/misc/cpython/Lib/test/libregrtest/main.py", line 573 in main
  File "/Users/mattb/src/misc/cpython/Lib/test/__main__.py", line 2 in <module>
  File "/Users/mattb/src/misc/cpython/Lib/runpy.py", line 85 in _run_code
  File "/Users/mattb/src/misc/cpython/Lib/runpy.py", line 193 in _run_module_as_main

real    3m0.115s
user    0m5.804s
sys     0m0.683s

I also ran ~600 passes overnight on this test on an Ubuntu 16.04 machine without a hang, so it seems to be OSX specific.
msg296420 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-06-20 12:53
> I've been debugging this and I can repro on El Capitan on a different machine as well -- it's an infrequent hang, I've been running in a loop:

Hum, a timeout of 3 minutes seems short for test_multiprocessing_forkserver. How long does it take *usually* to run this test?

At least, we have a first clue: test_many_processes() of Lib/test/_test_multiprocessing.py.
msg296460 - (view) Author: Matt Billenstein (mattbillenstein) Date: 2017-06-20 15:39
It consistently takes between ~61 and ~73 seconds with this setup.
msg296641 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-06-22 16:11
Matt, if you try the following command, it will run the specific test in a loop in verbose mode:

$ ./python -m test --timeout=30 -F -m test_many_processes -v test_multiprocessing_forkserver
History
Date User Action Args
2017-06-22 16:11:26pitrousetnosy: + pitrou
messages: + msg296641
2017-06-22 16:05:14pitrousettitle: regrtest hangs on the master branch -> test_multiprocessing_forkserver hangs on the master branch
2017-06-20 15:39:59mattbillensteinsetmessages: + msg296460
2017-06-20 12:53:02hayposetmessages: + msg296420
2017-06-20 12:49:28mattbillensteinsetnosy: + mattbillenstein
messages: + msg296419
2017-06-19 21:17:37hayposettitle: test_multiprocessing_forkserver hangs on x86-64 Sierra 3.x. buildbot -> regrtest hangs on the master branch
2017-06-19 21:17:11hayposetmessages: + msg296376
2017-06-19 15:21:04hayposetnosy: + ronaldoussoren, ned.deily
components: + macOS
2017-06-19 15:20:57haypocreate