classification
Title: Timeouts on "x86 Ubuntu Shared 3.x" buildbot
Type: Stage:
Components: Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, davin, vstinner
Priority: normal Keywords:

Created on 2015-03-25 02:42 by vstinner, last changed 2015-03-30 20:18 by vstinner. This issue is now closed.

Messages (7)
msg239214 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-25 02:42
First timeout:
http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/11358

This build was only triggered by one changeset 0b99d7043a99: "Issue #23694: Enhance _Py_open(), it now raises exceptions".

 * _Py_open() now raises exceptions on error. If open() fails, it raises an OSError with the filename.
 * _Py_open() now releases the GIL while calling open()
 * Add _Py_open_noraise() when _Py_open() cannot be used because the GIL is not held

Example of subprocess timeout:
---
Timeout (1:00:00)!
Thread 0x55aaedc0 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1502 in _try_wait
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1552 in wait
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 58 in tearDown
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 580 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 625 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/runner.py", line 176 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1773 in _run_suite
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1807 in run_unittest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 2532 in test_main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1284 in runtest_inner
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 967 in runtest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 763 in main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/__main__.py", line 3 in <module>
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 85 in _run_code
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 170 in _run_module_as_main
make: *** [buildbottest] Error 1
---

Example of multiprocessing timeout:
---
[240/393] test_multiprocessing_spawn
Timeout (1:00:00)!
Thread 0x68fddb40 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 955 in wait
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 834 in f
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 871 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 923 in _bootstrap_inner
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x58a5ab40 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 955 in wait
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 834 in f
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 871 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 923 in _bootstrap_inner
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 891 in _bootstrap

(...)

Thread 0x55aaedc0 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 943 in acquire
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 933 in test_notify_all
  ...
---
msg239217 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-25 03:06
test_multiprocessing_spawn.test_notify_all() also hangs on "AMD64 Debian root 3.x" buildbot:

http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/1959/steps/test/logs/stdio
---
[330/393] test_multiprocessing_spawn
Timeout (1:00:00)!
Thread 0x00002b303f269700 (most recent call first):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b303d918700 (most recent call first):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b3060947700 (most recent call first):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b303289db20 (most recent call first):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 933 in test_notify_all
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 577 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 625 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/runner.py", line 176 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1773 in _run_suite
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1807 in run_unittest
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1283 in test_runner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1284 in runtest_inner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 967 in runtest
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 763 in main
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/__main__.py", line 3 in <module>
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 85 in _run_code
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 170 in _run_module_as_main
---
msg239346 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-26 17:37
A new failure test_subprocess.test_double_close_on_error:

http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/11411/steps/test/logs/stdio
---
Timeout (1:00:00)!
Thread 0x55aafdc0 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1407 in _execute_child
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 855 in __init__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 1074 in test_double_close_on_error
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 577 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 625 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/runner.py", line 176 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1773 in _run_suite
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1807 in run_unittest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 2532 in test_main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1284 in runtest_inner
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 967 in runtest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 763 in main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/__main__.py", line 3 in <module>
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 85 in _run_code
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 170 in _run_module_as_main
make: *** [buildbottest] Error 1
---
msg239469 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2015-03-29 00:55
@haypo:  Could you please take a look at applying my patches from issue23713 to address an intermittent multiprocessing test failure?  I don't know that it is the sole cause, but it is indeed one potential source of the misbehavior.
msg239470 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2015-03-29 00:56
@haypo: I didn't think that one through -- the consequences of that fragile test are not a hang.  It's unrelated.
msg239540 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-30 00:28
> This build was only triggered by one changeset 0b99d7043a99: "Issue #23694: Enhance _Py_open(), it now raises exceptions".

I was reproduce to issue on a buildbot and I got access to the buildbot. Using gdb, I saw that a process was stuck in _close_open_fds_safe(). The problem is that this function is called after fork() to run a child process. It's not safe to play with the GIL here.

This bug is the regression which makes some buildbots to hang.

I fixed the bug in the changeset 2e1234208bab.

I keep the issue open a few days to check if buildbots are repaired or not.
msg239639 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-30 20:18
Last 5 builds of "x86 Ubuntu Shared 3.x" and "AMD64 Debian root 3.x" buildbots are green (success). The sporadic hang is gone! I close the issue.
History
Date User Action Args
2015-03-30 20:18:58vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg239639
2015-03-30 00:28:40vstinnersetmessages: + msg239540
2015-03-29 00:56:37davinsetmessages: + msg239470
2015-03-29 00:55:03davinsetnosy: + davin
messages: + msg239469
2015-03-28 19:39:05Arfreversetnosy: + Arfrever
2015-03-26 17:37:16vstinnersetmessages: + msg239346
2015-03-25 03:06:58vstinnersetmessages: + msg239217
2015-03-25 02:42:02vstinnercreate