This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add a --timeout option to regrtest.py using the faulthandler module
Type: Stage:
Components: Tests Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, neologix, pitrou, python-dev, vstinner
Priority: normal Keywords: buildbot, patch

Created on 2011-03-31 10:19 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
regrtest_timeout.patch vstinner, 2011-03-31 10:19 review
Messages (22)
msg132643 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 10:19
I would be nice to add a timeout option to regrtest.py to be able to dump the traceback after TIMEOUT seconds and also exit the process. It would help debugging timeout issues (e.g. test_multiprocessing ), but also protect buildbots against evil (long) tests.

Attached patch implements this feature.
msg132647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 10:43
I created a new Mercurial repository to test this issue: features/regrtest_timeout.

Let's try it with a timeout of 5 minutes on:
 - "x86 Tiger custom" to learn more about test_threadsignals timeout (1 hour) => issue #11223
 - "x86 FreeBSD 7.2 3.x" to learn more about test_multiprocessing failures (timeout of 1 hour sometimes) => issue #8428

http://www.python.org/dev/buildbot/all/builders/x86%20Tiger%20custom/builds/2
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%20custom/builds/3
msg132651 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 11:28
Try the branch on "PPC Leopard 3.x" (test_subprocess timeout):

http://www.python.org/dev/buildbot/all/builders/PPC%20Leopard%20custom/builds/6
msg132654 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-31 11:59
Just a nit but:
"if timeout and 0 < timeout"

Could you use the usual coding style? ("if timeout and timeout > 0")

Putting constants on the left-side of comparisons is pointless in Python, and even in C it's quite ugly and surprising.
msg132655 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-31 12:01
> "x86 Tiger custom" to learn more about test_threadsignals timeout 
> (1 hour)

Please read the comments. There is no timeout anymore.
msg132656 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 12:03
>> "x86 Tiger custom" to learn more about test_threadsignals timeout 
>> (1 hour)
>
> Please read the comments. There is no timeout anymore.

The last build failed with:
---
...
[213/354] test_plistlib
[214/354] test_minidom
[215/354] test_threadsignals

command timed out: 3600 seconds without output, killing pid 24165
process killed by signal 9
---
http://www.python.org/dev/buildbot/all/builders/x86%20Tiger%203.x/builds/2262/steps/test/logs/stdio
msg132659 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 12:22
I ran the full test suite on "x86 Tiger custom" on my features/regrtest_timeout repository: no failure (test_threadsignals failure was not reproduced).

On "PPC Leopard custom", the test suite was interrupted (timeout) on test_io.test_basic_io:
----
...
[ 53/354] test_xdrlib
[ 54/354] test_descr
[ 55/354] test_io
Current thread 0xa09e1820:
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/codecs.py", line 302 in decode
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/_pyio.py", line 1378 in decode
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/_pyio.py", line 1682 in _read_chunk
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/_pyio.py", line 2002 in readline
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/test_io.py", line 1964 in multi_line_test
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/test_io.py", line 1945 in test_basic_io
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/case.py", line 442 in run
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/case.py", line 494 in __call__
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/suite.py", line 105 in run
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/suite.py", line 67 in __call__
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/suite.py", line 105 in run
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/unittest/suite.py", line 67 in __call__
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/support.py", line 1078 in run
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/support.py", line 1166 in _run_suite
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/support.py", line 1192 in run_unittest
  File "/Users/buildbot/buildarea/custom.parc-leopard-1/build/Lib/test/test_io.py", line 2845 in test_main
  File "./Lib/test/regrtest.py", line 1029 in runtest_inner
  File "./Lib/test/regrtest.py", line 823 in runtest
  File "./Lib/test/regrtest.py", line 649 in main
  File "./Lib/test/regrtest.py", line 1604 in <module>
make: *** [buildbottest] Error 1
program finished with exit code 2
---
http://www.python.org/dev/buildbot/all/builders/PPC%20Leopard%20custom/builds/6

Cool! The traceback was printed correctly, the process exited correctly , and the build stopped also correctly. The full (interrupted) build took 27 min 41 sec, it takes usually 56 min 44 sec.

So a timeout of 5 minutes on each test (each function of each test file) is not enough on PPC Leopard buildslave. Anyway, I plan to use something closer to the actual buildbot timeout (1200 or 3600 seconds). And it would be better if the buildbot can choose its own regrtest.py timeout.
msg132667 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 13:25
Result on "x86 FreeBSD 7.2 custom":
--------------------
...
[148/354] test_fileinput
[149/354] test_subprocess
[36645 refs]
[36645 refs]
...
[36647 refs]
[44034 refs]
Current thread 0x28401040:
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/subprocess.py", line 1282 in _execute_child
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/subprocess.py", line 766 in __init__
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/test/test_subprocess.py", line 687 in test_leaking_fds_on_error
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/custom.bolen-freebsd7/build/Lib/test/test_subprocess.py", line 1607 in test_main
  File "./Lib/test/regrtest.py", line 1029 in runtest_inner
  File "./Lib/test/regrtest.py", line 823 in runtest
  File "./Lib/test/regrtest.py", line 649 in main
  File "./Lib/test/regrtest.py", line 1604 in <module>
program finished with exit code 1
elapsedTime=2100.157408
--------------------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%20custom/builds/3

So test_subprocess does timeout (5 min) because of test_leaking_fds_on_error, it may be related to #8052. subprocess.MAXFD=655000 on FreeBSD: create a subprocess requires to call close() system calls 655,000 times and test_leaking_fds_on_error() creates 1024 subprocesses.

There is something interesting in this output: the test uses a subprocess and we only have the traceback of the parent. It may be nice to have the trace of the child process. It might be possible by sending a signal to the child process (but how can we get the list of the child processes in a C signal handler?).
msg132672 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-03-31 14:25
> There is something interesting in this output: the test uses a subprocess and we only have the traceback of the parent. It may be nice to have the trace of the child process. It might be possible by sending a signal to the child process (but how can we get the list of the child processes in a C signal handler?).

I don't think you can find that, but you could send a signal to the whole process group:
if (getpgrp() == getpid()) {
    kill(-getpgrp(), <signal>);
}

The getpgrp() == getpid() makes sure that you'll only do that if the current process is the group leader (and it's async-safe). You'll probably want to block <signal> in the parent's handler first.
Note that it won't work if your child process calls setsid(), of course (start_new_session argument to Popen).
msg132682 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-03-31 16:16
New changeset bdc946dc512a by Victor Stinner in branch 'default':
Issue #11727: add --timeout option to regrtest (disabled by default).
http://hg.python.org/cpython/rev/bdc946dc512a
msg132684 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-03-31 16:28
New changeset 0f3dbf93ac06 by Victor Stinner in branch 'default':
Issue #11727: Antoine and Arfrever don't like "0 < timeout" style (regrtest.py)
http://hg.python.org/cpython/rev/0f3dbf93ac06
msg132685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-03-31 16:37
Patch commited into 3.3 (bdc946dc512a, 0f3dbf93ac06). I opened a thread on python-dev to decide if we turn this on by default on buildbots.
msg132739 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-01 14:00
New changeset 15f6fe139181 by Victor Stinner in branch 'default':
Issue #11727: set regrtest default timeout to 15 minutes
http://hg.python.org/cpython/rev/15f6fe139181
msg132746 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 15:38
Great! The timeout works:
-------------------------------------
...
[ 25/354] test_threadsignals
Thread 0xa000ed88:
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_threadsignals.py", line 46 in test_signals
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/case.py", line 442 in run
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/case.py", line 494 in __call__
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/suite.py", line 105 in run
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/suite.py", line 67 in __call__
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/suite.py", line 105 in run
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/unittest/suite.py", line 67 in __call__
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/support.py", line 1078 in run
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/support.py", line 1166 in _run_suite
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/support.py", line 1192 in run_unittest
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_threadsignals.py", line 210 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
make: *** [buildbottest] Error 1
program finished with exit code 2
elapsedTime=1103.489396
-------------------------------------
http://www.python.org/dev/buildbot/all/builders/PPC%20Tiger%203.x/builds/1685/steps/test/logs/stdio

Compare it without the same failure without the timeout:
-------------------------------------
...
[106/354] test_threadsignals

command timed out: 3600 seconds without output, killing pid 1745
process killed by signal 9
program finished with exit code -1
elapsedTime=4785.840329
-------------------------------------
http://www.python.org/dev/buildbot/all/builders/PPC%20Tiger%203.x/builds/1684/steps/test/logs/stdio

Thanks to this timeout, the build stops after 18 minutes instead of 80 minutes.... 18 minutes including the timeout of 15 minutes :-)

test_threadsignals hangs at:
----------------
class ThreadSignals(unittest.TestCase):
    def test_signals(self):
        signalled_all.acquire()
        self.spawnSignallingThread()
        signalled_all.acquire() <~~~~ here
        ...
----------------

self.spawnSignallingThread() calls:
----------------
def send_signals():
    os.kill(process_pid, signal.SIGUSR1)
    os.kill(process_pid, signal.SIGUSR2)
    signalled_all.release()
----------------
msg132752 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-01 16:40
New changeset 053bc5ca199b by Victor Stinner in branch 'default':
Issue #11727: set regrtest default timeout to 30 minutes
http://hg.python.org/cpython/rev/053bc5ca199b
msg132755 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 16:49
Failures on the default branch with a timeout of 15 minutes:

 - test_io on "x86 FreeBSD 3.x"
 - test_signal on "x86 FreeBSD 7.2 3.x"
 - test_stdin_none on "sparc solaris10 gcc 3.x"

I think that the tests failed because the buildbot is slow, not because of a hang. These tests did not fail before. So I changed the timeout from 15 minutes to 30 minutes.

Trace of the "x86 FreeBSD 3.x" failure:
--------------
...
[  2/354] test_io
Thread 0x081f3000:
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 574 in assertRaises
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2647 in check_interrupted_write
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2669 in test_interrupted_write_buffered
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2845 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
--------------

Trace of the "x86 FreeBSD 7.2 3.x" failure:
-----------
...
[ 19/354] test_signal
Thread 0x28401040:
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_signal.py", line 436 in test_itimer_real
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_signal.py", line 490 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
-----------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%203.x/builds/1643/steps/test/logs/stdio

Trace of the Solaris failure:
------------
...
[277/354] test_subprocess
[37361 refs]
[37361 refs]
[37366 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37363 refs]
[37363 refs]
[37362 refs]
[37591 refs]
[37363 refs]
[37365 refs]
[37363 refs]
[37363 refs]
[44594 refs]
[37590 refs]
[37363 refs]
[37362 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37364 refs]
[37362 refs]
.
[37364 refs]
[37362 refs]
    this bit of output is from a test of stdout in a different process ...
[37362 refs]
[37362 refs]
[37590 refs]
[37590 refs]
[37363 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37363 refs]
[37372 refs]
[37372 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37364 refs]
[37365 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37362 refs]
[37364 refs]
[37363 refs]
[37368 refs]
[37366 refs]
[37368 refs]
[37366 refs]
[37363 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37366 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37372 refs]
[37372 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37363 refs]
[37363 refs]
[37362 refs]
[37591 refs]
[37363 refs]
[37365 refs]
[37363 refs]
[37363 refs]
[44594 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37372 refs]
[37363 refs]
[37364 refs]
[37590 refs]
[37365 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37362 refs]
[37364 refs]
[37363 refs]
[37362 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37364 refs]
[37362 refs]
.
[37364 refs]
[37362 refs]
    this bit of output is from a test of stdout in a different process ...
[37362 refs]
[37363 refs]
[37368 refs]
[37366 refs]
[37368 refs]
[37366 refs]
[37362 refs]
[37590 refs]
[37590 refs]
[37361 refs]
[37361 refs]
[37366 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37363 refs]
[37363 refs]
[37362 refs]
[37591 refs]
[37363 refs]
[37365 refs]
[37363 refs]
[37363 refs]
[44594 refs]
[37590 refs]
[37363 refs]
[37362 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37364 refs]
[37362 refs]
.
[37364 refs]
[37362 refs]
    this bit of output is from a test of stdout in a different process ...
[37362 refs]
[37362 refs]
[37590 refs]
[37590 refs]
[37361 refs]
[37361 refs]
[37366 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37363 refs]
[37361 refs]
[37361 refs]
[37361 refs]
[37363 refs]
[37363 refs]
[37362 refs]
[37591 refs]
[37363 refs]
[37365 refs]
[37363 refs]
[37363 refs]
[44594 refs]
[37590 refs]
[37363 refs]
[37363 refs]
[37362 refs]
[37363 refs]
[37363 refs]
Thread 0x00000001:
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/subprocess.py", line 466 in _eintr_retry_call
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/subprocess.py", line 1479 in _try_wait
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/subprocess.py", line 1521 in wait
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test_subprocess.py", line 155 in test_stdin_none
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 442 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 494 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1078 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1166 in _run_suite
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1192 in run_unittest
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test_subprocess.py", line 1607 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
*** Error code 1
make: Fatal error: Command failed for target `buildbottest'
------------
http://www.python.org/dev/buildbot/all/builders/sparc%20solaris10%20gcc%203.x/builds/2861/steps/test/logs/stdio
msg132768 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 21:55
(oh, "sparc solaris10 gcc 3.x" failed on test_subprocess, test_stdin_none is the function, not the file)

With a timeout of 30 minutes:

 - "sparc solaris10 gcc 3.x" doesn't fail anymore.
 - "x86 FreeBSD 3.x" still fails on the same test: test_io (test_interrupted_write_buffered)
 - "x86 FreeBSD 7.2 3.x" fails on another test: test_socket (test_sendall_interrupted)

"x86 FreeBSD 7.2 3.x" trace:
----------
...
[201/354] test_socket
Thread 0x28401040:
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 729 in check_sendall_interrupted
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 740 in test_sendall_interrupted
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_socket.py", line 2052 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
----------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%203.x/builds/1644/steps/test/logs/stdio

"x86 FreeBSD 3.x" trace:
----------
...
[ 54/354] test_io
Thread 0x081f3000:
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 574 in assertRaises
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2647 in check_interrupted_write
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2669 in test_interrupted_write_buffered
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_io.py", line 2845 in test_main
  File "./Lib/test/regrtest.py", line 1032 in runtest_inner
  File "./Lib/test/regrtest.py", line 826 in runtest
  File "./Lib/test/regrtest.py", line 650 in main
  File "./Lib/test/regrtest.py", line 1607 in <module>
----------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%203.x/builds/1343/steps/test/logs/stdio

Reopen the issue because the story is not completly finished.
msg132770 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 22:18
Because the timeout was 30 minutes, we missed where test_ssl hangs on "x86 Windows7 3.x":
--------------
...
[259/354] test_filecmp
[260/354] test_ssl

command timed out: 1200 seconds without output, killing pid 3012
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
]
--------------
http://www.python.org/dev/buildbot/all/builders/x86%20Windows7%203.x/builds/2877/steps/test/logs/stdio

It looks like this buildbot has some network issues:

 * build 2877 failed on test_ssl
 * build 2872 failed on test_poplib
 * build 2866 failed on test_ftplib
 * build 2865 failed on test_smtplib
 * build 2864 failed on test_socket
 * ...

Note: build 2869 failed on test_multiprocessing.
msg132772 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 22:37
For test_subprocess timeout, see also:
http://bugs.python.org/issue8429#msg103368
For test_io timeout, see also: issue #8431
msg132775 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-01 22:57
Timeout of 15 minutes on "x86 XP-4 3.x":
----------------------------
...
[334/354] test_threading
Thread 0x0000024c:
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\threading.py", line 235 in wait
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 392 in f
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 37 in task

Thread 0x000002d8:
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\threading.py", line 235 in wait
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 392 in f
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 37 in task

Thread 0x00000da4:
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\threading.py", line 235 in wait
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 392 in f
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 37 in task

Thread 0x00000bb0:
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 16 in _wait
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 416 in _check_notify
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\lock_tests.py", line 433 in test_notify
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\case.py", line 387 in _executeTestPart
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\case.py", line 442 in run
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\case.py", line 494 in __call__
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\suite.py", line 105 in run
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\suite.py", line 67 in __call__
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\suite.py", line 105 in run
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\unittest\suite.py", line 67 in __call__
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\support.py", line 1078 in run
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\support.py", line 1166 in _run_suite
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\support.py", line 1192 in run_unittest
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_threading.py", line 728 in test_main
  File "../lib/test/regrtest.py", line 1032 in runtest_inner
  File "../lib/test/regrtest.py", line 826 in runtest
  File "../lib/test/regrtest.py", line 650 in main
  File "../lib/test/regrtest.py", line 1607 in <module>
s_push: parser stack overflow
program finished with exit code 1
elapsedTime=2601.059000
----------------------------
http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%203.x/builds/4317/steps/test/logs/stdio

Hum, it looks like we have 4 threads, and all threads are waiting.

=> See issue #8799 and maybe also #4188 and #5114.
msg132890 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-03 21:46
New changeset 394f0ea0d29e by Victor Stinner in branch 'default':
Issue #11727, issue #11753, issue #11755: disable regrtest timeout
http://hg.python.org/cpython/rev/394f0ea0d29e
msg132999 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-05 00:48
regrtest default timeout is now 30 minutes.

I opened specific issues for each failure:

 * test_sendall_interrupted() of test_socket: issue #11753
 * test_itimer_real() of test_signal: issue #11755
 * test_threadsignals: issue #11768
 * test_notify() of test_threading: issue #11769

If you get new timeout issue, it's now better to open a new specific issue.
History
Date User Action Args
2022-04-11 14:57:15adminsetgithub: 55936
2011-04-05 00:48:42vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg132999
2011-04-03 21:46:45python-devsetmessages: + msg132890
2011-04-01 22:57:44vstinnersetmessages: + msg132775
2011-04-01 22:37:05vstinnersetmessages: + msg132772
2011-04-01 22:18:12vstinnersetmessages: + msg132770
2011-04-01 21:55:51vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg132768
2011-04-01 16:49:20vstinnersetmessages: + msg132755
2011-04-01 16:40:21python-devsetmessages: + msg132752
2011-04-01 15:38:55vstinnersetmessages: + msg132746
2011-04-01 14:00:12python-devsetmessages: + msg132739
2011-03-31 16:37:05vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg132685
2011-03-31 16:28:26python-devsetmessages: + msg132684
2011-03-31 16:16:37python-devsetnosy: + python-dev
messages: + msg132682
2011-03-31 14:25:56neologixsetnosy: + neologix
messages: + msg132672
2011-03-31 14:00:02Arfreversetnosy: + Arfrever
2011-03-31 13:25:17vstinnersetmessages: + msg132667
2011-03-31 12:22:58vstinnersetmessages: + msg132659
2011-03-31 12:03:44vstinnersetmessages: + msg132656
2011-03-31 12:01:59pitrousetmessages: + msg132655
2011-03-31 11:59:56pitrousetnosy: + pitrou
messages: + msg132654
2011-03-31 11:28:04vstinnersetmessages: + msg132651
2011-03-31 10:43:59vstinnersetmessages: + msg132647
2011-03-31 10:19:44vstinnercreate