test suite: enable faulthandler timeout in assert_python #63169

Closed
neologix mannequin opened this issue Sep 8, 2013 · 3 comments
Labels
tests Tests in the Lib/test dir type-feature A feature request or enhancement

Comments

neologix mannequin commented Sep 8, 2013

BPO 18969
Nosy @vstinner

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-01-07.12:27:20.250>
created_at = <Date 2013-09-08.09:52:48.518>
labels = ['type-feature', 'tests']
title = 'test suite: enable faulthandler timeout in assert_python'
updated_at = <Date 2020-01-07.12:27:20.244>
user = 'https://bugs.python.org/neologix'

bugs.python.org fields:

activity = <Date 2020-01-07.12:27:20.244>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2020-01-07.12:27:20.250>
closer = 'vstinner'
components = ['Tests']
creation = <Date 2013-09-08.09:52:48.518>
creator = 'neologix'
dependencies = []
files = []
hgrepos = []
issue_num = 18969
keywords = []
message_count = 3.0
messages = ['197239', '197242', '359505']
nosy_count = 2.0
nosy_names = ['vstinner', 'neologix']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue18969'
versions = ['Python 3.4']

neologix mannequin commented Sep 8, 2013

Currently, the test suite, as well as processes spawned by the script_helper.assert_python family, are run with faulthandler enabled.
That's great for debugging crashes, but it would be even better if those processes were also started with faulthandler's timeout:

  1. Most deadlock-prone tests are run in child processes, so in case of deadlock, you don't get any trace:

http://buildbot.python.org/all/builders/AMD64 FreeBSD 10.0 3.x/builds/353/steps/test/logs/stdio
"""
[269/380] test_threading
Timeout (1:00:00)!
Thread 0x0000000801c06400:
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/subprocess.py", line 1615 in _communicate_with_poll
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/subprocess.py", line 1535 in _communicate
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/subprocess.py", line 945 in communicate
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/script_helper.py", line 36 in _assert_python
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/script_helper.py", line 55 in assert_python_ok
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/test_threading.py", line 617 in assertScriptHasOutput
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/test_threading.py", line 692 in test_4_joining_across_fork_in_worker_thread
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/case.py", line 496 in run
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/case.py", line 535 in __call__
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 117 in run
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 79 in __call__
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 117 in run
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 79 in __call__
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 117 in run
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/suite.py", line 79 in __call__
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/unittest/runner.py", line 168 in run
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/support/__init__.py", line 1649 in _run_suite
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/support/__init__.py", line 1683 in run_unittest
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/regrtest.py", line 1275 in <lambda>
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/regrtest.py", line 1276 in runtest_inner
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/regrtest.py", line 965 in runtest
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/regrtest.py", line 761 in main
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/regrtest.py", line 1560 in main_in_temp_cwd
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/test/__main__.py", line 3 in <module>
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/runpy.py", line 73 in _run_code
File "/usr/home/buildbot/koobs-freebsd10/3.x.koobs-freebsd10/build/Lib/runpy.py", line 160 in _run_module_as_main
*** Error code 1
"""

Here, we just see that the main process is waiting for its child to complete, but we don't know anything about the child process stack.

  2. As an added benefit, this would prevent dangling child processes: when the parent is killed, they're reparented to init and can keep running arbitrarily long, consuming memory, CPU and process table entries (well, maybe the buildbot scripts kill the whole process group, I don't know).
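
For illustration, here is a minimal sketch of what "started with faulthandler's timeout" could look like for a child interpreter. It is not the actual script_helper code: `run_python_with_watchdog` is a hypothetical helper, and prepending a preamble to the child's code is just one assumed way of arming the timer. It relies only on `faulthandler.dump_traceback_later()`, which dumps the tracebacks of all threads after the given delay and, with `exit=True`, then terminates the process.

```python
import subprocess
import sys


def run_python_with_watchdog(code, timeout):
    """Run `code` in a child interpreter that dumps its own stack on timeout.

    Hypothetical helper, for illustration only (not part of script_helper).
    """
    # The preamble arms a watchdog inside the child: after `timeout`
    # seconds it writes every thread's traceback to stderr and exits.
    preamble = (
        "import faulthandler\n"
        f"faulthandler.dump_traceback_later({timeout!r}, exit=True)\n"
    )
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", preamble + code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

With this, a child that hangs reports where it is stuck instead of leaving only the parent's `communicate()` frames in the log, and it cannot outlive its own timeout.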

@neologix neologix mannequin added tests Tests in the Lib/test dir type-feature A feature request or enhancement labels Sep 8, 2013
vstinner commented Sep 8, 2013

I see two options:

  • faulthandler calls killpg(SIGABRT) on timeout to kill child processes (but it should temporarily ignore the signal so that it does not kill itself)
  • use a timeout, but shorter than the global timeout, for child processes
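
The first option can be sketched in pure Python as follows. This is only an illustration of the "ignore the signal, then signal the whole group" idea, not faulthandler's implementation (which is written in C); `signal_own_process_group` is a hypothetical name. It assumes a POSIX system, that it is called from the main thread, and that the children are still in the caller's process group.

```python
import os
import signal


def signal_own_process_group(signum=signal.SIGABRT):
    """Send `signum` to the caller's process group without killing the caller.

    Hypothetical helper, for illustration only.
    """
    # Ignore the signal first: an ignored signal is discarded on delivery,
    # so the copy sent to this process has no effect.
    previous = signal.signal(signum, signal.SIG_IGN)
    try:
        os.killpg(os.getpgrp(), signum)
    finally:
        # `previous` is None when the old handler was not installed from
        # Python; in that case it cannot be restored from here.
        if previous is not None:
            signal.signal(signum, previous)
```

Note that this hits every process in the group, including the caller's parent if it shares the group, so it is only safe when the test runner is the group leader.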

Not all tests use script_helper. But this is probably a different issue ;-)

vstinner commented Jan 7, 2020

I modified regrtest to use process groups in bpo-38502. It doesn't solve exactly this issue, but it does fix the overall problem of leaking running processes when a test fails for various reasons. For example, when using regrtest in multiprocessing mode (-jN), if a test times out, child processes of this test will now be killed.
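
The general process-group technique looks roughly like this. It is a sketch under assumptions, not regrtest's code: `run_in_own_group` is a hypothetical helper, it is POSIX-only, and SIGKILL is an arbitrary choice of signal. The worker is started in a new session, so it becomes the leader of a fresh process group whose ID equals its PID, and on timeout the whole group is signalled.

```python
import os
import signal
import subprocess
import sys


def run_in_own_group(args, timeout):
    """Run `args` in a new session; on timeout, kill its whole process group.

    Hypothetical helper, for illustration only (POSIX-only).
    """
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        start_new_session=True,  # the child calls setsid(): pgid == proc.pid
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the worker and everything it spawned in the same group.
        os.killpg(proc.pid, signal.SIGKILL)
        # Reads until EOF, i.e. until every holder of the pipe is gone.
        out, _ = proc.communicate()
    return proc.returncode, out
```

Processes that move themselves into another session or group escape this, so it reduces leaked processes rather than ruling them out.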

@vstinner vstinner closed this as completed Jan 7, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022