This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: test_asyncio: test_subprocess_send_signal hangs on Fedora builders
Type: behavior
Stage:
Components: asyncio, Tests
Versions: Python 3.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: gvanrossum, opoplawski, python-dev, socketpair, vstinner, yselivanov
Priority: normal
Keywords: buildbot, patch

Created on 2014-04-15 21:33 by opoplawski, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
test_signal.out opoplawski, 2014-04-15 21:33 test strace
test_send_signal.patch vstinner, 2014-07-16 17:07
Messages (12)
msg216392 - Author: Orion Poplawski (opoplawski) Date: 2014-04-15 21:33
Trying to build Python 3.4.0 for Fedora, we are seeing test_asyncio's test_subprocess_send_signal hang every time, on all architectures. Unfortunately I cannot reproduce this locally. These builds are done inside chroots, and the host runs kernel version 3.12.8-300.fc20, which is used for all build targets. We see hangs building for Fedora Rawhide and RHEL 7. We do *not* see hangs on our COPR builders, which among other possible differences use RHEL6 hosts with kernel 2.6.32-358.el6.

I've attached an strace of the hanging test.  The calling process seems to be stuck in epoll_wait().

I tried using the watchdog patch from issue #19652, but that doesn't seem to manage to kill things. In fact, the tests are never killed except by the 1 hour timeout in the test runner.
msg216397 - Author: Orion Poplawski (opoplawski) Date: 2014-04-15 21:49
Hmm, looking at things a little closer, it looks like the SIGHUP is arriving very early, perhaps too early?
msg216407 - Author: Orion Poplawski (opoplawski) Date: 2014-04-15 22:23
It may also be possible that something has set the SIGHUP handler to SIG_IGN when the test is run.
msg216409 - Author: Orion Poplawski (opoplawski) Date: 2014-04-15 22:38
Looks like in the Fedora koji builds, the SIGHUP sigaction is set to SIG_IGN, so the processes that the Python tests are trying to kill with SIGHUP do not die. Perhaps the koji builders should not be doing that, or perhaps the Python tests should reset the SIGHUP sigaction to SIG_DFL.
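
To illustrate the inheritance described above, here is a minimal sketch (not taken from the CPython test suite). It ignores SIGHUP in the parent, spawns a child interpreter, and checks whether the child starts with SIG_IGN, with and without resetting the disposition before exec. The names CHILD_CODE, _reset_sighup and child_ignores_sighup are hypothetical, and using preexec_fn for the reset is an assumption chosen for illustration.

```python
import signal
import subprocess
import sys

# Hypothetical child program: reports whether it started with SIGHUP ignored.
CHILD_CODE = "import signal; print(signal.getsignal(signal.SIGHUP) == signal.SIG_IGN)"


def _reset_sighup():
    # Runs in the child between fork() and exec().
    signal.signal(signal.SIGHUP, signal.SIG_DFL)


def child_ignores_sighup(reset=False):
    """Spawn a child interpreter; return True if it sees SIGHUP as SIG_IGN."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=_reset_sighup if reset else None,
    )
    return out.strip() == b"True"


# Simulate the builder environment: ignore SIGHUP in the parent.
previous = signal.signal(signal.SIGHUP, signal.SIG_IGN)
try:
    inherited = child_ignores_sighup()
    after_reset = child_ignores_sighup(reset=True)
finally:
    # Restore the original disposition (None means it was not set from Python).
    if previous is not None:
        signal.signal(signal.SIGHUP, previous)
```

Here `inherited` should come out true and `after_reset` false, since an ignored signal stays ignored across exec unless something resets it.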
msg216434 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-16 05:42
This issue is a race condition or bug in the unit test, not in asyncio. The test doesn't check whether echo.py is running, or even whether Python has started.

Python doesn't set up a handler for SIGHUP; it uses the inherited one. On my Fedora 20, it appears to be "SIG_DFL":

Python 3.5.0a0 (default:795d90c7820d+, Apr 16 2014, 00:18:50) 
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import signal
>>> signal.getsignal(signal.SIGHUP)
<Handlers.SIG_DFL: 0>

Extract of the attached strace:
---
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f9d1e8cba10) = 24719
Process 24719 attached
...
[pid 24719] rt_sigaction(SIGHUP, NULL, {SIG_IGN, [], 0}, 8) = 0
...
[pid 24625] kill(24719, SIGHUP)         = 0
[pid 24719] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=24625, si_uid=1000} ---
---

So the child process has SIGHUP configured to SIG_IGN on your platform.
msg216500 - Author: Orion Poplawski (opoplawski) Date: 2014-04-16 16:54
We have determined that the koji builder is indeed setting the SIGHUP sigaction to SIG_IGN, which the Python test inherits, and we are working on getting that fixed. However, it may be worth considering something like https://github.com/pexpect/pexpect/commit/1fbfddf33d196fd1f211fb95efdaa810b8b5dad3 in the Python tests to ensure that the tests run properly in situations like this (I can imagine someone running them under "nohup").
msg223235 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-16 17:07
Here is a patch implementing basic synchronization between the parent and the child process, so that the parent waits until the child is sleeping.

Can you please try this patch?

If it doesn't work, we might add a small sleep of 500 ms after the readline().
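
The kind of synchronization described above can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: the child script, the "sleeping" marker line and the function name are hypothetical, the child resets SIGHUP to SIG_DFL itself so that the sketch also works where SIGHUP is inherited as SIG_IGN, and asyncio.run() requires a newer Python than the 3.4 discussed here.

```python
import asyncio
import signal
import sys

# Hypothetical child: finishes its setup, announces readiness, then sleeps.
CHILD_CODE = """
import signal, time
signal.signal(signal.SIGHUP, signal.SIG_DFL)  # assumption: undo an inherited SIG_IGN
print("sleeping", flush=True)                 # readiness marker read by the parent
time.sleep(60)
"""


async def send_signal_when_ready():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    # Block until the child reports that it is ready; without this step
    # the signal could arrive before the child has finished starting up.
    line = await proc.stdout.readline()
    proc.send_signal(signal.SIGHUP)
    # Bound the wait so a regression fails quickly instead of hanging.
    returncode = await asyncio.wait_for(proc.wait(), timeout=10)
    return line, returncode


line, returncode = asyncio.run(send_signal_when_ready())
```

A negative return code means the child was terminated by a signal, so `returncode` is expected to equal `-signal.SIGHUP` here.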
msg223349 - Author: Orion Poplawski (opoplawski) Date: 2014-07-17 17:13
That appears to work. Thanks!
msg223377 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-07-17 21:51
New changeset 651475d67225 by Victor Stinner in branch '3.4':
Issue #21247: Fix a race condition in test_send_signal() of asyncio
http://hg.python.org/cpython/rev/651475d67225

New changeset 45e8eb53edbc by Victor Stinner in branch 'default':
(Merge 3.4) Issue #21247: Fix a race condition in test_send_signal() of asyncio
http://hg.python.org/cpython/rev/45e8eb53edbc
msg223378 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-17 21:52
> That appears to work. Thanks!

Cool, I committed my enhancement of the unit test.
msg223450 - Author: Orion Poplawski (opoplawski) Date: 2014-07-18 23:00
I'm really sorry, I thought I had done the test build properly, but a second attempt has resulted in the same hang:

http://koji.fedoraproject.org/koji/taskinfo?taskID=7165208

So I don't think it does the trick.
msg254207 - Author: Марк Коренберг (socketpair) * Date: 2015-11-06 19:18
The bug is still reproducible. Jenkins, when run from init.d, uses /usr/bin/daemon. This means SIGHUP will be in the SIG_IGN state. Since echo.py does not set up a SIGHUP handler, the test expects SIGHUP to terminate it just as SIGKILL would. So why not use, say, SIGTERM instead? After such a change, all tests passed.

If not, signal handling tests should reset signal handling to SIG_DFL.
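
As a rough sketch of the second option, a test fixture could reset SIGHUP to SIG_DFL for the duration of each test and restore the previous disposition afterwards. The class and test names below are hypothetical, and this is not the change that was committed for this issue.

```python
import signal
import unittest


class SighupDefaultTestCase(unittest.TestCase):
    """Hypothetical fixture: run each test with SIGHUP at its default action."""

    def setUp(self):
        previous = signal.signal(signal.SIGHUP, signal.SIG_DFL)
        # None means the old handler was not installed from Python,
        # so there is nothing we can restore.
        if previous is not None:
            self.addCleanup(signal.signal, signal.SIGHUP, previous)

    def test_sighup_is_default(self):
        # Children spawned from here inherit SIG_DFL rather than SIG_IGN.
        self.assertEqual(signal.getsignal(signal.SIGHUP), signal.SIG_DFL)


# Run the case programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SighupDefaultTestCase)
result = unittest.TestResult()
suite.run(result)
```

Resetting in setUp and restoring through addCleanup keeps the change local to the test, so a runner started under "nohup" or a daemon wrapper gets its original disposition back.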

Please reopen
History
Date User Action Args
2022-04-11 14:58:01 admin set github: 65446
2015-11-06 19:18:56 socketpair set nosy: + socketpair; messages: + msg254207
2014-07-18 23:00:06 opoplawski set messages: + msg223450
2014-07-17 21:52:04 vstinner set messages: + msg223378
2014-07-17 21:51:47 vstinner set status: open -> closed; resolution: fixed
2014-07-17 21:51:17 python-dev set nosy: + python-dev; messages: + msg223377
2014-07-17 17:13:08 opoplawski set messages: + msg223349
2014-07-16 17:07:03 vstinner set files: + test_send_signal.patch; keywords: + patch; messages: + msg223235
2014-06-06 11:42:33 vstinner set keywords: + buildbot; components: + Tests, asyncio
2014-04-16 16:54:06 opoplawski set messages: + msg216500
2014-04-16 05:42:56 vstinner set nosy: + gvanrossum, vstinner, yselivanov; messages: + msg216434
2014-04-15 22:38:58 opoplawski set messages: + msg216409
2014-04-15 22:23:43 opoplawski set messages: + msg216407
2014-04-15 21:49:47 opoplawski set messages: + msg216397
2014-04-15 21:33:52 opoplawski create