Issue19652
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2013-11-19 11:20 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
asyncio_subprocess_watchdog.patch | vstinner, 2013-11-21 10:17 | review |
Messages (9) | |||
---|---|---|---|
msg203358 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-19 11:20 | |
I didn't understand the purpose of the issue #19334: it doesn't mention the OS nor the test. I prefer to create a more specific issue, so here you have one specific hang. http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/517/steps/test/logs/stdio [385/385] test_asyncio Timeout (1:00:00)! Thread 0x00007fff71296cc0 (most recent call first): File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/selectors.py", line 291 in select File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/asyncio/base_events.py", line 614 in _run_once File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/asyncio/base_events.py", line 153 in run_forever File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/asyncio/base_events.py", line 172 in run_until_complete File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_asyncio/test_events.py", line 1155 in test_subprocess_send_signal File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/case.py", line 571 in run File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/case.py", line 610 in __call__ File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 117 in run File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 79 in __call__ File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 117 in run File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 79 in __call__ File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 117 in run File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/suite.py", line 79 in __call__ File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/unittest/runner.py", line 168 in run File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/support/__init__.py", line 1685 in _run_suite File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/support/__init__.py", line 1719 in run_unittest File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_asyncio/__init__.py", line 31 in test_main File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 1276 in runtest_inner File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 965 in runtest File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 532 in main File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 1560 in main_in_temp_cwd File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 1585 in <module> File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/runpy.py", line 73 in _run_code File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/runpy.py", line 160 in _run_module_as_main Traceback (most recent call last): File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/runpy.py", line 160, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/runpy.py", line 73, in _run_code exec(code, run_globals) File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/__main__.py", line 3, in <module> regrtest.main_in_temp_cwd() File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 1560, in main_in_temp_cwd main() File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 736, in main raise Exception("Child error on {}: {}".format(test, result[1])) Exception: Child error on test_asyncio: Exit code 1 make: *** [buildbottest] Error 1 |
|||
msg203392 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2013-11-19 15:43 | |
The traceback is pretty useless. So is the title of the other bug report "hangs for 1 hour" (which just means it hangs forever but the test runner kills it after one hour). We would need to run the tests with -v so at least we can pinpoint which of the 600 or so tests in test_asyncio is failing. PS. Off-topic: "raise Exception(...)" is poor coding style: File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/regrtest.py", line 736, in main raise Exception("Child error on {}: {}".format(test, result[1])) |
|||
msg203393 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-19 15:51 | |
Ah, it looks like the test uses a child process. The following issue has a similar problem, we don't know that status of the child process: http://bugs.python.org/issue19564 I proposed a patch in this issue for multiprocessing, to enable also faulthandler in the child process. We can probably do something similar in asyncio with child process (enable faulthandler in unit tests with a short timeout). |
|||
msg203404 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2013-11-19 17:43 | |
I cannot help you unless you tell me which specific test is failing. But once you have a proposed fix I will review it. Thanks for advocating for the minority platforms! |
|||
msg203455 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-20 01:23 | |
> I cannot help you unless you tell me which specific test is failing. Oh, I didn't notice that test_subprocess_send_signal() is part of a mixin: SubprocessTestsMixin. Configure output: checking sys/devpoll.h usability... no checking sys/devpoll.h presence... no checking for sys/devpoll.h... no checking sys/epoll.h usability... no checking sys/epoll.h presence... no checking for sys/epoll.h... no checking poll.h usability... yes checking poll.h presence... yes checking for broken poll()... no checking for kqueue... yes I don't know which selector was used: select, poll or maybe kqueue? |
|||
msg203458 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2013-11-20 03:06 | |
Oh, sorry, I didn't realize the name of the failing test was in the issue title. But even that's no excuse, because it's also in the log. :-( Fortunately the line where we're hanging is also in the log: line 291 in selectors.py is in PollSelector. So that's settled. But we don't know if that's the first of the three tests that runs or not. What's also strange is that several other tests (test_subprocess_kill() and test_subprocess_terminate()) also send signals, so they should work exactly the same way. But again we don't know if any of them are being run before the signal test. Do you have any other traces from the same bot? Do they all fail in the same test? Is the test order randomized? |
|||
msg203602 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-11-21 10:17 | |
asyncio_subprocess_watchdog.patch: * add test_asyncio.watchdog.setup_watchdog(): use faulthandler.dump_traceback_later() to kill the process if it hangs more than 'timeout' seconds (do nothing if faulthandler is not present, eg. on other Python VMs than CPython) * add SubprocessTestsMixin.debug_info(): return returncode, stdout, stderr * use debug_info() on functions using subprocess may may hang The patch should give more information on the child process if it hangs. I didn't see the test_asyncio hang on "AMD64 Snow Leop 3.x" recently. The hang is probably random :-/ I chose a default timeout of 5 minutes, it should be enough for the short tests of test_asyncio. You may set the timeout to 15 minutes if a buildbot is too slow. Or set the timeout to 1 minute if you prefer to not hang a buildbot too long. I tested asyncio_subprocess_watchdog.patch manually with a shorter timeout and by adding time.sleep(60) in echo.py, echo2.py and echo3.py. |
|||
msg203636 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2013-11-21 15:16 | |
Hmm... When did you last see the hang? We've had a variety of checkins that might have affected this code. If you've seen at least one traceback later than the commit below, feel free to commit. Otherwise, let's wait. changeset: 87088:eb42adc53923 user: Guido van Rossum <guido@dropbox.com> date: Wed Nov 13 15:50:08 2013 -0800 summary: asyncio: Fix from Anthony Baire for CPython issue 19566 (replaces earlier fix). |
|||
msg210394 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-02-06 14:51 | |
I didn't see this bug recently, so I close the issue. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:53 | admin | set | github: 63851 |
2014-02-06 14:51:21 | vstinner | set | status: open -> closed resolution: fixed messages: + msg210394 |
2013-11-21 15:16:05 | gvanrossum | set | messages: + msg203636 |
2013-11-21 10:17:15 | vstinner | set | files:
+ asyncio_subprocess_watchdog.patch keywords: + patch messages: + msg203602 |
2013-11-20 03:06:10 | gvanrossum | set | messages: + msg203458 |
2013-11-20 01:23:09 | vstinner | set | messages: + msg203455 |
2013-11-19 17:43:34 | gvanrossum | set | messages: + msg203404 |
2013-11-19 15:51:43 | vstinner | set | messages: + msg203393 |
2013-11-19 15:43:44 | gvanrossum | set | messages: + msg203392 |
2013-11-19 11:20:25 | vstinner | create |