Issue22331
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2014-09-03 21:31 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
int-write.patch | martin.panter, 2016-06-03 13:56 | review |
Messages (10) | |||
---|---|---|---|
msg226326 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-09-03 21:31 | |
http://buildbot.python.org/all/builders/x86%20FreeBSD%207.2%203.4/builds/332/steps/test/logs/stdio Log: --- [157/389/1] test_io Timeout (1:00:00)! Thread 0x2a4f0790 (most recent call first): File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/test_io.py", line 3257 in _read File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/threading.py", line 869 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/threading.py", line 921 in _bootstrap_inner File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/threading.py", line 889 in _bootstrap Thread 0x28401040 (most recent call first): File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/threading.py", line 1077 in _wait_for_tstate_lock File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/threading.py", line 1061 in join File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/test_io.py", line 3278 in check_interrupted_write File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/test_io.py", line 3302 in test_interrupted_write_text File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/case.py", line 577 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/case.py", line 625 in __call__ File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 125 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 87 in __call__ File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 125 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 87 in __call__ File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 125 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/suite.py", line 87 in __call__ File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/unittest/runner.py", line 168 in run File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/support/__init__.py", line 1750 in _run_suite File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/support/__init__.py", line 1784 in run_unittest File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/regrtest.py", line 1279 in test_runner File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/regrtest.py", line 1280 in runtest_inner File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/regrtest.py", line 967 in runtest File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/regrtest.py", line 763 in main File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/regrtest.py", line 1564 in main_in_temp_cwd File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/test/__main__.py", line 3 in <module> File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/runpy.py", line 85 in _run_code File "/usr/home/db3l/buildarea/3.4.bolen-freebsd7/build/Lib/runpy.py", line 170 in _run_module_as_main --- See also the issue #11859 (fixed). |
|||
msg226327 - (view) | Author: Roundup Robot (python-dev) | Date: 2014-09-03 21:33 | |
New changeset baa372eb731c by Victor Stinner in branch '3.4': Issue #22331: Skip test_interrupted_write_text() on FreeBSD older than 8.0 http://hg.python.org/cpython/rev/baa372eb731c New changeset 340d48347295 by Victor Stinner in branch 'default': (Merge 3.4) Issue #22331: Skip test_interrupted_write_text() on FreeBSD older http://hg.python.org/cpython/rev/340d48347295 |
|||
msg228577 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-10-05 15:12 | |
The test now pass on the buildbot FreeBSD 7.2. |
|||
msg266688 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-05-30 12:02 | |
This recently hung AMD64 FreeBSD 9.x 3.5. The stack trace was different, and there is only one thread: http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.5/builds/828/steps/test/logs/stdio [398/398] test_io Timeout (0:15:00)! Thread 0x0000000801807400 (most recent call first): File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/unittest/case.py", line 176 in handle File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/unittest/case.py", line 727 in assertRaises File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_io.py", line 3714 in check_interrupted_write File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_io.py", line 3743 in test_interrupted_write_text Also, x86 Ubuntu Shared 2.7 hung, but the only information I have is it was running test_io. In my Free BSD case, the write() call is stuck, but in Victor’s original case, the background read() call is stuck. I could explain both cases as a race condition with the alarm signal being delivered: Victor’s case: SIGALRM delivered somewhere inside assertRaises(), but before the write() system call, and Python has a chance to call the Python signal handler. No data is ever written, so the background read() hangs. My case: SIGALRM delivered just before or as write() starts, so Python has no chance to interrupt the system call and call its own Python handler. I wonder if we can change the test to only deliver a signal after first reading one byte, i.e. send the signal directly from the background thread. I think that should cover 99% of cases, though in theory it is possible for something else to interrupt write(), and our signal to be delivered while write() was restarting, causing it to hang. But the combination of those two events would be so unlikely it may not be worth worrying about. Also, if you close the read end of the pipe before the write end, it should protect against flush() hanging without the EBADF hack by raising EPIPE instead. I suggest rewriting the background thread like: def _read(): s = os.read(r, 1) read_results.append(s) # The main thread is very likely to be inside the write() syscall now, so interrupt it os.kill(os.getpid(), SIGALRM) and cleaning up the pipe like: finally: os.close(r) try: wio.close() except BrokenPipeError: pass |
|||
msg266691 - (view) | Author: STINNER Victor (vstinner) * | Date: 2016-05-30 12:47 | |
When we talk about signals, we must always be careful of threads. Python unit tests are known (at least by me ;-)) to "leak" threads. IMO the first step to make the test more reliable is to start it in a fresh process to control exactly the number of threads. A fresh process avoids also to inherit an expected signal and/or signal handler. I modified test_eintr recently to do something similar. |
|||
msg266692 - (view) | Author: STINNER Victor (vstinner) * | Date: 2016-05-30 12:50 | |
I also modified Python 3.6 to run each test file in a fresh process to avoid leaking too much things between tests, to isolate them better. |
|||
msg266723 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-05-31 00:10 | |
Thanks, that is a very good point about threads. I remember that testing tkinter leaves an internal TK or TCL thread running in the background. And I understand Free BSD delivers process signals to arbitrary threads (not necessarily the main thread). But the FreeBSD 3.5 buildbot looks like it ran test_io in a separate process (-j4 specified). Anyway it is another possible explanation of the hang (signal delivered to wrong thread, so write() call not interrupted). Perhaps it is better to use signal.pthread_kill() instead of os.kill() in my proposal. |
|||
msg266725 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-05-31 00:29 | |
Found the same hang on the 3.6 branch, which was definitely running separate processes: http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.x/builds/4210/steps/test/logs/stdio == CPython 3.6.0a0 (default:ed6345cb08ab, Apr 29 2016, 18:35:36) [GCC 4.2.1 20070831 patched [FreeBSD]] Run tests in parallel using 4 child processes |
|||
msg267100 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-06-03 13:56 | |
Here is a patch with my suggestion to send the signal only when read() returns, and also closing the read end of the pipe before the write end. |
|||
msg297114 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-06-28 01:26 | |
Hum, the resolution is already set to Fixed, and I didn't see this bug last months on buildbots, so I consider that yes, it's fixed. Sorry Martin, I lost track of this issue and so I don't know what to do with your patch :-/ I will just close the issue. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:07 | admin | set | github: 66527 |
2017-06-28 01:26:58 | vstinner | set | status: open -> closed messages: + msg297114 stage: patch review -> resolved |
2016-06-03 13:56:41 | martin.panter | set | files:
+ int-write.patch keywords: + patch messages: + msg267100 stage: patch review |
2016-05-31 00:29:39 | martin.panter | set | messages: + msg266725 |
2016-05-31 00:10:04 | martin.panter | set | messages: + msg266723 |
2016-05-30 12:50:06 | vstinner | set | messages: + msg266692 |
2016-05-30 12:47:45 | vstinner | set | messages: + msg266691 |
2016-05-30 12:02:49 | martin.panter | set | status: closed -> open versions: + Python 2.7, Python 3.6, - Python 3.4 nosy: + martin.panter messages: + msg266688 |
2014-10-05 15:12:18 | vstinner | set | status: open -> closed resolution: fixed messages: + msg228577 |
2014-09-03 21:33:53 | python-dev | set | nosy:
+ python-dev messages: + msg226327 |
2014-09-03 21:31:48 | vstinner | create |