classification
Title: test_interrupted_write_text() of test_io failed on Python 3.3 on FreeBSD 7.2
Type: Stage:
Components: Tests Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: exarkun, pitrou, vstinner
Priority: normal Keywords:

Created on 2011-04-16 22:57 by vstinner, last changed 2011-07-04 12:32 by vstinner. This issue is now closed.

Messages (6)
msg133908 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-16 22:57
test_interrupted_write_text() of test_io failed on Python 3.3 on FreeBSD 7.2:
-----------------------------------------------
[250/354] test_io
Exception in thread Thread-1316:
Traceback (most recent call last):
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/threading.py", line 735, in _bootstrap_inner
    self.run()
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/threading.py", line 688, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_io.py", line 2630, in _read
    s = os.read(r, 1)
OSError: [Errno 4] Interrupted system call

Timeout (1:00:00)!
Thread 0x28401040:
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_io.py", line 2651 in check_interrupted_write
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_io.py", line 2672 in test_interrupted_write_text
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 387 in _executeTestPart
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 442 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/case.py", line 494 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1078 in run
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1166 in _run_suite
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/support.py", line 1192 in run_unittest
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd7/build/Lib/test/test_io.py", line 2845 in test_main
  File "./Lib/test/regrtest.py", line 1041 in runtest_inner
  File "./Lib/test/regrtest.py", line 835 in runtest
  File "./Lib/test/regrtest.py", line 659 in main
  File "./Lib/test/regrtest.py", line 1619 in <module>
*** Error code 1

Stop in /usr/home/db3l/buildarea/3.x.bolen-freebsd7/build.
program finished with exit code 1
elapsedTime=8703.824572
-----------------------------------------------
http://www.python.org/dev/buildbot/all/builders/x86%20FreeBSD%207.2%203.x/builds/1695/steps/test/logs/stdio
msg133909 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-16 23:17
I have read somewhere that on FreeBSD any thread can receive a signal, not only the main thread. I suppose that it is the same on Linux, but Linux may try to deliver the signal to the main thread when both the main thread and other threads are blocked in a system call.

In this case, the "_read" thread gets the SIGALRM signal, so its os.read() system call is interrupted. This means that os.read() stays blocked for at least one second, whereas wio.write() is supposed to send data to unblock the _read() thread.
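
To illustrate what the traceback above shows in the "_read" thread, here is a minimal sketch of how an interrupted read surfaces in Python and how a caller could retry it. This is not the fix proposed in this issue; the helper name read_retrying and its read parameter are hypothetical and exist only for the example. It assumes Python 3.3 or later, where EINTR is raised as InterruptedError (a subclass of OSError).

```python
import os


def read_retrying(fd, n, read=os.read):
    """Call read(fd, n) and retry when a signal interrupts the call.

    `read` is a hypothetical injection point so the retry path can be
    exercised without sending a real signal.
    """
    while True:
        try:
            return read(fd, n)
        except InterruptedError:
            # EINTR: a signal was delivered to this thread while it was
            # blocked in the system call. Nothing was read, so try again.
            continue


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"x")
    print(read_retrying(r, 1))
    os.close(r)
    os.close(w)
```

Since Python 3.5 (PEP 475), os.read() performs this retry itself unless the signal handler raises an exception, so a loop like this mainly matters on older versions.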
msg133910 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-16 23:23
One solution to this problem is to use pthread_sigmask() in the _read() thread so that it does not handle SIGALRM. For example, faulthandler uses the following code to not handle any signal in its timeout thread:

#ifdef HAVE_PTHREAD_H
    sigset_t set;

    /* we don't want to receive any signal */
    sigfillset(&set);
#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK)
    pthread_sigmask(SIG_SETMASK, &set, NULL);
#else
    sigprocmask(SIG_SETMASK, &set, NULL);
#endif
#endif
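
A rough Python equivalent of the idea, as a sketch only: the reader thread blocks SIGALRM for itself before calling os.read(), so the signal has to be delivered to another thread. It assumes signal.pthread_sigmask() is available (Python 3.3 or later, Unix only). The function names and the result dictionary are hypothetical and are not the code used in test_io.

```python
import os
import signal
import threading


def read_with_alarm_blocked(fd, result):
    # Block SIGALRM in this thread only; other threads keep their own mask.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGALRM})
    # Blocking an empty set changes nothing and returns the current mask.
    result["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    result["data"] = os.read(fd, 1)


def demo():
    r, w = os.pipe()
    result = {}
    mask_before = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    reader = threading.Thread(target=read_with_alarm_blocked, args=(r, result))
    reader.start()
    os.write(w, b"x")  # unblocks the reader
    reader.join()
    mask_after = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    os.close(r)
    os.close(w)
    return result, mask_before, mask_after


if __name__ == "__main__":
    result, before, after = demo()
    print(signal.SIGALRM in result["mask"], before == after)
```

Unlike the faulthandler snippet, which blocks every signal with sigfillset(), this sketch blocks only SIGALRM, which is the signal the test relies on.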
msg133923 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-17 13:30
Agreed with your analysis. The problem is that the signal module doesn't expose pthread_sigmask. We could grab Jean-Paul's implementation from http://bazaar.launchpad.net/~exarkun/python-signalfd/trunk/view/head:/signalfd/_signalfd.c (although I'm not sure why the method is called "sigprocmask" while it calls pthread_sigmask).
msg134863 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-30 13:23
New changeset 28b9702a83d1 by Victor Stinner in branch 'default':
Issue #8407, issue #11859: Add signal.pthread_sigmask() function to fetch
http://hg.python.org/cpython/rev/28b9702a83d1
msg134940 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-01 22:47
The issue is a race condition and was rare (I only saw it once, on the FreeBSD 7.2 3.x buildbot). I suppose that it is fixed, but I am unable to check because I cannot reproduce the bug in my FreeBSD 8 VM. Reopen the issue if it is not fixed yet.
History
Date                 User      Action  Args
2011-07-04 12:32:50  vstinner  set     status: open -> closed
2011-05-01 22:47:51  vstinner  set     resolution: fixed
                                       dependencies: - expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module
                                       messages: + msg134940
2011-04-30 13:23:02  vstinner  set     messages: + msg134863
2011-04-20 00:22:56  vstinner  set     dependencies: + expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module
2011-04-17 13:30:12  pitrou    set     nosy: + exarkun
                                       messages: + msg133923
2011-04-16 23:23:54  vstinner  set     messages: + msg133910
2011-04-16 23:17:27  vstinner  set     messages: + msg133909
2011-04-16 22:57:09  vstinner  create