classification
Title: locks cannot be interrupted on OpenBSD
Type: Stage:
Components: Tests Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: neologix, python-dev, rpointel, vstinner
Priority: normal Keywords:

Created on 2014-02-08 18:19 by rpointel, last changed 2014-02-18 08:36 by rpointel. This issue is now closed.

Files
File name Uploaded Description Edit
interrupt_lock_acquire.py vstinner, 2014-02-12 00:43
Messages (15)
msg210672 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-08 18:19
Hi,

I have 2 tests which "failed" on OpenBSD (tested on i386, amd64 and sparc64) in:

- test_threadsignals.py/test_lock_acquire_interruption
- test_threadsignals.py/test_rlock_acquire_interruption


======================================================================
FAIL: test_lock_acquire_interruption (__main__.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_threadsignals.py", line 94, in test_lock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.020112991333008 not less than 3.0

======================================================================
FAIL: test_rlock_acquire_interruption (__main__.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_threadsignals.py", line 122, in test_rlock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.0101478099823 not less than 3.0

On the 3 machines it took ~ 5sec.
I'm asking why it's 3 seconds by default in the tests? Could we modify the value to 6 instead of 3, is it acceptable?

Thanks for your response,

Remi.
msg210678 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-08 19:36
> I'm asking why it's 3 seconds by default in the tests?
> Could we modify the value to 6 instead of 3, is it acceptable?

No, see this comment:
"""
  self.assertRaises(KeyboardInterrupt, lock.acquire, timeout=5)
  dt = time.time() - t1
  # Checking that KeyboardInterrupt was raised is not sufficient.
  # We want to assert that lock.acquire() was interrupted because
  # of the signal, not that the signal handler was called immediately
  # after timeout return of lock.acquire() (which can fool assertRaises).
"""

5s corresponds exactly to the timeout passed to acquire(), which means that it didn't get interrupted, which is precisely the goal of the test.
msg210682 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-08 21:30
arf, sorry for the noise... I didn't seen this.
msg210684 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-08 21:42
What is your OpenBSD version? Before 5.2, there were many issues with signals.
msg210685 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-08 21:43
I'm in -current (5.5-beta).
msg210995 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-11 18:37
So this looks like a bug.

When running the test, is faulthandler enabled (--timeout option)?
msg210996 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-11 18:38
> So this looks like a bug.

I mean an OpenBSD bug.
msg211017 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-11 20:45
Thanks for your response.

> When running the test, is faulthandler enabled (--timeout option)?

It seems yes:

#############################################################################################
$ PYTHONPATH=. ./python Tools/scripts/run_tests.py -j 1  -W --timeout=3600 test_threadsignals
/home/remi/dev/cpython/python -W default -bb -E -R -m test -r -w -u all,-largefile,-audio,-gui -j 1 -W --timeout=3600 test_threadsignals
Using random seed 8501335
[1/1] test_threadsignals
test_interrupted_timed_acquire (test.test_threadsignals.ThreadSignals) ... ok
test_lock_acquire_interruption (test.test_threadsignals.ThreadSignals) ... FAIL
test_lock_acquire_retries_on_intr (test.test_threadsignals.ThreadSignals) ... ok
test_rlock_acquire_interruption (test.test_threadsignals.ThreadSignals) ... FAIL
test_rlock_acquire_retries_on_intr (test.test_threadsignals.ThreadSignals) ... ok
test_signals (test.test_threadsignals.ThreadSignals) ... ok

======================================================================
FAIL: test_lock_acquire_interruption (test.test_threadsignals.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/remi/dev/cpython/Lib/test/test_threadsignals.py", line 94, in test_lock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.019763469696045 not less than 3.0

======================================================================
FAIL: test_rlock_acquire_interruption (test.test_threadsignals.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/remi/dev/cpython/Lib/test/test_threadsignals.py", line 122, in test_rlock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.020080804824829 not less than 3.0

----------------------------------------------------------------------
Ran 6 tests in 13.295s

FAILED (failures=2)
test test_threadsignals failed
1 test failed:
    test_threadsignals
Re-running failed tests in verbose mode
Re-running test 'test_threadsignals' in verbose mode
test_interrupted_timed_acquire (test.test_threadsignals.ThreadSignals) ... ok
test_lock_acquire_interruption (test.test_threadsignals.ThreadSignals) ... FAIL
test_lock_acquire_retries_on_intr (test.test_threadsignals.ThreadSignals) ... ok
test_rlock_acquire_interruption (test.test_threadsignals.ThreadSignals) ... FAIL
test_rlock_acquire_retries_on_intr (test.test_threadsignals.ThreadSignals) ... ok
test_signals (test.test_threadsignals.ThreadSignals) ... ok

======================================================================
FAIL: test_lock_acquire_interruption (test.test_threadsignals.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/remi/dev/cpython/Lib/test/test_threadsignals.py", line 94, in test_lock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.019940614700317 not less than 3.0

======================================================================
FAIL: test_rlock_acquire_interruption (test.test_threadsignals.ThreadSignals)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/remi/dev/cpython/Lib/test/test_threadsignals.py", line 122, in test_rlock_acquire_interruption
    self.assertLess(dt, 3.0)
AssertionError: 5.020142078399658 not less than 3.0

----------------------------------------------------------------------
Ran 6 tests in 13.293s

FAILED (failures=2)
test test_threadsignals failed

#############################################################################################

Do I miss a thing?
msg211051 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-12 00:43
To make test_lock_acquire_interruption and test_rlock_acquire_interruption more reliable, the test should maybe run a fresh Python process to get a well known number of threads. Some Python modules create C threads like Tkinter. faulthandler creates also an hiding C thread, but it is supposed to ignore all signals (it calls pthread_sigmask).

@Remi: Can you please test interrupt_lock_acquire.py? It should be the same than the unit test, but in a fresh Python process.
msg211057 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-12 06:49
> Can you please test interrupt_lock_acquire.py? It should be the same than the unit test, but in a fresh Python process.

Thanks for your response, this is what I have:

$ PYTHONPATH=. ./python ./interrupt_lock_acquire.py 
Traceback (most recent call last):
  File "./interrupt_lock_acquire.py", line 30, in <module>
    raise Exception("dt > 3: dt=%s" % dt)
Exception: dt > 3: dt=5.011708974838257

Remi.
msg211061 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-12 08:17
Well, then it's definitely an OpenBSD bug, and we can't do anything
more than skip the test.
msg211462 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-18 00:23
> So this looks like a bug.

Or a different design choice?

It looks like the kernel restarts the wait if it was interrupted.

"ignore interruptions other than cancelation"
http://ftp.cc.uoc.gr/mirrors/OpenBSD/src/lib/librthread/rthread_sem.c

"test that sem_timedwait() resumes after handling a signal"
http://mirrors.ircam.fr/pub/OpenBSD/src/regress/lib/libpthread/semaphore/sem_timedwait/sem_timedwait.c

POSIX says that "The sem_timedwait() function shall fail if: [EINTR] A signal interrupted this function."
http://pubs.opengroup.org/onlinepubs/009604599/functions/sem_timedwait.html

Right now, the best we can do is to skip the test on OpenBSD (maybe only on OpenBSD <= 5.4 to see if it changes later).
msg211483 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-02-18 08:20
New changeset a3d54bb04cbb by Victor Stinner in branch 'default':
Issue #20564: Skip tests on lock+signals on OpenBSD
http://hg.python.org/cpython/rev/a3d54bb04cbb
msg211485 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-18 08:21
@Remi: You may report the feature request to OpenBSD kernel if you want. Until OpenBSD implements it, the test is now skipped in Python.
msg211490 - (view) Author: Remi Pointel (rpointel) * Date: 2014-02-18 08:36
> You may report the feature request to OpenBSD kernel if you want. Until OpenBSD implements it, the test is now skipped in Python.

Yes, it's done yet.
However, it's good to skip it until it's implemented.
History
Date User Action Args
2014-02-18 08:36:18rpointelsetmessages: + msg211490
2014-02-18 08:21:18vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg211485
2014-02-18 08:20:29python-devsetnosy: + python-dev
messages: + msg211483
2014-02-18 08:18:02vstinnersettitle: test_threadsignals.py "failed" on OpenBSD because too slow (> 3sec) -> locks cannot be interrupted on OpenBSD
2014-02-18 00:23:59vstinnersetmessages: + msg211462
2014-02-12 08:17:35neologixsetmessages: + msg211061
2014-02-12 06:49:27rpointelsetmessages: + msg211057
2014-02-12 00:43:32vstinnersetfiles: + interrupt_lock_acquire.py

messages: + msg211051
2014-02-11 20:45:36rpointelsetmessages: + msg211017
2014-02-11 18:38:32neologixsetmessages: + msg210996
2014-02-11 18:37:54neologixsetmessages: + msg210995
2014-02-08 21:43:28rpointelsetmessages: + msg210685
2014-02-08 21:42:39vstinnersetnosy: + vstinner
messages: + msg210684
2014-02-08 21:30:42rpointelsetmessages: + msg210682
2014-02-08 19:36:55neologixsetnosy: + neologix
messages: + msg210678
2014-02-08 18:19:44rpointelcreate