classification
Title: PyThread_acquire_lock_timed() should recompute the timeout when interrupted by a signal
Type: Stage: resolved
Components: Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: neologix, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2017-06-26 13:50 by vstinner, last changed 2017-10-25 00:15 by vstinner.

Files
File name Uploaded Description Edit
interrupted_lock.py vstinner, 2017-10-24 10:21
test.patch vstinner, 2017-10-24 10:30
Pull Requests
URL Status Linked Edit
PR 4103 merged vstinner, 2017-10-24 10:28
Messages (10)
msg296896 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-26 13:50
The current code of PyThread_acquire_lock_timed() (the implementation not using semaphore) doesn't compute correctly the timeout when pthread_cond_timedwait() is interrupted by a signal. We should recompute the timeout using a deadline.

Something like select.select():

if (tvp)
    deadline = _PyTime_GetMonotonicClock() + timeout;

do {
    ... use tvp
    if (errno != EINTR)
        break;

    /* select() was interrupted by a signal */
    if (PyErr_CheckSignals())
        goto finally;

    if (tvp) {
        timeout = deadline - _PyTime_GetMonotonicClock();
        if (timeout < 0) {
            n = 0;
            break;
        }
        _PyTime_AsTimeval_noraise(timeout, &tv, _PyTime_ROUND_CEILING);
        /* retry select() with the recomputed timeout */
    }
} while (1);
msg296898 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-26 13:51
See also the old bpo-12822: "NewGIL should use CLOCK_MONOTONIC if possible.".
msg296899 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-26 13:51
See also the PEP 475 "Retry system calls failing with EINTR" and PEP 418 (time.monotonic).
msg304892 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-24 10:21
interrupted_lock.py: test threading.Lock.acquire(timeout=1.0) with SIGALRM sent every 1 ms (so up to 1000 times in total). Example:

haypo@selma$ ./python interrupted_lock.py 
acquire(timeout=1.0) took 1.0 seconds and got 1000 signals

Oh, in fact, threading.Lock.acquire(timeout=1.0) already recomputes the timeout when interrupted.

In Python stdlib, PyThread_acquire_lock_timed() is only called from one place with intr_flag=0: faulthandler watchdog thread, but this thread blocks all signals:

    /* we don't want to receive any signal */
    sigfillset(&set);
    pthread_sigmask(SIG_SETMASK, &set, NULL);
msg304893 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-24 10:28
Ah, I found another caller of PyThread_acquire_lock_timed() with a timeout > 0 and intr_flag=0: _enter_buffered_busy() of Modules/_io/bufferedio.c:

        /* When finalizing, we don't want a deadlock to happen with daemon
         * threads abruptly shut down while they owned the lock.
         * Therefore, only wait for a grace period (1 s.).
         * Note that non-daemon threads have already exited here, so this
         * shouldn't affect carefully written threaded I/O code.
         */
        st = PyThread_acquire_lock_timed(self->lock, (PY_TIMEOUT_T)1e6, 0);
msg304895 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-24 10:30
I wrote PR 4103 to fix the pthread+semaphore implementation of PyThread_acquire_lock_timed().

Apply PR 4103, apply attached test.patch, recompile Python, and run interrupted_lock.py to test the PR. Result:

haypo@selma$ ./python interrupted_lock.py 
acquire(timeout=1.0) took 1.0 seconds and got 911 signals
msg304896 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-24 10:31
To check if you are using pthread+semaphore, use:

haypo@selma$ ./python -c 'import sys; print(sys.thread_info)'
sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.25')

Here you have pthread+semaphore. It's Fedora 26 running Linux kernel 4.13.
msg304951 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-24 23:53
New changeset 850a18e03e8f8309bc8c39adc6e7d51a4568cd9a by Victor Stinner in branch 'master':
bpo-30768: Recompute timeout on interrupted lock (GH-4103)
https://github.com/python/cpython/commit/850a18e03e8f8309bc8c39adc6e7d51a4568cd9a
msg304952 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 00:13
I merged my PR. Thanks Antoine Pitrou for the review!

This change only impacts the io.BufferedWriter and io.BufferedReader during Python finalization. It has no effect on theading.Lock.acquire(). It might impact faulthandler.dump_traceback_later(), but in practice, it shouldn't change anything since the internal faulthandler watchdog thread blocks all signals.

If I understand correctly, if the system clock is stepped back by 1 hour during Python finalization, after PyThread_acquire_lock_timed() computed the deadline, but before sem_timedwait() completed, _enter_buffered_busy() can be blocked during 1 hour.

Moving the system clock backward by 1 hour occurs once a year on the DST change. But the race condition is unlikely since the size of the time window is only 1 second. The DST change should occur at Python shutdown when an io.BufferedReader or io.BufferedWriter is used.

Since the race condition seems very unlikely and was never reported by another user, I propose to not backward this change. Moreover, I'm not confident to modify locks in a stable release :-)
msg304953 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 00:15
Oh, the pthread condvar+mutex implementation still has the bug, so I reopen the issue.
History
Date User Action Args
2017-10-25 00:15:14vstinnersetstatus: closed -> open
resolution: fixed ->
messages: + msg304953
2017-10-25 00:13:12vstinnersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-10-25 00:13:03vstinnersetmessages: + msg304952
2017-10-24 23:53:35vstinnersetmessages: + msg304951
2017-10-24 10:31:59vstinnersetmessages: + msg304896
2017-10-24 10:30:17vstinnersetfiles: + test.patch

messages: + msg304895
2017-10-24 10:28:11vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request4073
2017-10-24 10:28:03vstinnersetmessages: + msg304893
2017-10-24 10:21:07vstinnersetfiles: + interrupted_lock.py

messages: + msg304892
2017-06-26 13:51:48vstinnersetmessages: + msg296899
2017-06-26 13:51:09vstinnersetmessages: + msg296898
2017-06-26 13:50:01vstinnercreate