Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Condition.wait() doesn't raise KeyboardInterrupt #53090

Closed
hobb0001 mannequin opened this issue May 28, 2010 · 25 comments
Closed

Condition.wait() doesn't raise KeyboardInterrupt #53090

hobb0001 mannequin opened this issue May 28, 2010 · 25 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@hobb0001
Copy link
Mannequin

hobb0001 mannequin commented May 28, 2010

BPO 8844
Nosy @gpshead, @pitrou, @vstinner, @rnk, @superbobry
Files
  • lock-interrupt.diff: patch to make Python-level lock acquires interruptible
  • lock-interrupt.diff: Updated patch for RLock and docs
  • lock-interrupt.diff: retry lock acquisitions in RLock
  • lock-interrupt-v4.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-02-16.15:40:23.691>
    created_at = <Date 2010-05-28.16:48:10.841>
    labels = ['type-bug', 'library']
    title = "Condition.wait() doesn't raise KeyboardInterrupt"
    updated_at = <Date 2016-09-28.19:35:55.995>
    user = 'https://bugs.python.org/hobb0001'

    bugs.python.org fields:

    activity = <Date 2016-09-28.19:35:55.995>
    actor = 'gregory.p.smith'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-02-16.15:40:23.691>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2010-05-28.16:48:10.841>
    creator = 'hobb0001'
    dependencies = []
    files = ['17929', '17931', '17935', '18536']
    hgrepos = []
    issue_num = 8844
    keywords = ['patch']
    message_count = 25.0
    messages = ['106678', '106680', '106685', '106705', '106715', '109883', '109884', '109898', '109915', '109932', '109935', '110007', '110105', '110110', '110156', '110653', '113786', '113941', '113953', '124088', '125715', '125719', '260732', '277640', '277658']
    nosy_count = 8.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'vstinner', 'jyasskin', 'rnk', 'hobb0001', 'superbobry', 'Yuri.Bochkarev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue8844'
    versions = ['Python 3.2']

    @hobb0001
    Copy link
    Mannequin Author

    hobb0001 mannequin commented May 28, 2010

    Condition.wait() without a timeout will never raise a KeyboardInterrupt:

    cond = threading.Condition()
    cond.acquire()
    cond.wait()

    *** Pressing Ctrl-C now does nothing ***

    If you pass a timeout to Condition.wait(), however, it does behave as expected:

    cond.wait()
    ^CTraceback (most recent call last):
      File "/usr/lib/python3.1/threading.py", line 242, in wait
        _sleep(delay)
    KeyboardInterrupt

    This may affect other problems reported with multiprocessing pools. Most notably:
    http://bugs.python.org/issue8296
    http://stackoverflow.com/questions/1408356

    @hobb0001 hobb0001 mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 28, 2010
    @pitrou
    Copy link
    Member

    pitrou commented May 28, 2010

    In 3.2, even using a timeout doesn't make the call interruptible.
    The solution would be to fix the internal locking APIs so that they handle incoming signals properly, and are able to return an error status.

    @pitrou
    Copy link
    Member

    pitrou commented May 28, 2010

    3.2 is interesting in that it introduces a new internal API: PyThread_acquire_lock_timed(). We can therefore change that API again before release without risking any compatibility breakage. Reid, would you want to work on this?

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented May 29, 2010

    I'd like to fix it, but I don't know if I'll be able to in time. It was something that bugged me while running the threading tests while working on Unladen.

    I'm imagining (for POSIX platforms) adding some kind of check for signals when the system call returns EINTR. If the signal handler raises an exception, like an interrupt should raise a KeyboardInterrupt, we can just give a different return code and propagate the exception.

    It also seems like this behavior can be extended gradually to different platforms, since I don't have the resources to change and test every threading implementation.

    @pitrou
    Copy link
    Member

    pitrou commented May 29, 2010

    I'm imagining (for POSIX platforms) adding some kind of check for
    signals when the system call returns EINTR. If the signal handler
    raises an exception, like an interrupt should raise a
    KeyboardInterrupt, we can just give a different return code and
    propagate the exception.

    Yes, this is what I'm proposing too.

    It also seems like this behavior can be extended gradually to
    different platforms, since I don't have the resources to change and
    test every threading implementation.

    There is only one active POSIX threading implementation in 3.2, in
    Python/thread_pthread.h.
    (and the only non-POSIX one is for Windows)

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 10, 2010

    Here's a patch that makes Python-level lock acquisitions interruptible for py3k. There are many users of the C-level lock API, most of whom are not set up to deal with lock acquisition failure. I decided to make a new API function and leave the others alone.

    If possible, I think this should go out with 3.2.

    In that case, I was wondering if I should merge PyThread_acquire_lock_timed with my new PyThread_acquire_lock_timed_intr, since PyThread_acquire_lock_timed wasn't available in 3.1. Although it did go out in 2.7, we don't promise C API compatibility with the 2.x series, so I don't think it matters.

    I've tested this patch on Mac OS X and Linux. The whole test suite passes on both, along with the test that I added to test_threadsignals.py.

    I added a noop compatibility wrapper to thread_nt.h, but I haven't tested it or built it. When I get around to testing/fixing the subprocess patch on Windows, I'll make sure this works and the test is skipped.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 10, 2010

    Oh, nice!

    I was wondering if I should merge PyThread_acquire_lock_timed with my
    new PyThread_acquire_lock_timed_intr, since PyThread_acquire_lock_timed
    wasn't available in 3.1.

    Yes, I think you should.

    I haven't tried the patch, but it seems you got the logic right. There's a problem sometimes that you're using 2 spaces for indent rather than 4. Also, you forgot to update the RLock implementation in _threadmodule.c (and perhaps add another test for it).

    (there are other modules which use the PyThread_acquire_lock API, but most of the time the locks are held for a short time (and shouldn't deadlock), which makes converting them less of a priority)

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 10, 2010

    Here is a new version of a patch that updates recursive locks to have the same behavior. The pure Python RLock implementaiton should be interruptible by virtue of the base lock acquire primitive being interruptible.

    I've also updated the relevant documentation I could find. I've surely missed some, though.

    I also got rid of the _intr version of lock acquisition and simply added a new parameter to the _timed variant. The only two callers of it are the ones I updated in the _thread module.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 10, 2010

    The RLock version gets different semantics in your patch: it doesn't retry acquiring when a signal is caught. Is there a reason for that?

    By the way, your patch works fine under Windows.

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 10, 2010

    Oops, copy/paste oversight. =/ I wrote a test to verify that it handles signals, and then retries the lock acquire.

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 10, 2010

    Also, thanks for the quick reviews!

    @pitrou
    Copy link
    Member

    pitrou commented Jul 11, 2010

    The latest patch looks good to me.

    @pitrou pitrou self-assigned this Jul 11, 2010
    @pitrou
    Copy link
    Member

    pitrou commented Jul 12, 2010

    Actually, there is a problem in Lock.acquire and RLock.acquire. If a signal occurs and signal handling returns successfully, acquiring the lock will be retried without decrementing the timeout first. Therefore, we may end up waiting longer than the user wanted.

    I'm not sure how to tackle that: either we accept that an incoming signal will make the wait longer, or we fix it by properly decrementing the timeout (which will complicate things a bit, especially for cross-platform time querying - but see bpo-9079 which might help us).

    @gpshead
    Copy link
    Member

    gpshead commented Jul 12, 2010

    Accepting that the timeout is not perfect in the face of signals as a caveat is fine (document the possibility) and can be improved later.

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 13, 2010

    Alternatively, do you think it would be better to ignore interrupts when a timeout is passed? If a timeout is passed, the lock acquire will eventually fail in a deadlock situation, and the signal will be handled in the eval loop.

    However, if the timeout is sufficiently long, this is still a problem.

    I'd prefer to do that or use gettimeofday from _time than leave this as is.

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Jul 18, 2010

    Waiting until the portability hacks for gettimeofday make it into core Python.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 13, 2010

    The portability API is now available (see Include/pytime.h).

    @rnk
    Copy link
    Mannequin

    rnk mannequin commented Aug 15, 2010

    Added a patch that adds support for recomputing the timeout, plus a test for it.

    Can this still make it into 3.2, or is it too disruptive at this point in the release process?

    @pitrou
    Copy link
    Member

    pitrou commented Aug 15, 2010

    Added a patch that adds support for recomputing the timeout, plus a test for it.

    Can this still make it into 3.2, or is it too disruptive at this point
    in the release process?

    No problem at this point, we're not yet in beta phase.
    I haven't looked at the patch itself, but thank you.

    @pitrou pitrou removed their assignment Dec 15, 2010
    @pitrou
    Copy link
    Member

    pitrou commented Dec 15, 2010

    Committed in r87292. Thank you for doing this!

    @pitrou pitrou closed this as completed Dec 15, 2010
    @vstinner
    Copy link
    Member

    vstinner commented Jan 7, 2011

    The test fails on FreeBSD 6.2 (x86 FreeBSD py3k buildbot) since r87292 => bpo-10720. I reopen the issue.

    @vstinner vstinner reopened this Jan 7, 2011
    @pitrou
    Copy link
    Member

    pitrou commented Jan 7, 2011

    Well, since bpo-10720 exists, while do you reopen this one?

    @pitrou pitrou closed this as completed Feb 16, 2011
    @superbobry
    Copy link
    Mannequin

    superbobry mannequin commented Feb 23, 2016

    Is it possible to backport this patch to 2.7?

    @vstinner
    Copy link
    Member

    Is it possible to backport this patch to 2.7?

    No.

    Python 2.7 supports many implementations of threads:

    $ ls Python/thread_*h -1
    Python/thread_atheos.h
    Python/thread_beos.h
    Python/thread_cthread.h
    Python/thread_foobar.h
    Python/thread_lwp.h
    Python/thread_nt.h
    Python/thread_os2.h
    Python/thread_pth.h
    Python/thread_pthread.h
    Python/thread_sgi.h
    Python/thread_solaris.h
    Python/thread_wince.h

    Supporting signals on all implementations would be a huge work.

    Handling correctly signals is also a complex task. For example, the PEP-475 changed how Python 3.5 handles signals (retry interrupted syscall), the change is larger than just the threading module. This PEP has an impact on the whole CPython code base (C and Python).

    In Python, unit tests are important. The change b750c45a772c added many unit tests, I expect that it will be tricky to make them reliable on all platforms supported by Python 2.7. I dislike the idea of backporting the feature just for a few platforms.

    I mean that the change b750c45a772c alone is not enough, this change depends on many earlier changes used to cleanup/prepare the code to support this change. It is also followed by many changes to enhance how Python handles signals. It took multiple years to fix all subtle issues, race conditions, etc.

    There is also a significant risk of breaking applications relying on the current behaviour of Python 2.7 (locks are not interrupted by signals).

    @gpshead
    Copy link
    Member

    gpshead commented Sep 28, 2016

    As a data point confirming that: we tried backporting the much nicer non-polling condition implementation (https://github.com/python/cpython/commit/15801a1d52c25fa2a19d649ea2671080f138fca1.patch) to our Python 2.7 interpreter at work but ran into actual code that failed as a result of the interruption behavior changing.

    While it is a nice goal, it is a non-trivial change. Not something we could ever do within a stable release (2.7).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants