Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in how trip_signal writes to wakeup fd #74224

Closed
njsmith opened this issue Apr 11, 2017 · 27 comments
Closed

Race condition in how trip_signal writes to wakeup fd #74224

njsmith opened this issue Apr 11, 2017 · 27 comments
Labels
3.7 (EOL) end of life

Comments

@njsmith
Copy link
Contributor

njsmith commented Apr 11, 2017

BPO 30038
Nosy @pitrou, @vstinner, @ned-deily, @njsmith, @1st1, @zooba
PRs
  • bpo-30038: fix race condition in signal delivery + wakeup fd #1082
  • [3.6] bpo-30038: fix race condition in signal delivery + wakeup fd (#1082) #2075
  • bpo-30038: add Misc/NEWS entry. #2257
  • bpo-30038: add Misc/NEWS entry. #2258
  • bpo-30038: add Misc/NEWS entry. #2259
  • Files
  • wakeup-fd-racer.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-06-29.22:35:42.844>
    created_at = <Date 2017-04-11.09:11:22.985>
    labels = ['3.7']
    title = 'Race condition in how trip_signal writes to wakeup fd'
    updated_at = <Date 2017-06-29.22:35:42.844>
    user = 'https://github.com/njsmith'

    bugs.python.org fields:

    activity = <Date 2017-06-29.22:35:42.844>
    actor = 'njs'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-06-29.22:35:42.844>
    closer = 'njs'
    components = []
    creation = <Date 2017-04-11.09:11:22.985>
    creator = 'njs'
    dependencies = []
    files = ['46797']
    hgrepos = []
    issue_num = 30038
    keywords = []
    message_count = 27.0
    messages = ['291467', '291474', '291476', '291478', '291479', '291480', '291484', '291487', '291488', '291525', '293790', '293791', '293839', '293842', '295488', '295494', '295498', '295499', '295501', '295536', '295591', '295622', '295627', '295842', '296230', '296233', '297324']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'vstinner', 'ned.deily', 'njs', 'yselivanov', 'steve.dower']
    pr_nums = ['1082', '2075', '2257', '2258', '2259']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue30038'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 11, 2017

    In trip_signal [1], the logic goes:

    1. set the flag saying that this particular signal was tripped
    2. write to the wakeup fd
    3. set the global is_tripped flag saying "at least one signal was tripped", and do Py_AddPendingCall (which sets some global flags that the bytecode interpreter checks on every pass through the loop)

    So the problem here is that it's step (2) that wakes up the main thread to check for signals, but it's step (3) that actually arranges for the Python-level signal handler to run. (Step (1) turns out to be irrelevant, because no-one looks at the per-signal flags unless the global is_tripped flag is set. This might be why no-one noticed this bug through code inspection though – I certainly missed it, despite explicitly checking for it several times!)

    The result is that the following sequence of events is possible:

    • signal arrives (e.g. SIGINT)
    • trip_signal writes to the wakeup fd
    • the main thread blocked in IO wait sees this, and wakes up
    • the main thread checks for signals, and doesn't find any
    • the main thread empties the wakeup fd
    • the main thread goes back to sleep
    • trip_signal sets the flags to request the Python-level signal handler be run
    • the main thread doesn't notice, because it's asleep

    It turns out that this is a real thing that can actually happen; it's causing an annoying intermittent failure in the trio testsuite on appveyor; and under the correct conditions I can reproduce it very reliably in my local Windows VM. See [2].

    I think the fix is just to swap the order of steps (2) and (3), so we write to the wakeup fd last. Unfortunately I can't easily test this because I don't have a way to build CPython on Windows. But [2] has some IMHO pretty compelling evidence that this is what's happening.

    [1]

    static void
    trip_signal(int sig_num)
    {
    unsigned char byte;
    int fd;
    Py_ssize_t rc;
    Handlers[sig_num].tripped = 1;
    #ifdef MS_WINDOWS
    fd = Py_SAFE_DOWNCAST(wakeup.fd, SOCKET_T, int);
    #else
    fd = wakeup_fd;
    #endif
    if (fd != INVALID_FD) {
    byte = (unsigned char)sig_num;
    #ifdef MS_WINDOWS
    if (wakeup.use_send) {
    do {
    rc = send(fd, &byte, 1, 0);
    } while (rc < 0 && errno == EINTR);
    /* we only have a storage for one error in the wakeup structure */
    if (rc < 0 && !wakeup.send_err_set) {
    wakeup.send_err_set = 1;
    wakeup.send_errno = errno;
    wakeup.send_win_error = GetLastError();
    Py_AddPendingCall(report_wakeup_send_error, NULL);
    }
    }
    else
    #endif
    {
    byte = (unsigned char)sig_num;
    /* _Py_write_noraise() retries write() if write() is interrupted by
    a signal (fails with EINTR). */
    rc = _Py_write_noraise(fd, &byte, 1);
    if (rc < 0) {
    Py_AddPendingCall(report_wakeup_write_error,
    (void *)(intptr_t)errno);
    }
    }
    }
    if (!is_tripped) {
    /* Set is_tripped after setting .tripped, as it gets
    cleared in PyErr_CheckSignals() before .tripped. */
    is_tripped = 1;
    Py_AddPendingCall(checksignals_witharg, NULL);
    }
    }

    [2] python-trio/trio#119

    @njsmith njsmith added the 3.7 (EOL) end of life label Apr 11, 2017
    @vstinner
    Copy link
    Member

    Last time I had to make a major change related to signal handling, it was in the asyncio module because of a race conditon which occurred on FreeBSD.

    commit fe5649c
    Author: Victor Stinner <victor.stinner@gmail.com>
    Date: Thu Jul 17 22:43:40 2014 +0200

    Python issue bpo-21645, Tulip issue 192: Rewrite signal handling
    
    Since Python 3.3, the C signal handler writes the signal number into the wakeup
    file descriptor and then schedules the Python call using Py_AddPendingCall().
    
    asyncio uses the wakeup file descriptor to wake up the event loop, and relies
    on Py_AddPendingCall() to schedule the final callback with call_soon().
    
    If the C signal handler is called in a thread different than the thread of the
    event loop, the loop is awaken but Py_AddPendingCall() was not called yet. In
    this case, the event loop has nothing to do and go to sleep again.
    Py_AddPendingCall() is called while the event loop is sleeping again and so the
    final callback is not scheduled immediatly.
    
    This patch changes how asyncio handles signals. Instead of relying on
    Py_AddPendingCall() and the wakeup file descriptor, asyncio now only relies on
    the wakeup file descriptor. asyncio reads signal numbers from the wakeup file
    descriptor to call its signal handler.
    

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 11, 2017

    Right. My claim would be that the PR I just submitted is the correct fix for bpo-21645 as well.

    The approach asyncio uses is very elegant, but unfortunately it assumes that the wakeup fd has infinite buffer, which isn't true. If enough signals or other callbacks are scheduled to the event loop quickly enough, then the buffer can overflow and signals can get lost. This is a very classic issue and source of confusion for everyone learning Unix – the reason that there are traditionally 32 unix signals, and that signals can be coalesced and SIGCHLD is so annoying, is that the kernel treats pending signals as flags, not a queue; this means that in the worst case it only takes 32 bits to store all pending signals. (Even the posix committee screwed this up when designing the real-time signals system, which is almost unusable as a result.) On Unix, if you send a process 10 SIGHUP and 1 SIGCHLD, then it might get only 1 SIGHUP and 1 SIGCHLD, but it's guaranteed to get that. With asyncio's model, it might get 10 SIGHUP and 0 SIGCHLD, or even 0 SIGHUP and 0 SIGCHLD (since other methods like call_soon_threadsafe also write to the wakeup fd).

    In any case, this is why other async libraries (at least twisted, libuv, tornado, trio) treat the wakeup fd as a boolean empty-or-not, and pass the actual information via some other out-of-band mechanism that involves a Python-level signal handler. So the patch here is necessary for everyone else to safely use set_wakeup_fd.

    @vstinner
    Copy link
    Member

    Previous changes in signal handling. It's the commit c13ef66 of the issue bpo-8407 which changed the order: <add_pending, write into wakeup fd> became <write into wakeup fd, add_pending>.

    I really *hate* having to think to these evil things which are signals and threads... Signals and threads don't go well altogether :-p It reminds me the "Ghosts of Unix past, part 3: Unfixable designs" article...
    https://lwn.net/Articles/414618/

    commit 6c9b35b
    Author: Victor Stinner <victor.stinner@haypocalc.com>
    Date: Mon Apr 18 16:25:56 2011 +0200

    Issue bpo-11768: The signal handler of the signal module only calls
    Py_AddPendingCall() for the first signal to fix a deadlock on reentrant or
    parallel calls. PyErr_SetInterrupt() writes also into the wake up file.
    

    commit c13ef66
    Author: Victor Stinner <victor.stinner@haypocalc.com>
    Date: Wed May 25 02:35:58 2011 +0200

    Issue bpo-8407: Fix the signal handler of the signal module: if it is called
    twice, it now writes the number of the second signal into the wakeup fd.
    

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 11, 2017

    Err, libuv obviously doesn't use a Python-level signal handler. I just meant to include them as another example of a library I checked that uses a self-pipe to handle signals but relies on out-of-band information to transmit what the actual signal is :-).

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 11, 2017

    I think the idea in c13ef66 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore.

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 11, 2017

    If it helps, notice that the SetEvent(sigint_event) call used to wake up the main thread on windows is also performed unconditionally and after the call to Py_AddPendingEvent. From the point of view of twisted/tornado/trio, this is exactly the same as the write to the wakeup fd -- the only reason we use the wakeup fd instead of the sigint_event is that it's more convenient to wait on an fd than on an event object.

    @vstinner
    Copy link
    Member

    "I think the idea in c13ef66 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore."

    I agree. It wasn't deliberate to change the order, it's was more to write a short patch fixing a very specific use case.

    @vstinner
    Copy link
    Member

    I know that it can be very difficult to write such test, but can you please try to write a script trying to reproduce the describe the race condition?

    Later, we can run the script in a test to check for non-regression.

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Apr 12, 2017

    The attached script wakeup-fd-racer.py fails consistently for me using cpython 3.6.0 on my windows 10 vm:

    python wakeup-fd-racer.py
    Attempt 0: start
    Attempt 0: FAILED, took 10.016000000000076 seconds
    select_calls = 2

    (It may help that the VM only has 1 CPU? But the same test passes on Linux even when I use taskset to restrict it to 1 cpu. Maybe Windows has some scheduling heuristic where one thread writing to a socket when another thread is blocked on it triggers an immediate context switch.)

    @vstinner
    Copy link
    Member

    New changeset 4ae0149 by Victor Stinner (Nathaniel J. Smith) in branch 'master':
    bpo-30038: fix race condition in signal delivery + wakeup fd (bpo-1082)
    4ae0149

    @vstinner
    Copy link
    Member

    While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

    Do you consider that it's worth it to backport the change to 3.5 and 3.6?

    Python 2.7 is not affected, right?

    @njsmith
    Copy link
    Contributor Author

    njsmith commented May 17, 2017

    While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

    Unfortunately this is the only 100% reliable way to do signal handling with an event loop so I guess you are unlikely to convince any of the projects that use it this way to change their minds :-)

    Do you consider that it's worth it to backport the change to 3.5 and 3.6?

    I guess it'd be nice? At least for 3.6? I just had yet another random PR fail its tests due to this bug -- it's turning out to be quite difficult to make my tests reliable :-/. (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)

    @njsmith
    Copy link
    Contributor Author

    njsmith commented May 17, 2017

    (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)

    Never mind on this... it looks like calling repr() on any object is sufficient.

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Jun 9, 2017

    I guess now would be a good time to decide whether this should be backported to 3.6, with 3.6.2 coming out in a few days :-). (Or if not, then it can probably be closed?)

    @vstinner
    Copy link
    Member

    vstinner commented Jun 9, 2017

    @ned Deily: Hey, Nathaniel wants to backport the commit 4ae0149 which changes how the C signal handler of CPython.

    There is a low risk of regression, it can be seen as a backward incompatible change.

    I'm not super excited to backport such change. I consider that there are other solutions to handle signals without hitting this bug. For example, asyncio works well on Python 3.5, there is no known race condition. Basically, the only impacted application is Nathaniel's Trio project.

    @yury: And you, what do you think?

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Jun 9, 2017

    I think you mean it's backwards *compatible*? There's definitely no change in APIs or ABIs or anything, all that changes is the order of some statements inside Python's signal handler. (We used to we write to the magic wakeup fd and then set a flag saying a signal arrived; now we do the same things but in the opposite order. I wouldn't call it zero risk, just on the general principle that signal handlers are tricksy beasts, but I think it's about as low risk as a signal handler change can be.)

    AFAICT the race condition also affects twisted and tornado, though I don't think they've noticed. The main symptom is that it makes control-C handling flaky on Windows, so it's a difficult thing to test for.

    I would prefer to see it backported to 3.6, but it's a weird edge case bug so the world is not going to end if the fix has to wait for 3.7.

    @vstinner
    Copy link
    Member

    vstinner commented Jun 9, 2017

    I think you mean it's backwards *compatible*?

    I mean incompatible since it changes the behaviour in a subtle way.

    @njsmith
    Copy link
    Contributor Author

    njsmith commented Jun 9, 2017

    But, by that definition, like... every change is backwards incompatible.

    I'm pretty confident that no-one was relying on this race condition. I can't even figure out what that would mean.

    @ned-deily
    Copy link
    Member

    It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area. Antoine, Steve, others: thoughts?

    @1st1
    Copy link
    Member

    1st1 commented Jun 9, 2017

    It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area.

    I think it's safe to merge this in 3.6.

    @vstinner
    Copy link
    Member

    Ned Deily and Yury Selivanov are in favor of a backport, thank you for looking at this one guys. So I created the PR 2075. I will merge it once the CI pass.

    @vstinner
    Copy link
    Member

    New changeset 0b7629c by Victor Stinner in branch '3.6':
    bpo-30038: fix race condition in signal delivery + wakeup fd (bpo-1082) (bpo-2075)
    0b7629c

    @ned-deily
    Copy link
    Member

    Misc/NEWS entries for 3.6 and master for these changes?

    @ned-deily
    Copy link
    Member

    New changeset 6a98a04 by Ned Deily in branch 'master':
    bpo-30038: add Misc/NEWS entry. (bpo-2258)
    6a98a04

    @ned-deily
    Copy link
    Member

    New changeset 0a794a3 by Ned Deily in branch '3.6':
    bpo-30038: add Misc/NEWS entry.
    0a794a3

    @pitrou
    Copy link
    Member

    pitrou commented Jun 29, 2017

    Why is this PR not closed?

    @njsmith njsmith closed this as completed Jun 29, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants