classification
Title: Race condition in how trip_signal writes to wakeup fd
Type: Stage: resolved
Components: Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ned.deily, njs, pitrou, steve.dower, vstinner, yselivanov
Priority: normal Keywords:

Created on 2017-04-11 09:11 by njs, last changed 2017-06-29 22:35 by njs. This issue is now closed.

Files
File name Uploaded Description Edit
wakeup-fd-racer.py njs, 2017-04-12 02:13
Pull Requests
URL Status Linked Edit
PR 1082 merged njs, 2017-04-11 09:57
PR 2075 merged vstinner, 2017-06-10 08:56
PR 2257 closed ned.deily, 2017-06-17 03:02
PR 2258 merged ned.deily, 2017-06-17 03:31
PR 2259 closed ned.deily, 2017-06-17 06:06
Messages (27)
msg291467 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-11 09:11
In trip_signal [1], the logic goes:

1) set the flag saying that this particular signal was tripped
2) write to the wakeup fd
3) set the global is_tripped flag saying "at least one signal was tripped", and do Py_AddPendingCall (which sets some global flags that the bytecode interpreter checks on every pass through the loop)

So the problem here is that it's step (2) that wakes up the main thread to check for signals, but it's step (3) that actually arranges for the Python-level signal handler to run. (Step (1) turns out to be irrelevant, because no-one looks at the per-signal flags unless the global is_tripped flag is set. This might be why no-one noticed this bug through code inspection though – I certainly missed it, despite explicitly checking for it several times!)

The result is that the following sequence of events is possible:

- signal arrives (e.g. SIGINT)
- trip_signal writes to the wakeup fd
- the main thread blocked in IO wait sees this, and wakes up
- the main thread checks for signals, and doesn't find any
- the main thread empties the wakeup fd
- the main thread goes back to sleep
- trip_signal sets the flags to request the Python-level signal handler be run
- the main thread doesn't notice, because it's asleep

It turns out that this is a real thing that can actually happen; it's causing an annoying intermittent failure in the trio testsuite on appveyor; and under the correct conditions I can reproduce it very reliably in my local Windows VM. See [2].

I think the fix is just to swap the order of steps (2) and (3), so we write to the wakeup fd last. Unfortunately I can't easily test this because I don't have a way to build CPython on Windows. But [2] has some IMHO pretty compelling evidence that this is what's happening.

[1] https://github.com/python/cpython/blob/6fab78e9027f9ebd6414995580781b480433e595/Modules/signalmodule.c#L238-L291
[2] https://github.com/python-trio/trio/issues/119
msg291474 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-04-11 10:20
Last time I had to make a major change related to signal handling, it was in the asyncio module because of a race conditon which occurred on FreeBSD.

commit fe5649c7b7bf52147480d6b1124a3d8e3597aee3
Author: Victor Stinner <victor.stinner@gmail.com>
Date:   Thu Jul 17 22:43:40 2014 +0200

    Python issue #21645, Tulip issue 192: Rewrite signal handling
    
    Since Python 3.3, the C signal handler writes the signal number into the wakeup
    file descriptor and then schedules the Python call using Py_AddPendingCall().
    
    asyncio uses the wakeup file descriptor to wake up the event loop, and relies
    on Py_AddPendingCall() to schedule the final callback with call_soon().
    
    If the C signal handler is called in a thread different than the thread of the
    event loop, the loop is awaken but Py_AddPendingCall() was not called yet. In
    this case, the event loop has nothing to do and go to sleep again.
    Py_AddPendingCall() is called while the event loop is sleeping again and so the
    final callback is not scheduled immediatly.
    
    This patch changes how asyncio handles signals. Instead of relying on
    Py_AddPendingCall() and the wakeup file descriptor, asyncio now only relies on
    the wakeup file descriptor. asyncio reads signal numbers from the wakeup file
    descriptor to call its signal handler.
msg291476 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-11 10:35
Right. My claim would be that the PR I just submitted is the correct fix for bpo-21645 as well.

The approach asyncio uses is very elegant, but unfortunately it assumes that the wakeup fd has infinite buffer, which isn't true. If enough signals or other callbacks are scheduled to the event loop quickly enough, then the buffer can overflow and signals can get lost. This is a very classic issue and source of confusion for everyone learning Unix – the reason that there are traditionally 32 unix signals, and that signals can be coalesced and SIGCHLD is so annoying, is that the kernel treats pending signals as flags, not a queue; this means that in the worst case it only takes 32 bits to store all pending signals. (Even the posix committee screwed this up when designing the real-time signals system, which is almost unusable as a result.) On Unix, if you send a process 10 SIGHUP and 1 SIGCHLD, then it might get only 1 SIGHUP and 1 SIGCHLD, but it's guaranteed to get that. With asyncio's model, it might get 10 SIGHUP and 0 SIGCHLD, or even 0 SIGHUP and 0 SIGCHLD (since other methods like call_soon_threadsafe also write to the wakeup fd).

In any case, this is why other async libraries (at least twisted, libuv, tornado, trio) treat the wakeup fd as a boolean empty-or-not, and pass the actual information via some other out-of-band mechanism that involves a Python-level signal handler. So the patch here is necessary for everyone else to safely use set_wakeup_fd.
msg291478 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-04-11 10:39
Previous changes in signal handling. It's the commit c13ef66649985025382c64f6af8e3b956411e05b of the issue #8407 which changed the order: <add_pending, write into wakeup fd> became <write into wakeup fd, add_pending>.

I really *hate* having to think to these evil things which are signals and threads... Signals and threads don't go well altogether :-p It reminds me the "Ghosts of Unix past, part 3: Unfixable designs" article...
https://lwn.net/Articles/414618/

commit 6c9b35bfe2585af08ea6480294e096e2d2397fe3
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date:   Mon Apr 18 16:25:56 2011 +0200

    Issue #11768: The signal handler of the signal module only calls
    Py_AddPendingCall() for the first signal to fix a deadlock on reentrant or
    parallel calls. PyErr_SetInterrupt() writes also into the wake up file.

commit c13ef66649985025382c64f6af8e3b956411e05b
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date:   Wed May 25 02:35:58 2011 +0200

    Issue #8407: Fix the signal handler of the signal module: if it is called
    twice, it now writes the number of the second signal into the wakeup fd.
msg291479 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-11 10:40
Err, libuv obviously doesn't use a Python-level signal handler. I just meant to include them as another example of a library I checked that uses a self-pipe to handle signals but relies on out-of-band information to transmit what the actual signal is :-).
msg291480 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-11 10:43
I think the idea in c13ef6664998 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore.
msg291484 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-11 11:00
If it helps, notice that the SetEvent(sigint_event) call used to wake up the main thread on windows is also performed unconditionally and after the call to Py_AddPendingEvent. From the point of view of twisted/tornado/trio, this is exactly the same as the write to the wakeup fd -- the only reason we use the wakeup fd instead of the sigint_event is that it's more convenient to wait on an fd than on an event object.
msg291487 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-04-11 12:12
"I think the idea in c13ef6664998 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore."

I agree. It wasn't deliberate to change the order, it's was more to write a short patch fixing a very specific use case.
msg291488 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-04-11 12:13
I know that it can be very difficult to write such test, but can you please try to write a script trying to reproduce the describe the race condition?

Later, we can run the script in a test to check for non-regression.
msg291525 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-12 02:13
The attached script wakeup-fd-racer.py fails consistently for me using cpython 3.6.0 on my windows 10 vm:

> python wakeup-fd-racer.py
Attempt 0: start
Attempt 0: FAILED, took 10.016000000000076 seconds
select_calls = 2

(It may help that the VM only has 1 CPU? But the same test passes on Linux even when I use taskset to restrict it to 1 cpu. Maybe Windows has some scheduling heuristic where one thread writing to a socket when another thread is blocked on it triggers an immediate context switch.)
msg293790 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-05-16 21:12
New changeset 4ae01496971624c75080431806ed1c08e00f22c7 by Victor Stinner (Nathaniel J. Smith) in branch 'master':
bpo-30038: fix race condition in signal delivery + wakeup fd (#1082)
https://github.com/python/cpython/commit/4ae01496971624c75080431806ed1c08e00f22c7
msg293791 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-05-16 21:16
While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

Do you consider that it's worth it to backport the change to 3.5 and 3.6?

Python 2.7 is not affected, right?
msg293839 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-05-17 05:43
> While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

Unfortunately this is the only 100% reliable way to do signal handling with an event loop so I guess you are unlikely to convince any of the projects that use it this way to change their minds :-)

> Do you consider that it's worth it to backport the change to 3.5 and 3.6?

I guess it'd be nice? At least for 3.6? I just had yet another random PR fail its tests due to this bug -- it's turning out to be quite difficult to make my tests reliable :-/. (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)
msg293842 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-05-17 06:06
> (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)

Never mind on this... it looks like calling repr() on any object is sufficient.
msg295488 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-06-09 04:50
I guess now would be a good time to decide whether this should be backported to 3.6, with 3.6.2 coming out in a few days :-). (Or if not, then it can probably be closed?)
msg295494 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-09 08:16
@Ned Deily: Hey, Nathaniel wants to backport the commit 4ae01496971624c75080431806ed1c08e00f22c7 which changes how the C signal handler of CPython.

There is a low risk of regression, it can be seen as a backward incompatible change.

I'm not super excited to backport such change. I consider that there are other solutions to handle signals without hitting this bug. For example, asyncio works well on Python 3.5, there is no known race condition. Basically, the only impacted application is Nathaniel's Trio project.

@Yury: And you, what do you think?
msg295498 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-06-09 08:35
I think you mean it's backwards *compatible*? There's definitely no change in APIs or ABIs or anything, all that changes is the order of some statements inside Python's signal handler. (We used to we write to the magic wakeup fd and then set a flag saying a signal arrived; now we do the same things but in the opposite order. I wouldn't call it zero risk, just on the general principle that signal handlers are tricksy beasts, but I think it's about as low risk as a signal handler change can be.)

AFAICT the race condition also affects twisted and tornado, though I don't think they've noticed. The main symptom is that it makes control-C handling flaky on Windows, so it's a difficult thing to test for.

I would prefer to see it backported to 3.6, but it's a weird edge case bug so the world is not going to end if the fix has to wait for 3.7.
msg295499 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-09 08:36
> I think you mean it's backwards *compatible*?

I mean incompatible since it changes the behaviour in a subtle way.
msg295501 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-06-09 08:55
But, by that definition, like... every change is backwards incompatible.

I'm pretty confident that no-one was relying on this race condition. I can't even figure out what that would mean.
msg295536 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-06-09 15:00
It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area.  Antoine, Steve, others: thoughts?
msg295591 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-06-09 22:37
> It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area.

I think it's safe to merge this in 3.6.
msg295622 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-10 08:57
Ned Deily and Yury Selivanov are in favor of a backport, thank you for looking at this one guys. So I created the PR 2075. I will merge it once the CI pass.
msg295627 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-10 09:20
New changeset 0b7629cd846e3e160ac07cafc4e97644024f11b8 by Victor Stinner in branch '3.6':
bpo-30038: fix race condition in signal delivery + wakeup fd (#1082) (#2075)
https://github.com/python/cpython/commit/0b7629cd846e3e160ac07cafc4e97644024f11b8
msg295842 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-06-13 03:10
Misc/NEWS entries for 3.6 and master for these changes?
msg296230 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-06-17 03:34
New changeset 6a98a04e21fc3209b2170c3930ce4488e89680e3 by Ned Deily in branch 'master':
bpo-30038: add Misc/NEWS entry. (#2258)
https://github.com/python/cpython/commit/6a98a04e21fc3209b2170c3930ce4488e89680e3
msg296233 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-06-17 06:43
New changeset 0a794a3256b24ccf57b18ec9964f2367ac1f3d30 by Ned Deily in branch '3.6':
bpo-30038: add Misc/NEWS entry.
https://github.com/python/cpython/commit/0a794a3256b24ccf57b18ec9964f2367ac1f3d30
msg297324 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-06-29 22:24
Why is this PR not closed?
History
Date User Action Args
2017-06-29 22:35:42njssetstatus: open -> closed
resolution: fixed
stage: resolved
2017-06-29 22:24:08pitrousetmessages: + msg297324
2017-06-17 06:43:15ned.deilysetmessages: + msg296233
2017-06-17 06:06:36ned.deilysetpull_requests: + pull_request2309
2017-06-17 03:34:34ned.deilysetmessages: + msg296230
2017-06-17 03:31:57ned.deilysetpull_requests: + pull_request2308
2017-06-17 03:02:00ned.deilysetpull_requests: + pull_request2307
2017-06-13 03:10:17ned.deilysetmessages: + msg295842
2017-06-10 09:20:05vstinnersetmessages: + msg295627
2017-06-10 08:57:31vstinnersetmessages: + msg295622
2017-06-10 08:56:46vstinnersetpull_requests: + pull_request2138
2017-06-09 22:37:25yselivanovsetmessages: + msg295591
2017-06-09 15:00:11ned.deilysetnosy: + pitrou, steve.dower
messages: + msg295536
2017-06-09 08:55:24njssetmessages: + msg295501
2017-06-09 08:36:13vstinnersetmessages: + msg295499
2017-06-09 08:35:11njssetmessages: + msg295498
2017-06-09 08:16:03vstinnersetnosy: + ned.deily, yselivanov
messages: + msg295494
2017-06-09 04:50:16njssetmessages: + msg295488
2017-05-17 06:06:08njssetmessages: + msg293842
2017-05-17 05:43:27njssetmessages: + msg293839
2017-05-16 21:16:50vstinnersetmessages: + msg293791
2017-05-16 21:12:13vstinnersetmessages: + msg293790
2017-04-12 02:13:39njssetfiles: + wakeup-fd-racer.py

messages: + msg291525
2017-04-11 12:13:56vstinnersetmessages: + msg291488
2017-04-11 12:12:46vstinnersetmessages: + msg291487
2017-04-11 11:00:58njssetmessages: + msg291484
2017-04-11 10:43:31njssetmessages: + msg291480
2017-04-11 10:40:27njssetmessages: + msg291479
2017-04-11 10:39:41vstinnersetmessages: + msg291478
2017-04-11 10:35:34njssetmessages: + msg291476
2017-04-11 10:20:13vstinnersetmessages: + msg291474
2017-04-11 09:57:25njssetpull_requests: + pull_request1226
2017-04-11 09:11:23njscreate