Race condition in how trip_signal writes to wakeup fd #74224

njsmith · 2017-04-11T09:11:23Z

BPO	30038
Nosy	@pitrou, @vstinner, @ned-deily, @njsmith, @1st1, @zooba
PRs	bpo-30038: fix race condition in signal delivery + wakeup fd #1082 [3.6] bpo-30038: fix race condition in signal delivery + wakeup fd (#1082) #2075 bpo-30038: add Misc/NEWS entry. #2257 bpo-30038: add Misc/NEWS entry. #2258 bpo-30038: add Misc/NEWS entry. #2259
Files	wakeup-fd-racer.py

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2017-06-29.22:35:42.844>
created_at = <Date 2017-04-11.09:11:22.985>
labels = ['3.7']
title = 'Race condition in how trip_signal writes to wakeup fd'
updated_at = <Date 2017-06-29.22:35:42.844>
user = 'https://github.com/njsmith'

bugs.python.org fields:

activity = <Date 2017-06-29.22:35:42.844>
actor = 'njs'
assignee = 'none'
closed = True
closed_date = <Date 2017-06-29.22:35:42.844>
closer = 'njs'
components = []
creation = <Date 2017-04-11.09:11:22.985>
creator = 'njs'
dependencies = []
files = ['46797']
hgrepos = []
issue_num = 30038
keywords = []
message_count = 27.0
messages = ['291467', '291474', '291476', '291478', '291479', '291480', '291484', '291487', '291488', '291525', '293790', '293791', '293839', '293842', '295488', '295494', '295498', '295499', '295501', '295536', '295591', '295622', '295627', '295842', '296230', '296233', '297324']
nosy_count = 6.0
nosy_names = ['pitrou', 'vstinner', 'ned.deily', 'njs', 'yselivanov', 'steve.dower']
pr_nums = ['1082', '2075', '2257', '2258', '2259']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue30038'
versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

njsmith · 2017-04-11T09:11:23Z

In trip_signal [1], the logic goes:

set the flag saying that this particular signal was tripped
write to the wakeup fd
set the global is_tripped flag saying "at least one signal was tripped", and do Py_AddPendingCall (which sets some global flags that the bytecode interpreter checks on every pass through the loop)

So the problem here is that it's step (2) that wakes up the main thread to check for signals, but it's step (3) that actually arranges for the Python-level signal handler to run. (Step (1) turns out to be irrelevant, because no-one looks at the per-signal flags unless the global is_tripped flag is set. This might be why no-one noticed this bug through code inspection though – I certainly missed it, despite explicitly checking for it several times!)

The result is that the following sequence of events is possible:

signal arrives (e.g. SIGINT)
trip_signal writes to the wakeup fd
the main thread blocked in IO wait sees this, and wakes up
the main thread checks for signals, and doesn't find any
the main thread empties the wakeup fd
the main thread goes back to sleep
trip_signal sets the flags to request the Python-level signal handler be run
the main thread doesn't notice, because it's asleep

It turns out that this is a real thing that can actually happen; it's causing an annoying intermittent failure in the trio testsuite on appveyor; and under the correct conditions I can reproduce it very reliably in my local Windows VM. See [2].

I think the fix is just to swap the order of steps (2) and (3), so we write to the wakeup fd last. Unfortunately I can't easily test this because I don't have a way to build CPython on Windows. But [2] has some IMHO pretty compelling evidence that this is what's happening.

[1]

cpython/Modules/signalmodule.c

Lines 238 to 291 in 6fab78e

    
           static void 
        
           trip_signal(int sig_num) 
        
           { 
        
               unsigned char byte; 
        
               int fd; 
        
               Py_ssize_t rc; 
        
               Handlers[sig_num].tripped = 1; 
        
           #ifdef MS_WINDOWS 
        
               fd = Py_SAFE_DOWNCAST(wakeup.fd, SOCKET_T, int); 
        
           #else 
        
               fd = wakeup_fd; 
        
           #endif 
        
               if (fd != INVALID_FD) { 
        
                   byte = (unsigned char)sig_num; 
        
           #ifdef MS_WINDOWS 
        
                   if (wakeup.use_send) { 
        
                       do { 
        
                           rc = send(fd, &byte, 1, 0); 
        
                       } while (rc < 0 && errno == EINTR); 
        
                       /* we only have a storage for one error in the wakeup structure */ 
        
                       if (rc < 0 && !wakeup.send_err_set) { 
        
                           wakeup.send_err_set = 1; 
        
                           wakeup.send_errno = errno; 
        
                           wakeup.send_win_error = GetLastError(); 
        
                           Py_AddPendingCall(report_wakeup_send_error, NULL); 
        
                       } 
        
                   } 
        
                   else 
        
           #endif 
        
                   { 
        
                       byte = (unsigned char)sig_num; 
        
                       /* _Py_write_noraise() retries write() if write() is interrupted by 
        
                          a signal (fails with EINTR). */ 
        
                       rc = _Py_write_noraise(fd, &byte, 1); 
        
                       if (rc < 0) { 
        
                           Py_AddPendingCall(report_wakeup_write_error, 
        
                                             (void *)(intptr_t)errno); 
        
                       } 
        
                   } 
        
               } 
        
               if (!is_tripped) { 
        
                   /* Set is_tripped after setting .tripped, as it gets 
        
                      cleared in PyErr_CheckSignals() before .tripped. */ 
        
                   is_tripped = 1; 
        
                   Py_AddPendingCall(checksignals_witharg, NULL); 
        
               } 
        
           }

[2] python-trio/trio#119

vstinner · 2017-04-11T10:20:13Z

Last time I had to make a major change related to signal handling, it was in the asyncio module because of a race conditon which occurred on FreeBSD.

commit fe5649c
Author: Victor Stinner <victor.stinner@gmail.com>
Date: Thu Jul 17 22:43:40 2014 +0200

Python issue bpo-21645, Tulip issue 192: Rewrite signal handling

Since Python 3.3, the C signal handler writes the signal number into the wakeup
file descriptor and then schedules the Python call using Py_AddPendingCall().

asyncio uses the wakeup file descriptor to wake up the event loop, and relies
on Py_AddPendingCall() to schedule the final callback with call_soon().

If the C signal handler is called in a thread different than the thread of the
event loop, the loop is awaken but Py_AddPendingCall() was not called yet. In
this case, the event loop has nothing to do and go to sleep again.
Py_AddPendingCall() is called while the event loop is sleeping again and so the
final callback is not scheduled immediatly.

This patch changes how asyncio handles signals. Instead of relying on
Py_AddPendingCall() and the wakeup file descriptor, asyncio now only relies on
the wakeup file descriptor. asyncio reads signal numbers from the wakeup file
descriptor to call its signal handler.

njsmith · 2017-04-11T10:35:35Z

Right. My claim would be that the PR I just submitted is the correct fix for bpo-21645 as well.

The approach asyncio uses is very elegant, but unfortunately it assumes that the wakeup fd has infinite buffer, which isn't true. If enough signals or other callbacks are scheduled to the event loop quickly enough, then the buffer can overflow and signals can get lost. This is a very classic issue and source of confusion for everyone learning Unix – the reason that there are traditionally 32 unix signals, and that signals can be coalesced and SIGCHLD is so annoying, is that the kernel treats pending signals as flags, not a queue; this means that in the worst case it only takes 32 bits to store all pending signals. (Even the posix committee screwed this up when designing the real-time signals system, which is almost unusable as a result.) On Unix, if you send a process 10 SIGHUP and 1 SIGCHLD, then it might get only 1 SIGHUP and 1 SIGCHLD, but it's guaranteed to get that. With asyncio's model, it might get 10 SIGHUP and 0 SIGCHLD, or even 0 SIGHUP and 0 SIGCHLD (since other methods like call_soon_threadsafe also write to the wakeup fd).

In any case, this is why other async libraries (at least twisted, libuv, tornado, trio) treat the wakeup fd as a boolean empty-or-not, and pass the actual information via some other out-of-band mechanism that involves a Python-level signal handler. So the patch here is necessary for everyone else to safely use set_wakeup_fd.

vstinner · 2017-04-11T10:39:41Z

Previous changes in signal handling. It's the commit c13ef66 of the issue bpo-8407 which changed the order: <add_pending, write into wakeup fd> became <write into wakeup fd, add_pending>.

I really *hate* having to think to these evil things which are signals and threads... Signals and threads don't go well altogether :-p It reminds me the "Ghosts of Unix past, part 3: Unfixable designs" article...
https://lwn.net/Articles/414618/

commit 6c9b35b
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date: Mon Apr 18 16:25:56 2011 +0200

Issue bpo-11768: The signal handler of the signal module only calls
Py_AddPendingCall() for the first signal to fix a deadlock on reentrant or
parallel calls. PyErr_SetInterrupt() writes also into the wake up file.

commit c13ef66
Author: Victor Stinner <victor.stinner@haypocalc.com>
Date: Wed May 25 02:35:58 2011 +0200

Issue bpo-8407: Fix the signal handler of the signal module: if it is called
twice, it now writes the number of the second signal into the wakeup fd.

njsmith · 2017-04-11T10:40:27Z

Err, libuv obviously doesn't use a Python-level signal handler. I just meant to include them as another example of a library I checked that uses a self-pipe to handle signals but relies on out-of-band information to transmit what the actual signal is :-).

njsmith · 2017-04-11T10:43:32Z

I think the idea in c13ef66 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore.

njsmith · 2017-04-11T11:00:59Z

If it helps, notice that the SetEvent(sigint_event) call used to wake up the main thread on windows is also performed unconditionally and after the call to Py_AddPendingEvent. From the point of view of twisted/tornado/trio, this is exactly the same as the write to the wakeup fd -- the only reason we use the wakeup fd instead of the sigint_event is that it's more convenient to wait on an fd than on an event object.

vstinner · 2017-04-11T12:12:47Z

"I think the idea in c13ef66 wasn't so much that we wanted the wakeup fd to be written to first, as that the way the code was written back then, the presence of 'if (is_tripped) return;' meant that it wasn't getting written to *at all* in some cases. Since then the code got restructured and the early return was removed, so this isn't an issue anymore."

I agree. It wasn't deliberate to change the order, it's was more to write a short patch fixing a very specific use case.

vstinner · 2017-04-11T12:13:56Z

I know that it can be very difficult to write such test, but can you please try to write a script trying to reproduce the describe the race condition?

Later, we can run the script in a test to check for non-regression.

njsmith · 2017-04-12T02:13:38Z

The attached script wakeup-fd-racer.py fails consistently for me using cpython 3.6.0 on my windows 10 vm:

python wakeup-fd-racer.py
Attempt 0: start
Attempt 0: FAILED, took 10.016000000000076 seconds
select_calls = 2

(It may help that the VM only has 1 CPU? But the same test passes on Linux even when I use taskset to restrict it to 1 cpu. Maybe Windows has some scheduling heuristic where one thread writing to a socket when another thread is blocked on it triggers an immediate context switch.)

vstinner · 2017-05-16T21:12:14Z

New changeset 4ae0149 by Victor Stinner (Nathaniel J. Smith) in branch 'master':
bpo-30038: fix race condition in signal delivery + wakeup fd (bpo-1082)
4ae0149

vstinner · 2017-05-16T21:16:50Z

While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

Do you consider that it's worth it to backport the change to 3.5 and 3.6?

Python 2.7 is not affected, right?

njsmith · 2017-05-17T05:43:27Z

While I suggest you to *not* use an event loop (wakeup fd pipe/socket handle with select) and signal.signal(), you are true that there is a race condition if you use select() with signal.signal() so I merged your change.

Unfortunately this is the only 100% reliable way to do signal handling with an event loop so I guess you are unlikely to convince any of the projects that use it this way to change their minds :-)

Do you consider that it's worth it to backport the change to 3.5 and 3.6?

I guess it'd be nice? At least for 3.6? I just had yet another random PR fail its tests due to this bug -- it's turning out to be quite difficult to make my tests reliable :-/. (BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)

njsmith · 2017-05-17T06:06:09Z

(BTW do you happen to know any tricks to force CPython to do an immediate PyErr_CheckSignals on Windows?)

Never mind on this... it looks like calling repr() on any object is sufficient.

njsmith · 2017-06-09T04:50:17Z

I guess now would be a good time to decide whether this should be backported to 3.6, with 3.6.2 coming out in a few days :-). (Or if not, then it can probably be closed?)

vstinner · 2017-06-09T08:16:03Z

@ned Deily: Hey, Nathaniel wants to backport the commit 4ae0149 which changes how the C signal handler of CPython.

There is a low risk of regression, it can be seen as a backward incompatible change.

I'm not super excited to backport such change. I consider that there are other solutions to handle signals without hitting this bug. For example, asyncio works well on Python 3.5, there is no known race condition. Basically, the only impacted application is Nathaniel's Trio project.

@yury: And you, what do you think?

njsmith · 2017-06-09T08:35:11Z

I think you mean it's backwards *compatible*? There's definitely no change in APIs or ABIs or anything, all that changes is the order of some statements inside Python's signal handler. (We used to we write to the magic wakeup fd and then set a flag saying a signal arrived; now we do the same things but in the opposite order. I wouldn't call it zero risk, just on the general principle that signal handlers are tricksy beasts, but I think it's about as low risk as a signal handler change can be.)

AFAICT the race condition also affects twisted and tornado, though I don't think they've noticed. The main symptom is that it makes control-C handling flaky on Windows, so it's a difficult thing to test for.

I would prefer to see it backported to 3.6, but it's a weird edge case bug so the world is not going to end if the fix has to wait for 3.7.

vstinner · 2017-06-09T08:36:13Z

I think you mean it's backwards *compatible*?

I mean incompatible since it changes the behaviour in a subtle way.

njsmith · 2017-06-09T08:55:25Z

But, by that definition, like... every change is backwards incompatible.

I'm pretty confident that no-one was relying on this race condition. I can't even figure out what that would mean.

ned-deily · 2017-06-09T15:00:11Z

It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area. Antoine, Steve, others: thoughts?

1st1 · 2017-06-09T22:37:25Z

It sounds like a bug to me, and thus suitable to backport, but it certainly would be worth getting other opinions since this is a tricky and critical area.

I think it's safe to merge this in 3.6.

vstinner · 2017-06-10T08:57:31Z

Ned Deily and Yury Selivanov are in favor of a backport, thank you for looking at this one guys. So I created the PR 2075. I will merge it once the CI pass.

vstinner · 2017-06-10T09:20:06Z

New changeset 0b7629c by Victor Stinner in branch '3.6':
bpo-30038: fix race condition in signal delivery + wakeup fd (bpo-1082) (bpo-2075)
0b7629c

ned-deily · 2017-06-13T03:10:18Z

Misc/NEWS entries for 3.6 and master for these changes?

ned-deily · 2017-06-17T03:34:35Z

New changeset 6a98a04 by Ned Deily in branch 'master':
bpo-30038: add Misc/NEWS entry. (bpo-2258)
6a98a04

ned-deily · 2017-06-17T06:43:16Z

New changeset 0a794a3 by Ned Deily in branch '3.6':
bpo-30038: add Misc/NEWS entry.
0a794a3

pitrou · 2017-06-29T22:24:08Z

Why is this PR not closed?

njsmith added the 3.7 (EOL) end of life label Apr 11, 2017

njsmith closed this as completed Jun 29, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022

anntzer mentioned this issue May 25, 2023

Fix support for Ctrl-C on the macosx backend. matplotlib/matplotlib#25966

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race condition in how trip_signal writes to wakeup fd #74224

Race condition in how trip_signal writes to wakeup fd #74224

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 12, 2017

vstinner commented May 16, 2017

vstinner commented May 16, 2017

njsmith commented May 17, 2017

njsmith commented May 17, 2017

njsmith commented Jun 9, 2017

vstinner commented Jun 9, 2017

njsmith commented Jun 9, 2017

vstinner commented Jun 9, 2017

njsmith commented Jun 9, 2017

ned-deily commented Jun 9, 2017

1st1 commented Jun 9, 2017

vstinner commented Jun 10, 2017

vstinner commented Jun 10, 2017

ned-deily commented Jun 13, 2017

ned-deily commented Jun 17, 2017

ned-deily commented Jun 17, 2017

pitrou commented Jun 29, 2017

Race condition in how trip_signal writes to wakeup fd #74224

Race condition in how trip_signal writes to wakeup fd #74224

Comments

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

njsmith commented Apr 11, 2017

vstinner commented Apr 11, 2017

vstinner commented Apr 11, 2017

njsmith commented Apr 12, 2017

vstinner commented May 16, 2017

vstinner commented May 16, 2017

njsmith commented May 17, 2017

njsmith commented May 17, 2017

njsmith commented Jun 9, 2017

vstinner commented Jun 9, 2017

njsmith commented Jun 9, 2017

vstinner commented Jun 9, 2017

njsmith commented Jun 9, 2017

ned-deily commented Jun 9, 2017

1st1 commented Jun 9, 2017

vstinner commented Jun 10, 2017

vstinner commented Jun 10, 2017

ned-deily commented Jun 13, 2017

ned-deily commented Jun 17, 2017

ned-deily commented Jun 17, 2017

pitrou commented Jun 29, 2017