Title: Race condition in use of _PyOS_SigintEvent on windows
Type: behavior Stage:
Components: Interpreter Core, Windows Versions: Python 3.7, Python 3.6, Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: njs, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2017-04-24 06:55 by njs, last changed 2017-04-24 07:18 by eryksun.

Messages (2)
msg292196 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-24 06:55
As pointed out in this stackoverflow answer:
and since I seem to be collecting signal-handling bugs these days :-), there's a race condition in how the interpreter uses _PyOS_SigintEvent to allow control-C to break out of functions like time.sleep on Windows. Suppose we have a call to time.sleep(), and the user hits control-C while it's running.

What's supposed to happen is:

- the windows implementation of pysleep in Modules/timemodule.c does ResetEvent(hInterruptEvent)
- then it blocks waiting for the interrupt event to be set *or* the timeout to expire
- the C-level signal handler runs in a new thread, which sets the "hey a signal arrived" flag and then sets the event
- the main thread wakes up because the event is set, and runs PyErr_CheckSignals()
- this notices that that the signal has arrived and runs the Python-level handler, all is good

But what can happen instead is:

- before doing CALL_FUNCTION, the eval loop checks to see if any signals have arrived. They haven't.
- then the C implementation of time.sleep starts executing.
- then a signal arrives; the signal handler sets the flag and sets the event
- then the main thread clears the event again
- then it blocks waiting for the event to be set or the timeout to expire. But the C-level signal handler's already done and gone, so we don't realize that the flag is set and we should wake up and run the Python-level signal handler.

The solution is that immediately *after* calling ResetEvent(_PyOS_SigintEvent()) but *before* sleeping, we should call PyErr_CheckSignals(). This catches any signals that arrived before we called ResetEvent, and any signals that arrive after that will cause the event to become set and wake us up, so that eliminates the race condition.

This same race-y pattern seems to apply to appear in Modules/timemodule.c, Modules/_multiprocessing/semaphore.c, and Modules/_winapi.c. _winapi.c also handles the event in a weird way that doesn't make sense to me – if the user hits control-C it raises an OSError instead of running the signal handler? OTOH I *think* Modules/_io/winconsoleio.c already handles the race condition correctly, and I don't dare make a guess about whatever Parser/myreadline.c is doing.
msg292197 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2017-04-24 06:57
Oh, I should also say that this isn't actually affecting me, I just figured that once I was aware of the bug it was worth making a record here. Might be a good starter bug for someone trying to get into CPython's internals :-)
Date User Action Args
2017-04-24 07:18:05eryksunsetnosy: + paul.moore, tim.golden, zach.ware, steve.dower
type: behavior
components: + Windows
2017-04-24 06:57:05njssetmessages: + msg292197
2017-04-24 06:55:43njscreate