Author njs
Recipients njs, vstinner
Date 2017-08-04.08:36:32
Message-id <1501835793.8.0.586485802562.issue31119@psf.upfronthosting.co.za>
In-reply-to
Content
Sometimes, CPython's signal handler (signalmodule.c:signal_handler) runs in a different thread from the main thread. On Unix this is rare but it does happen (esp. for SIGCHLD), and on Windows it happens 100% of the time. It turns out that there is a subtle race condition that means such signals can be lost.

To notify the main thread that a signal has been received, and which signal it is, CPython's signal handler sets two global variables:

  Handlers[sig_num].tripped = 1;
  is_tripped = 1;

And then PyErr_CheckSignals does:

  if (is_tripped) {
      is_tripped = 0;
      for (i = 1; i < NSIG; i++) {
          if (Handlers[i].tripped) {
              ... run signal handler ...
          }
      }
  }

As some comments in the source note, this logic depends strongly on the assumption that when the signal handler sets the two flags, they are immediately visible to PyErr_CheckSignals, and that the two assignments happen in the given order. For example, suppose that is_tripped were set before Handlers[i].tripped. Then we could have the following sequence:

1. Thread A sets is_tripped = 1
2. Thread B runs PyErr_CheckSignals
3. Thread B notices is_tripped == 1, and sets it to 0
4. Thread B scans through all the Handlers, sees that none are set, and carries on
5. Thread A sets Handlers[i].tripped = 1

In this case the signal is lost until another signal arrives, which may not happen until an arbitrary amount of time has passed.
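
To make the hazard concrete, here's a simplified, self-contained model of the pattern above (illustrative only -- the real flags live inside signalmodule.c's Handlers[NSIG] array of structs):

  #include <signal.h>   /* sig_atomic_t, NSIG */

  /* Simplified stand-ins for signalmodule.c's globals. */
  static volatile sig_atomic_t tripped[NSIG];  /* one flag per signal       */
  static volatile sig_atomic_t is_tripped;     /* "some signal has arrived" */

  /* Thread A: the C-level signal handler. */
  static void signal_handler(int sig)
  {
      tripped[sig] = 1;   /* written first in the source...                 */
      is_tripped = 1;     /* ...but if thread B observes this store first,
                             the interleaving in steps 1-5 above is possible */
  }

  /* Thread B: PyErr_CheckSignals, simplified. */
  static int check_signals(void)
  {
      if (!is_tripped)
          return 0;
      is_tripped = 0;                  /* step 3: consume the "any signal" flag */
      for (int i = 1; i < NSIG; i++) {
          /* Step 4: this can still read 0 even though thread A has already
             executed tripped[sig] = 1, and then the signal is dropped. */
          if (tripped[i]) {
              tripped[i] = 0;
              /* ... run the Python-level handler for signal i ... */
          }
      }
      return 0;
  }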

Also, the fix for bpo-30038 (making sure that we set the flags before writing to the wakeup fd) assumes that ordering the operations that way in the source means the flags actually become visible before the write to the fd does.

However, when you have code running in separate threads, none of this is actually guaranteed. In general, the compiler is free to re-order writes, and more perniciously, the hardware memory hierarchy is also free to arbitrarily delay when writes to RAM made by one CPU become visible to another CPU, which can also create arbitrary reorderings.

In an attempt to avoid these issues, signalmodule.c declares the signal flags as 'volatile sig_atomic_t'. However, the best one can hope for from this is that it'll stop the *compiler* from reordering writes. It does nothing to stop the *hardware* from reordering writes. Using volatile like this might be sufficient if the signal handler runs in the main thread, but not if it runs in another thread.
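
To illustrate the distinction in plain C11 (not CPython code -- CPython can't rely on <stdatomic.h> being available everywhere, which is roughly what pyatomic.h papers over): a release store paired with an acquire load provides exactly the cross-thread ordering guarantee that volatile does not:

  #include <stdatomic.h>

  /* Unlike 'volatile', these give an ordering guarantee between threads:
     everything stored before a release store is visible to any thread
     whose acquire load observes that store. */
  atomic_int tripped;
  atomic_int is_tripped;

  void handler(int sig)                       /* writer thread */
  {
      (void)sig;
      atomic_store_explicit(&tripped, 1, memory_order_release);
      atomic_store_explicit(&is_tripped, 1, memory_order_release);
  }

  int check(void)                             /* reader thread */
  {
      if (!atomic_load_explicit(&is_tripped, memory_order_acquire))
          return 0;
      /* The acquire load above synchronizes with the release store in
         handler(), so the earlier store to 'tripped' must be visible. */
      return atomic_load_explicit(&tripped, memory_order_acquire);
  }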

The correct way to handle this is to use the primitives in pyatomic.h to get/set the tripped flags.
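
A rough sketch of what that could look like (assuming pyatomic.h's _Py_atomic_int type and its sequentially-consistent _Py_atomic_store/_Py_atomic_load macros, and making both flags atomic -- the exact shape of the patch is up for debate, this is just to show the idea):

  /* In signalmodule.c (sketch): */
  static _Py_atomic_int is_tripped;

  static struct {
      _Py_atomic_int tripped;
      PyObject *func;
  } Handlers[NSIG];

  /* C-level signal handler, possibly running in a non-main thread: */
  static void
  trip_signal(int sig_num)
  {
      _Py_atomic_store(&Handlers[sig_num].tripped, 1);
      /* Sequentially-consistent store: once another thread observes
         is_tripped == 1, the store above is guaranteed to be visible
         to it as well. */
      _Py_atomic_store(&is_tripped, 1);
      /* ...and only then write to the wakeup fd (bpo-30038). */
  }

  /* Main thread: */
  int
  PyErr_CheckSignals(void)
  {
      if (!_Py_atomic_load(&is_tripped))
          return 0;
      /* Clear the "any signal" flag *before* scanning, so a signal that
         arrives mid-scan re-sets it and gets handled on the next call. */
      _Py_atomic_store(&is_tripped, 0);
      for (int i = 1; i < NSIG; i++) {
          if (_Py_atomic_load(&Handlers[i].tripped)) {
              _Py_atomic_store(&Handlers[i].tripped, 0);
              /* ... invoke the registered Python-level handler ... */
          }
      }
      return 0;
  }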

----

I noticed this because of the same test case that was failing due to bpo-30038... this thing has been frustrating me for months because it's intermittent and never seems to happen on my local VM. But specifically what I'm seeing at this point is:

Thread A (the thread where the signal handler runs):
A1. takes a lock.
A2. with the lock held, calls the C function raise(SIGINT), which directly calls the CPython C-level signal handler. (Verified by consulting the Windows C runtime source.)
A3. writes to the wakeup fd & sets both tripped flags, in whatever order
A4. releases the lock

Simultaneously, in thread B (the main thread):
B1. observes that the wakeup fd has been written to
B2. takes the lock
B3. releases the lock
B4. calls repr(), which unconditionally calls PyObject_Repr, which unconditionally calls PyErr_CheckSignals

I expect the call in B4 to run the Python-level signal handler, but this does not happen. Instead it runs some time later, causing an unexpected KeyboardInterrupt.
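
For concreteness, here's a rough self-contained model of the scenario (not the real test: pthreads and a pipe stand in for the actual Windows threads and wakeup fd, and a stand-in handler mimics what CPython's does -- set the flags, then write to the wakeup fd):

  #include <pthread.h>
  #include <signal.h>
  #include <stdio.h>
  #include <unistd.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static int wakeup_fd[2];                    /* [0] = read end, [1] = write end */
  static volatile sig_atomic_t tripped, is_tripped;

  static void handler(int sig)                /* stand-in for the C-level handler */
  {
      (void)sig;
      tripped = 1;
      is_tripped = 1;
      write(wakeup_fd[1], "x", 1);            /* flags first, then wakeup fd */
  }

  static void *thread_a(void *arg)            /* the thread the signal lands in */
  {
      (void)arg;
      pthread_mutex_lock(&lock);              /* A1 */
      raise(SIGINT);                          /* A2/A3: handler runs right here */
      pthread_mutex_unlock(&lock);            /* A4 */
      return NULL;
  }

  int main(void)                              /* thread B, the "main" thread */
  {
      char buf;
      pthread_t a;
      pipe(wakeup_fd);
      signal(SIGINT, handler);
      pthread_create(&a, NULL, thread_a, NULL);

      read(wakeup_fd[0], &buf, 1);            /* B1: wakeup fd was written to */
      pthread_mutex_lock(&lock);              /* B2 */
      pthread_mutex_unlock(&lock);            /* B3 */
      /* B4: in the real test this is repr() -> PyObject_Repr() ->
         PyErr_CheckSignals(); the expectation is that the flags set in
         A2/A3 are visible by now. */
      printf("is_tripped=%d tripped=%d\n", (int)is_tripped, (int)tripped);

      pthread_join(a, NULL);
      return 0;
  }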

My reasoning is: the lock guarantees that no matter what the internal details of the C-level signal handler actually are, they all have to be complete before my code checks for signals. In excruciating detail:

- the write to the wakeup fd must happen-after the lock is acquired (A1)
- therefore, step B1 happens-after step A1
- B2 happens-after B1, so B2 happens-after A1.
- B2 takes the same lock as A1, so if B2 happens-after A1, it must also happen-after A4, when the lock is released.
- step B4 happens-after B2, therefore it happens-after A4 and A3. Therefore the tripped flags should be visible and the Python-level signal handler should run.

Yet this does not happen, so *something* really weird is going on. The hypothesis that it's the lack of memory barriers explains both how this could happen -- specifically, I think thread B may be running PyErr_CheckSignals as part of the wakeup-fd reading logic and clearing is_tripped before Handlers[i].tripped becomes visible, like in the sequence above -- and why it happens in roughly 1 in 50 runs on Appveyor but never on my local Windows 10 VM, which has only 1 virtual CPU.

But regardless of whether this is the cause of my particular weird bug, it clearly *is* a bug and should be fixed.