Author chris.jerdonek
Recipients aeros, asvetlov, chris.jerdonek, njs, vstinner, yselivanov
Date 2020-05-10.03:27:33
Message-id <1589081253.96.0.230030852578.issue38323@roundup.psfhosted.org>
In-reply-to
Content
I came up with the script by (1) running the test locally and seeing the same hang, (2) moving the test function to its own script separate from the unit tests and seeing the same hang, and (3) successively stripping away code while continuing to check for the same hang. So it should be equivalent.

As for why it's related to signals, it's because of what the script does (it's not waiting on a subprocess). All it does is start an event loop and then do the following repeatedly:

1. start a subprocess that sleeps indefinitely
2. create an empty future
3. set a SIGCHLD handler that calls set_result() on the future
4. use call_later() to terminate the future after 5 seconds
5. kill the subprocess
6. await the future

Almost all of the time, (6) completes immediately (because the handler is called immediately). But sometimes, (6) takes 5 seconds (which means the timeout fired). In those cases, I'm able to observe both that (a) Python received the SIGCHLD right away, and (b) the signal handler only gets called when the loop is woken up by the call_later(). So during the await in (6), it seems like Python is holding onto the signal for 5 seconds without calling its signal handler.
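
For readers who want to see the shape of one iteration, here is a rough sketch of the steps listed above. It is not the actual script from this issue. It assumes a Unix platform, and it installs the handler with loop.add_signal_handler(), which may differ from how the original script set its handler, so it may well not reproduce the intermittent delay. The names run_once and resolve, the "signal"/"timeout" markers, and the reading of "terminate the future" as "resolve it with a timeout marker" are all assumptions made for illustration.

```python
import asyncio
import signal
import subprocess
import sys
import time


async def run_once(timeout=5.0):
    """Run the listed steps once; return (how_the_future_resolved, seconds_waited)."""
    loop = asyncio.get_running_loop()

    # Start a subprocess that sleeps for a long time (stand-in for "indefinitely").
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(3600)"]
    )

    # Create an empty future.
    fut = loop.create_future()

    def resolve(reason):
        # Hypothetical helper: guard so the future is only resolved once,
        # whichever of the signal or the timer gets there first.
        if not fut.done():
            fut.set_result(reason)

    # Set a SIGCHLD handler that calls set_result() on the future.
    # Assumption: add_signal_handler() is used here for safety; the original
    # script may have installed its handler differently.
    loop.add_signal_handler(signal.SIGCHLD, resolve, "signal")

    # Use call_later() to resolve the future after `timeout` seconds.
    timer = loop.call_later(timeout, resolve, "timeout")

    try:
        # Kill the subprocess; its exit should deliver SIGCHLD to us.
        start = time.monotonic()
        proc.kill()

        # Await the future and record how long the wait took.
        reason = await fut
        return reason, time.monotonic() - start
    finally:
        timer.cancel()
        loop.remove_signal_handler(signal.SIGCHLD)
        proc.wait()  # reap the killed child
```

Calling asyncio.run(run_once()) repeatedly and printing the returned pair would show whether a given iteration was woken by the signal or by the timer, and how long the await took.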