
classification
Title: asyncio SIGCHLD scalability problems
Components: asyncio

process
Status: open
Priority: normal
Nosy List: Roman.Evstifeev, asvetlov, holger+lp, njs, twisteroid ambassador, vstinner, yselivanov

Created on 2018-02-05 21:22 by holger+lp, last changed 2022-04-11 14:58 by admin.

Messages (3)
msg311692 - Author: holger (holger+lp) Date: 2018-02-05 21:22
I intended to use the asyncio framework to build an end-to-end test for our software. The test would spawn between 5k and 10k processes and manage the same number of sockets.

When I built a prototype, I ran into scaling issues. Instead of launching our real software, I tested with calls to "sleep 30". Once the started processes began to finish, a SIGCHLD would be delivered to Python, which would then fail with:

 Exception ignored when trying to write to the signal wakeup fd:
 BlockingIOError: [Errno 11] Resource temporarily unavailable

Using strace I saw something like:

send(5, "\0", 1, 0)                     = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12218, 0xbf8592d8, WNOHANG)     = 0
waitpid(12219, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12219
send(5, "\0", 1, 0)                     = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12220, 0xbf8592d8, WNOHANG)     = 0
waitpid(12221, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12221
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12293, si_uid=1001, si_status=0, si_utime=0, si_stime=0} ---
getpid()                                = 11832
write(5, "\21", 1)                      = -1 EAGAIN (Resource temporarily unavailable)
sigreturn({mask=[]})                    = 12221
write(2, "Exception ignored when trying to"..., 64) = 64
write(2, "BlockingIOError: [Errno 11] Reso"..., 61) = 61


Looking at the code, I see that the si_pid of the signal is ignored; instead, waitpid is called for every child process on each SIGCHLD. This doesn't scale well enough for my intended use case.

I think what could be done is one of the following:

* Switch to signalfd for the event notification?
* Use si_pid to report which PID exited, instead of only signaling that there is work to do?
* Use wait(-1, ...) to collect any dead child, if there can be only one SIGCHLD handler?

msg311694 - Author: Nathaniel Smith (njs) (Python committer) Date: 2018-02-05 22:26
There are two separate issues here: the warning spew, which happens because asyncio's internal signal handling code starts losing signals when they arrive too quickly, and the child reaping loop, which polls all the children on every SIGCHLD and so makes reaping N children an O(N**2) operation.
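
As a rough illustration of the O(N**2) cost, the sketch below counts waitpid calls under a simplifying assumption that is not taken from asyncio's code: children exit one at a time, each exit delivers its own SIGCHLD, and each SIGCHLD triggers one poll of every child still being watched. The function name is hypothetical.

```python
def polls_when_scanning_all(n_children):
    """Count waitpid calls when every SIGCHLD polls all remaining children.

    Assumes one SIGCHLD per exit and one child reaped per scan.
    """
    remaining = n_children
    total = 0
    while remaining:
        total += remaining  # one waitpid call per child still being watched
        remaining -= 1      # exactly one child is reaped by this scan
    return total


# Under these assumptions the total is N * (N + 1) / 2.
print(polls_when_scanning_all(10_000))  # 50005000
```

Under the same assumptions, an approach that reaps whichever child exited needs on the order of N calls in total.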

The warning spew is inherent in asyncio's current signal handling design. (BTW, Victor, here's an answer to your question in https://bugs.python.org/issue30050#msg291533 about whether this overflow can happen in real life :-).) It's not *terribly* harmful – losing some SIGCHLDs among thousands doesn't matter. It does mean that you could lose other signals if they happen to arrive at the same time, e.g. SIGINT and SIGTERM are probably ignored while this is happening.
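
To make the mechanism behind that warning a little more concrete, here is a minimal sketch of a signal wakeup fd using only the standard library. It is not asyncio's implementation; the function name demo_wakeup_fd is hypothetical, and it assumes a Unix platform and that it runs in the main thread. Python's C-level signal handler writes one byte (the signal number) to the wakeup fd per signal, and the fd must be non-blocking, so once the buffer is full the write fails with EAGAIN instead of blocking.

```python
import os
import signal
import socket


def demo_wakeup_fd():
    """Deliver one SIGCHLD to ourselves and return the bytes written to the wakeup fd."""
    rsock, wsock = socket.socketpair()
    rsock.settimeout(5.0)      # reader: give up instead of hanging forever
    wsock.setblocking(False)   # writer: set_wakeup_fd requires a non-blocking fd

    # A Python-level handler must be installed, otherwise the C handler
    # that writes to the wakeup fd is never invoked for this signal.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_fd = signal.set_wakeup_fd(wsock.fileno())
    try:
        os.kill(os.getpid(), signal.SIGCHLD)
        return rsock.recv(16)  # one byte per delivered signal
    finally:
        signal.set_wakeup_fd(old_fd)
        if old_handler is not None:
            signal.signal(signal.SIGCHLD, old_handler)
        rsock.close()
        wsock.close()
```

The returned byte is the signal number, which matches the write(5, "\21", 1) in the strace output above if SIGCHLD is 17 on that platform (octal 21).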

For the O(N**2) issue, I think you can work around it by using some incantation involving set_child_watcher and the FastChildWatcher class. These aren't documented (maybe someone should document them!) and I don't know the exact details off the top of my head, but it is a public interface for making your child reaping O(N). (This uses the wait(-1, ...) trick. Unfortunately there are arcane technical limitations with signalfd and si_pid that mean they can't solve this problem. Something like CLONE_FD would help, but that patch was never merged into Linux.)
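
The wait(-1, ...) trick can be sketched as follows. This is only an illustration of the idea, not FastChildWatcher's actual code; the function name reap_all_children is hypothetical, and it assumes a Unix platform and that nothing else in the process is waiting on the same children.

```python
import os


def reap_all_children():
    """Reap every child that has already exited, without naming any PID.

    Returns a dict mapping pid -> exit code (negative signal number if the
    child was killed by a signal).
    """
    reaped = {}
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped[pid] = os.WEXITSTATUS(status)
        else:
            reaped[pid] = -os.WTERMSIG(status)
    return reaped
```

Each call costs one waitpid per exited child plus one final call, regardless of how many children are still running. The trade-off is that it also reaps children started by other code in the same process, which can then no longer collect their exit status.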

You might also consider switching to something like uvloop, which has a more robust event loop implementation underneath. I haven't checked, but I'm pretty sure it'd fix the signal buffer overflow issue, and I doubt they use an O(N**2) child reaping algorithm.

msg339471 - Author: twisteroid ambassador (twisteroid ambassador) Date: 2019-04-05 05:49
The child watchers are documented now, see here: https://docs.python.org/3/library/asyncio-policy.html#process-watchers

Sounds like FastChildWatcher https://docs.python.org/3/library/asyncio-policy.html#asyncio.FastChildWatcher is exactly what you need if you stick with the stock event loop.

History
Date                 User                   Action  Args
2022-04-11 14:58:57  admin                  set     github: 76957
2019-04-05 05:49:55  twisteroid ambassador  set     nosy: + twisteroid ambassador
                                                    messages: + msg339471
2019-04-05 05:40:17  Roman.Evstifeev        set     nosy: + Roman.Evstifeev
2018-02-05 22:26:18  njs                    set     nosy: + vstinner, njs
                                                    messages: + msg311694
2018-02-05 21:22:42  holger+lp              create