Message311692
I intended to use the asyncio framework to build an end-to-end test for our software. The test would spawn between 5k and 10k processes and manage the same number of sockets.
When I built a prototype I ran into scaling issues. Instead of launching our real software, I tested it with calls to sleep 30. At some point the started processes would finish, a SIGCHLD would be delivered to Python, and then it would fail:
Exception ignored when trying to write to the signal wakeup fd:
BlockingIOError: [Errno 11] Resource temporarily unavailable
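For reference, a prototype along these lines might look roughly like the sketch below. This is an assumption about its shape, not the actual test code: the names `run_children` and `main` are hypothetical, the sockets are left out, and `asyncio.run` needs Python 3.7 or later.

```python
import asyncio


async def run_children(count, argv):
    """Start `count` child processes and wait for all of them to exit."""
    # Hypothetical helper: launch the children one after another.
    procs = [await asyncio.create_subprocess_exec(*argv) for _ in range(count)]
    # Wait for every child and collect the exit codes in launch order.
    return await asyncio.gather(*(proc.wait() for proc in procs))


def main(count=5000, argv=("sleep", "30")):
    """Assumed entry point: 5000 children that each run `sleep 30`."""
    return asyncio.run(run_children(count, argv))
```

Calling `main()` with the defaults starts 5000 `sleep 30` processes; a much smaller `count` is enough to check that the script itself works.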
Using strace I saw something like:
send(5, "\0", 1, 0) = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12218, 0xbf8592d8, WNOHANG) = 0
waitpid(12219, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12219
send(5, "\0", 1, 0) = -1 EAGAIN (Resource temporarily unavailable)
waitpid(12220, 0xbf8592d8, WNOHANG) = 0
waitpid(12221, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 12221
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12293, si_uid=1001, si_status=0, si_utime=0, si_stime=0} ---
getpid() = 11832
write(5, "\21", 1) = -1 EAGAIN (Resource temporarily unavailable)
sigreturn({mask=[]}) = 12221
write(2, "Exception ignored when trying to"..., 64) = 64
write(2, "BlockingIOError: [Errno 11] Reso"..., 61) = 61
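The write of "\21" (octal for 17, SIGCHLD on this platform) to fd 5 is the signal wakeup fd mechanism: the C-level signal handler writes the signal number as one byte to a non-blocking descriptor, and when that descriptor's buffer is full the write fails with EAGAIN. A minimal sketch of the mechanism follows, using SIGUSR1 and a socket pair as stand-ins; the function name `demo_wakeup_fd` is hypothetical, and the code has to run in the main thread.

```python
import os
import select
import signal
import socket


def demo_wakeup_fd(signum=signal.SIGUSR1):
    """Send ourselves a signal and return the bytes seen on the wakeup fd."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    writer.setblocking(False)  # the wakeup fd must be non-blocking
    # A Python-level handler must be installed, otherwise the wakeup fd
    # is never written for this signal.
    old_handler = signal.signal(signum, lambda signo, frame: None)
    old_fd = signal.set_wakeup_fd(writer.fileno())
    try:
        os.kill(os.getpid(), signum)
        # Wait up to one second for the wakeup byte to arrive.
        ready, _, _ = select.select([reader], [], [], 1.0)
        return reader.recv(16) if ready else b""
    finally:
        # Restore the previous state before closing the sockets.
        signal.set_wakeup_fd(old_fd)
        if old_handler is not None:
            signal.signal(signum, old_handler)
        reader.close()
        writer.close()
```

If nothing drains the reading side quickly enough, each further signal adds another byte until the buffer is full, which matches the EAGAIN seen above.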
Looking at the code, I see that the si_pid of the signal is ignored and wait(2) is instead called for every process. This doesn't seem to scale well enough for my intended use case.
I think what could be done is one of the following:
* Switch to signalfd for the event notification?
* Take si_pid from the signal and, instead of just notifying that there is work to do, report the PID that exited?
* Use wait(-1,... if there can be only one SIGCHLD handler, to collect any dead child?
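The last option could be sketched as follows. This is only an illustration of the idea, not how asyncio's child watcher is implemented; it assumes a single place in the process owns child reaping. The names `reap_exited_children` and `spawn_and_reap` are hypothetical, and `os.posix_spawn` needs Python 3.8 or later.

```python
import os
import sys
import time


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns a dict mapping PID to the raw wait status.
    """
    reaped = {}
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped[pid] = status
    return reaped


def spawn_and_reap(count, timeout=30.0):
    """Hypothetical driver: start `count` short-lived children, then reap them."""
    argv = [sys.executable, "-c", "pass"]
    pids = {os.posix_spawn(sys.executable, argv, os.environ) for _ in range(count)}
    reaped = {}
    deadline = time.monotonic() + timeout
    # Poll until all of our children are collected or the timeout expires.
    while not pids.issubset(reaped) and time.monotonic() < deadline:
        reaped.update(reap_exited_children())
        time.sleep(0.01)
    return pids, reaped
```

Each call to `reap_exited_children` costs one system call per child that has actually exited, plus one final call, instead of one call per child that was ever started.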
Date | User | Action | Args
2018-02-05 21:22:42 | holger+lp | set | recipients: + holger+lp, asvetlov, yselivanov
2018-02-05 21:22:42 | holger+lp | set | messageid: <1517865762.95.0.467229070634.issue32776@psf.upfronthosting.co.za>
2018-02-05 21:22:42 | holger+lp | link | issue32776 messages
2018-02-05 21:22:42 | holger+lp | create |