asyncio: Creating many subprocess generates lots of internal BlockingIOError #65794
Comments
Using asyncio.create_subprocess_exec generates lots of internal error messages. These messages are: Exception ignored when trying to write to the signal wakeup fd: Whether the messages appear depends on how many subprocesses are active at the same time. On my system (Debian 7, kernel 3.2.0-4-amd64, Python 3.4.1), with 3 or fewer processes at the same time I don't see any problem, but with 4 or more I get a lot of messages. On the other hand, these error messages seem to be innocuous, as no exception seems to be raised. Attached is a test script that shows the problem. It is run as: It requires the du command. Let me know if there are any (conceptual) mistakes in the attached code. |
"Exception ignored when trying to write to the signal wakeup fd" message comes from the signal handler in Modules/signalmodule.c. The problem is that Python gets a lot of SIGCHLD signals (the test scripts creates +300 processes per second on my computer). The producer (signal handler writing the signal number into the "self" pipe) is faster than the consumer (BaseSelectorEventLoop._read_from_self callback). Attached patch should reduce the risk of seeing the message "Exception ignored when trying to write to the signal wakeup fd". The patch reads all pending of the self pipe, instead of just trying to read a signal byte. The test script doesn't write the error message anymore when the patch is applied (the script creates more than 300 processes per second). The patch doesn't solve completly the issue. Other possible enhancements:
|
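A minimal sketch of the producer/consumer mismatch described above. It uses a plain non-blocking socketpair as a stand-in for the event loop's private self pipe; it is not the code in Modules/signalmodule.c, and the helper name fill_until_blocked and its limit parameter are hypothetical. When nothing reads the other end, a writer that sends one byte per "signal" eventually gets BlockingIOError, which is the error the signal handler reports.

```python
import socket


def fill_until_blocked(sock, limit=10_000_000):
    """Send single bytes on a non-blocking socket until its buffer is full.

    Returns the number of bytes accepted before BlockingIOError was raised,
    or `limit` if the buffer never filled (hypothetical safety cap).
    """
    sent = 0
    while sent < limit:
        try:
            sock.send(b"\0")  # one byte per simulated signal
        except BlockingIOError:
            break  # buffer full: the consumer has fallen behind
        sent += 1
    return sent


# Stand-in for the self pipe: both ends non-blocking, nobody reading.
reader, writer = socket.socketpair()
reader.setblocking(False)
writer.setblocking(False)

accepted = fill_until_blocked(writer)
print(f"buffer full after {accepted} unread bytes")

reader.close()
writer.close()
```

The number of bytes accepted depends on the platform's socket buffer size, so the sketch prints it rather than assuming a value.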
BaseProactorEventLoop._loop_self_reading() uses an overlapped read of 4096 bytes. I don't understand how it wakes up the event loop. When the operation is done, _loop_self_reading() is scheduled with call_soon() by the Future object. Is that enough to wake up the event loop? Is BaseProactorEventLoop correct? -- Oh, I forgot to explain this part of asyncio_read_from_self.patch: + data = self._ssock.recv(4096) This break "should never occur". It should only occur if _ssock is no longer non-blocking. But that would be a bug, because this pipe is private and is set to non-blocking at its creation. I chose to add the test because it should not hurt to add it "just in case" (and to avoid an unbounded busy loop). |
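The "read everything that is pending" approach can be sketched as follows. This is not the committed patch: the excerpt quoted above is truncated, so the shape of the loop and the empty-read guard are assumptions, and the function name drain is hypothetical. The socketpair again stands in for the private self pipe.

```python
import socket


def drain(sock, chunk_size=4096):
    """Read all pending bytes from a non-blocking socket; return the count."""
    total = 0
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            break  # nothing left to read: the normal way out of the loop
        if not data:
            # Assumed form of the "just in case" guard: stop on an empty
            # read instead of spinning forever.
            break
        total += len(data)
    return total


reader, writer = socket.socketpair()
reader.setblocking(False)
writer.setblocking(False)

# Simulate 64 signals arriving before the consumer callback runs.
for _ in range(64):
    writer.send(b"\0")

drained = drain(reader)
print(drained)

reader.close()
writer.close()
```

A single call empties the buffer, whereas a callback that reads one byte per wakeup would need 64 loop iterations to catch up.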
Can someone please review asyncio_read_from_self.patch? |
Hum, maybe I need to add a unit test for it. |
asyncio_read_from_self_test.patch: Unit test to check that running the loop once reads all bytes. The unit test is ugly: it calls private methods, and it is completely different on UNIX (selector) and on Windows (proactor). I would prefer to *not* add such a test, and just enhance the code (apply asyncio_read_from_self.patch). |
Can someone please review asyncio_read_from_self.patch? |
New changeset 46c251118799 by Victor Stinner in branch '3.4': New changeset 513eea89b80a by Victor Stinner in branch 'default': |
I committed asyncio_read_from_self.patch into Tulip, Python 3.4 and 3.5. If someone is interested in working on more advanced enhancements, please open a new issue. Oh, by the way, a workaround is to limit the number of concurrent processes. Without the patch, "./python test_subprocess_error.py 5 1000" (max: 5 concurrent processes) emits a lot of "BlockingIOError: [Errno 11] Resource temporarily unavailable" messages. With the patch, I start getting messages with 140 concurrent processes, which is much better :-) IMO more than 100 concurrent processes is crazy, don't do that at home :-) I mean processes with a very short lifetime. The limit is the number of SIGCHLD signals per second, that is, the number of processes which end within the same second. |
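One way to apply the "limit the number of concurrent processes" workaround is an asyncio.Semaphore around create_subprocess_exec. The sketch below is not taken from the attached test script; it uses the modern async/await syntax rather than the Python 3.4 style, the function names run_limited and run_one are hypothetical, and the child command is a placeholder (the Python interpreter running "pass") rather than du.

```python
import asyncio
import sys


async def run_limited(commands, max_concurrent):
    """Run each argv list as a subprocess, at most `max_concurrent` at once.

    Returns the exit codes in the same order as `commands`.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(argv):
        # The semaphore is held from spawn until the child has exited,
        # which bounds the number of live children.
        async with semaphore:
            proc = await asyncio.create_subprocess_exec(
                *argv,
                stdout=asyncio.subprocess.DEVNULL,
            )
            return await proc.wait()

    return await asyncio.gather(*(run_one(argv) for argv in commands))


# Placeholder workload: 10 short-lived children, at most 3 alive at a time.
commands = [[sys.executable, "-c", "pass"]] * 10
exit_codes = asyncio.run(run_limited(commands, max_concurrent=3))
print(exit_codes)
```

Note that this bounds how many children are alive at once, not how many exit per second, so very short-lived children can still produce bursts of SIGCHLD.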