This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: asyncio: Creating many subprocess generates lots of internal BlockingIOError
Type: Stage: resolved
Components: asyncio Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Sebastian.Kreft.Deezer, giampaolo.rodola, gvanrossum, pitrou, python-dev, vstinner, yselivanov
Priority: normal Keywords: patch

Created on 2014-05-28 16:18 by Sebastian.Kreft.Deezer, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test_subprocess_error.py Sebastian.Kreft.Deezer, 2014-05-28 16:18
asyncio_read_from_self.patch vstinner, 2014-05-28 21:13 review
asyncio_read_from_self_test.patch vstinner, 2014-06-06 12:19 review
Messages (9)
msg219288 - (view) Author: Sebastian Kreft (Sebastian.Kreft.Deezer) Date: 2014-05-28 16:18
Using asyncio.create_subprocess_exec generates lots of internal error messages. These messages are:

Exception ignored when trying to write to the signal wakeup fd:
BlockingIOError: [Errno 11] Resource temporarily unavailable

Whether the messages appear depends on how many subprocesses are active at the same time. On my system (Debian 7, kernel 3.2.0-4-amd64, Python 3.4.1), with 3 or fewer processes running at the same time I don't see any problem, but with 4 or more I get lots of messages.

On the other hand, these error messages seem to be innocuous, as no exception seems to be raised.

Attached is a test script that shows the problem.

It is run as:
bin/python3.4 test_subprocess_error.py <MAX_PROCESSES> <ITERATIONS>

It requires the du command to be available.


Let me know if there are any (conceptual) mistakes in the attached code.
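
For reference, a minimal sketch of an equivalent reproducer (the attached test_subprocess_error.py is not reproduced here; the path passed to du is illustrative, and the modern async/await syntax is used for brevity, whereas a 3.4-era script would use @asyncio.coroutine and yield from):

    import asyncio
    import sys

    async def run_du(semaphore):
        # Each task spawns a short-lived "du" subprocess; the semaphore caps
        # how many subprocesses run at the same time (MAX_PROCESSES).
        async with semaphore:
            proc = await asyncio.create_subprocess_exec(
                "du", "-s", "/tmp",
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.DEVNULL)
            await proc.communicate()

    async def main(max_processes, iterations):
        semaphore = asyncio.Semaphore(max_processes)
        await asyncio.gather(*(run_du(semaphore) for _ in range(iterations)))

    if __name__ == "__main__":
        asyncio.run(main(int(sys.argv[1]), int(sys.argv[2])))

Each du process that exits delivers a SIGCHLD, so many short-lived processes per second means many signals per second.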
msg219299 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-28 21:13
"Exception ignored when trying to write to the signal wakeup fd" message comes from the signal handler in Modules/signalmodule.c. The problem is that Python gets a lot of SIGCHLD signals (the test scripts creates +300 processes per second on my computer). The producer (signal handler writing the signal number into the "self" pipe) is faster than the consumer (BaseSelectorEventLoop._read_from_self callback).

The attached patch should reduce the risk of seeing the message "Exception ignored when trying to write to the signal wakeup fd". The patch reads all pending bytes from the self-pipe, instead of trying to read just a single byte.
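
The gist of the change (paraphrased; see asyncio_read_from_self.patch for the exact diff against BaseSelectorEventLoop._read_from_self):

    def _read_from_self(self):
        # Drain every pending byte from the self-pipe in one callback, so the
        # signal handler (the producer) cannot stay ahead of the event loop
        # (the consumer) for long.
        while True:
            try:
                data = self._ssock.recv(4096)
                if not data:
                    break  # "should never occur" (see msg219301)
            except InterruptedError:
                continue   # recv() interrupted by a signal: retry
            except BlockingIOError:
                break      # nothing left to read: the pipe is drained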

With the patch applied, the test script no longer writes the error message (even though the script creates more than 300 processes per second).

The patch doesn't completely solve the issue. Other possible enhancements:

* Add a flag in the signal handler to record that a signal was received, and write only a single byte until the flag is reset to False. This would avoid filling the pipe. It requires implementing a custom signal handler in C, different from the signal handlers of the signal module.

* Give a higher priority to the callbacks of signal handlers. asyncio doesn't support priorities on callbacks right now.

* Increase the size of the pipe. On Linux, it looks like "fcntl(fd, F_SETPIPE_SZ, size);" can be used. The maximum size is given by /proc/sys/fs/pipe-max-size (e.g. 1 MB on my Fedora 20). A small sketch follows below.
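
A sketch of that third idea (assumptions: Linux only; fcntl.F_SETPIPE_SZ is only exposed by the fcntl module in recent Python versions, so the raw constant 1031 is used as a fallback):

    import fcntl
    import os

    # F_SETPIPE_SZ is Linux-specific; fall back to the raw fcntl constant
    # (1031) when the fcntl module does not expose it.
    F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)

    r, w = os.pipe()
    # Ask the kernel for a 1 MB pipe buffer; the effective maximum is the
    # value in /proc/sys/fs/pipe-max-size.
    fcntl.fcntl(w, F_SETPIPE_SZ, 1024 * 1024)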
msg219301 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-28 21:20
BaseProactorEventLoop._loop_self_reading() uses an overlapped read of 4096 bytes. I don't understand how it wakes up the event loop. When the operation is done, _loop_self_reading() is scheduled with call_soon() by the Future object. Is it enough to wake up the event loop?

Is BaseProactorEventLoop correct?

--

Oh, I forgot to explain this part of asyncio_read_from_self.patch:

+                data = self._ssock.recv(4096)
+                if not data:
+                    break

This break "should never occur". It should only occur if _ssock is no more blocking. But it would be a bug, because this pipe is private and set to non-blocking at its creation.

I chose to add the check because it should not hurt to have it "just in case" (and it avoids an unbounded busy loop).
msg219872 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-06 11:44
Can someone please review asyncio_read_from_self.patch?
msg219873 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-06 11:44
Hum, maybe I need to add a unit test for it.
msg219875 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-06 12:19
asyncio_read_from_self_test.patch: a unit test checking that running the loop once reads all pending bytes. The unit test is ugly: it calls private methods, and it is completely different on UNIX (selector) and on Windows (proactor).

I would prefer *not* to add such a test, and to just enhance the code (apply asyncio_read_from_self.patch).
msg220154 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-10 09:13
Can someone please review asyncio_read_from_self.patch?
msg220965 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-06-19 11:00
New changeset 46c251118799 by Victor Stinner in branch '3.4':
Closes #21595: asyncio.BaseSelectorEventLoop._read_from_self() now reads all
http://hg.python.org/cpython/rev/46c251118799

New changeset 513eea89b80a by Victor Stinner in branch 'default':
(Merge 3.4) Closes #21595: asyncio.BaseSelectorEventLoop._read_from_self() now
http://hg.python.org/cpython/rev/513eea89b80a
msg220966 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-19 11:10
I committed asyncio_read_from_self.patch into Tulip, Python 3.4 and 3.5. If someone is interested in working on a more advanced enhancement, please open a new issue.

Oh, by the way, a workaround is to limit the number of concurrent processes.

Without the patch, "./python test_subprocess_error.py 5 1000" (max: 5 concurrent processes) emits a lot of "BlockingIOError: [Errno 11] Resource temporarily unavailable" messages.

With the patch, I only start getting messages at 140 concurrent processes, which is much better :-) IMO more than 100 concurrent processes is crazy, don't do that at home :-) I mean processes with a very short lifetime. The limit is the number of SIGCHLD signals per second, i.e. the number of processes which end within the same second.
History
Date User Action Args
2022-04-11 14:58:04  admin  set  github: 65794
2014-06-19 11:10:17  vstinner  set  messages: + msg220966
2014-06-19 11:00:16  python-dev  set  status: open -> closed
                                      nosy: + python-dev
                                      messages: + msg220965
                                      resolution: fixed
                                      stage: resolved
2014-06-10 09:13:51  vstinner  set  messages: + msg220154
2014-06-06 12:19:07  vstinner  set  files: + asyncio_read_from_self_test.patch
                                      messages: + msg219875
2014-06-06 11:44:51  vstinner  set  messages: + msg219873
2014-06-06 11:44:35  vstinner  set  messages: + msg219872
2014-06-06 11:44:00  vstinner  set  components: + asyncio
                                      title: Creating many subprocess generates lots of internal BlockingIOError -> asyncio: Creating many subprocess generates lots of internal BlockingIOError
2014-05-28 21:20:32  vstinner  set  messages: + msg219301
2014-05-28 21:13:35  vstinner  set  files: + asyncio_read_from_self.patch
                                      nosy: + gvanrossum, pitrou, vstinner, giampaolo.rodola, yselivanov
                                      messages: + msg219299
                                      keywords: + patch
2014-05-28 16:18:25  Sebastian.Kreft.Deezer  create