Issue 35238: Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/79419

classification

Title:	Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver
Type:	enhancement	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.8, Python 3.7, Python 3.6, Python 3.4, Python 3.5, Python 2.7

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	4-launchpad-kalvdans-no-ip-org, oesteban, pitrou, vstinner
Priority:	normal	Keywords:

Created on 2018-11-13 21:29 by oesteban, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (8)
msg329868 - (view)	Author: Oscar Esteban (oesteban) *	Date: 2018-11-13 21:29
## Context

We are developers of nipype (https://github.com/nipype), a workflow engine for neuroimaging software. We are experiencing the problems that gave rise to the addition of ``os.posix_spawn`` to Python 3.8, and particularly this one: https://bugs.python.org/issue20104#msg222570

Our software runs command line subprocesses that can be quite memory-hungry and, in some cases, number on the order of tens of thousands of processes. As a result, we frequently see the OOM killer terminating some of the processes.

## Status

We have successfully leveraged the ``forkserver`` context of multiprocessing (together with a low `maxtasksperchild`) to ease the load. However, the fork_exec memory allocation is still problematic on systems that do not allow overcommitting virtual memory. Waiting for os.posix_spawn to be rolled out might not be an option for us, as the problem is hitting us badly right now.

## Proposed solution

I'd like to page experts on Lib/multiprocessing and Lib/subprocess to give their opinions about the following: is it possible to write an extension to `multiprocessing.util.Popen` such that it has the API of `subprocess.Popen` but the fork happens via the forkserver?

My naive intuition is that we would need to create a new type of Process, make sure that it then calls os.exec*e() (possibly around here: https://github.com/python/cpython/blob/f966e5397ed8f5c42c185223fc9b4d750a678d02/Lib/multiprocessing/popen_forkserver.py#L51), and finally handle communication with the subprocess.

Please let me know if that is even possible.
msg332020 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-12-17 20:50
I'm not sure I understand the proposed solution. Do you mean you would replace this:

  Parent -> forkserver -> fork child then exec

with:

  Parent -> forkserver -> posix_spawn child?
msg332021 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-12-17 20:57
At any rate, given the constraints you're working with (thousands of child processes, memory conservation issues), I suggest you abandon the idea of using multiprocessing and write your own subprocess-server instead. I would suggest doing so using asyncio, which should allow you to control as many subprocesses as you want without spawning countless threads or intermediate processes: https://docs.python.org/3/library/asyncio-subprocess.html
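A minimal sketch of the asyncio approach suggested above, assuming Python 3.7+ and that a cap on concurrently running children is wanted. The names `run_command`, `run_all`, `main` and the `max_concurrent` parameter are hypothetical and not part of asyncio. Note that this only addresses controlling many subprocesses from a single thread; the children are still created from whichever process runs the event loop, so any memory benefit depends on running this loop in a small process.

```python
import asyncio
import sys


async def run_command(argv, limit):
    """Run one command and return (returncode, stdout, stderr)."""
    # The semaphore caps how many children are alive at the same time.
    async with limit:
        proc = await asyncio.create_subprocess_exec(
            *argv,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, stderr = await proc.communicate()
        return proc.returncode, stdout, stderr


async def run_all(commands, max_concurrent=4):
    """Run every command, with at most max_concurrent running at once."""
    limit = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the input commands.
    return await asyncio.gather(
        *(run_command(argv, limit) for argv in commands)
    )


def main(commands, max_concurrent=4):
    """Synchronous entry point (hypothetical helper for this sketch)."""
    return asyncio.run(run_all(commands, max_concurrent))
```

For example, `main([[sys.executable, "-c", "print(1)"]])` returns a list with one `(returncode, stdout, stderr)` tuple, where the output streams are bytes.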
msg332023 - (view)	Author: Oscar Esteban (oesteban) *	Date: 2018-12-17 21:12
Thanks for your response.

The idea would be to enable ``subprocess.Popen`` to use an existing fork server in its fork_exec. The rationale: I can start a pool of n workers very early in the execution flow. Each will have a ~350MB memory footprint at the start and will be reset to that every ``maxtasksperchild`` tasks. So this is basically the amount of VM allocated (doubled) when forking. Pretty small.

Currently, the fork is done from a process with the whole Python stack of the app loaded in memory (1.7GB in our case), so an additional 1.7GB of VM is allocated on each fork. This could be avoided if the fork was done from the forkserver pool.

As you mention, we have been considering such a "shell" server on top of asyncio, so your response just confirms our intuition. I'll close this idea for now, since I agree that any investment in this problem should be directed to the asyncio solution.

Please note that the proposed idea would work for Python < 3 (as opposed to anything based on asyncio).
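The rationale above (fork from a small process that was started early, rather than from the large application process) can be illustrated with a helper process that launches commands on request. This is only a sketch under stated assumptions: the `SpawnServer` class, the `_HELPER_SOURCE` string and the one-JSON-object-per-line protocol are hypothetical and are not part of `subprocess` or `multiprocessing`. It requires Python 3.5+ because the helper uses `subprocess.run`.

```python
import json
import subprocess
import sys

# Source of the helper process (hypothetical): read one JSON-encoded argv
# per line, run it, and answer with one JSON-encoded result per line.
_HELPER_SOURCE = r"""
import json, subprocess, sys
for line in sys.stdin:
    argv = json.loads(line)
    proc = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,   # keep children away from the request pipe
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    reply = {
        "returncode": proc.returncode,
        "stdout": proc.stdout.decode("utf-8", "replace"),
        "stderr": proc.stderr.decode("utf-8", "replace"),
    }
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class SpawnServer:
    """Launch commands from a small helper instead of the current process."""

    def __init__(self):
        # Start the helper as early as possible, while this process is small.
        self._helper = subprocess.Popen(
            [sys.executable, "-c", _HELPER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )

    def run(self, argv):
        """Ask the helper to run argv; return a dict with the result."""
        self._helper.stdin.write(json.dumps(list(argv)) + "\n")
        self._helper.stdin.flush()
        return json.loads(self._helper.stdout.readline())

    def close(self):
        """Closing stdin ends the helper's read loop, then reap it."""
        self._helper.stdin.close()
        self._helper.wait()
```

The one fork from the application process happens when `SpawnServer()` is constructed, so it should be created before the application grows; every later command is forked from the helper, whose footprint stays small. The sketch handles one request at a time and buffers all output in memory, which would need rethinking for long-running or very verbose commands.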
msg332024 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-12-17 21:16
By the way, you could open an issue so that subprocess uses posix_spawn() where possible (or you could ask to reopen issue31814, which is basically that request but for a different reason than yours).
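For reference, a minimal sketch of calling `os.posix_spawn` directly, assuming Python 3.8+ on a POSIX platform. The wrapper name `spawn_and_wait` is hypothetical; it is not how `subprocess` uses posix_spawn internally, and whether the call avoids the memory reservation discussed in this issue depends on the platform's C library.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv with os.posix_spawn and return its exit code.

    argv[0] must be a path to the executable, since os.posix_spawn does
    not search PATH (os.posix_spawnp does).
    """
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Mirror the subprocess convention: negative value for death by signal.
    return -os.WTERMSIG(status)
```

For example, `spawn_and_wait([sys.executable, "-c", "pass"])` starts a child interpreter and returns its exit code.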
msg332028 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-17 21:43
> By the way you could open an issue so that subprocess uses posix_spawn() where possible.

FYI I'm working on an implementation of this ;-)
msg332030 - (view)	Author: Oscar Esteban (oesteban) *	Date: 2018-12-17 21:56
Hi Victor,

That would be great. However, we played a bit with an alternative implementation of posix_spawn (one I got from a related bpo issue), and it didn't seem to make any difference in terms of memory allocation. Then we found out that posix_spawn uses fork by default (Linux implementation), so the large memory allocations still happen. One can set the vfork option, but from what we have read that is apparently a very bad idea. Is that correct?
msg332033 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-12-18 00:47
See bpo-34663 for posix_spawn() & vfork.

History
Date	User	Action	Args
2022-04-11 14:59:08	admin	set	github: 79419
2018-12-18 00:47:08	vstinner	set	messages: + msg332033
2018-12-17 21:56:47	oesteban	set	messages: + msg332030
2018-12-17 21:43:16	vstinner	set	nosy: + vstinner messages: + msg332028
2018-12-17 21:16:15	pitrou	set	messages: + msg332024
2018-12-17 21:12:24	oesteban	set	status: open -> closed resolution: not a bug messages: + msg332023 stage: resolved
2018-12-17 20:57:15	pitrou	set	messages: + msg332021
2018-12-17 20:50:15	pitrou	set	nosy: + pitrou messages: + msg332020
2018-12-16 06:00:48	4-launchpad-kalvdans-no-ip-org	set	nosy: + 4-launchpad-kalvdans-no-ip-org
2018-11-13 21:29:55	oesteban	create