classification
Title: Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: 4-launchpad-kalvdans-no-ip-org, oesteban, pitrou, vstinner
Priority: normal Keywords:

Created on 2018-11-13 21:29 by oesteban, last changed 2018-12-18 00:47 by vstinner. This issue is now closed.

Messages (8)
msg329868 - Author: Oscar Esteban (oesteban) * Date: 2018-11-13 21:29
## Context

We are the developers of nipype (https://github.com/nipype), a workflow engine for neuroimaging software. We are running into the same problems that motivated the addition of ``os.posix_spawn`` in Python 3.8, in particular the one described in https://bugs.python.org/issue20104#msg222570

Our software runs command-line subprocesses that can be quite memory-hungry and that, in some cases, number in the tens of thousands. As a result, we frequently see the OOM killer terminate some of these processes.

## Status

We have successfully leveraged the ``forkserver`` context of multiprocessing (together with a low `maxtasksperchild`) to ease the load. However, the memory reserved by fork_exec is still problematic on systems that do not allow overcommitting virtual memory. Waiting for os.posix_spawn to be rolled out might not be an option for us, as the problem is hitting us badly right now.
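
For illustration, the setup described above might look like the following minimal sketch. The names `run_command`, `make_pool`, and `EXAMPLE_COMMAND` are hypothetical and not taken from nipype; only `multiprocessing.get_context`, `Pool`, and `subprocess.run` are standard-library calls.

```python
import multiprocessing
import subprocess
import sys

# Hypothetical stand-in for a real, memory-hungry command-line tool.
EXAMPLE_COMMAND = [sys.executable, "-c", "pass"]


def run_command(argv):
    """Run one command line inside a pool worker and return its exit code."""
    return subprocess.run(argv).returncode


def make_pool(n_workers, maxtasksperchild=1):
    """Create a pool whose workers are started through the fork server."""
    ctx = multiprocessing.get_context("forkserver")  # POSIX only
    # A low maxtasksperchild replaces each worker after a few tasks,
    # so a worker's memory cannot keep growing over its lifetime.
    return ctx.Pool(processes=n_workers, maxtasksperchild=maxtasksperchild)


# Usage, from a script guarded by ``if __name__ == "__main__":``:
#     with make_pool(4) as pool:
#         exit_codes = pool.map(run_command, [EXAMPLE_COMMAND] * 8)
```

Note that the `subprocess.run` call in each worker still forks from that worker, so the worker's own footprint is what gets reserved at fork time.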

## Proposed solution

I'd like to page experts on Lib/multiprocessing and Lib/subprocess to give their opinions about the following: is it possible to write an extension to `multiprocessing.util.Popen` such that it has the API of `subprocess.Popen` but the fork happens via the forkserver?

My naive intuition is that we would need to create a new type of Process, make sure that it then calls os.exec*e() (possibly around here: https://github.com/python/cpython/blob/f966e5397ed8f5c42c185223fc9b4d750a678d02/Lib/multiprocessing/popen_forkserver.py#L51), and finally handle communication with the subprocess.

Please let me know if that is even possible.
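
To make the idea more concrete, here is a rough sketch of an approximation that needs no changes to subprocess: a small helper process, started through the fork server, that runs commands on behalf of the large parent. This is not the proposed `Popen` extension itself. `command_server` and `start_server` are hypothetical names, and the protocol (send an argv list, receive an exit code, send `None` to stop) is an assumption made for the example.

```python
import multiprocessing
import subprocess


def command_server(conn):
    """Run each argv list received on conn; reply with its exit code.

    Receiving None shuts the server down.
    """
    while True:
        argv = conn.recv()
        if argv is None:
            break
        conn.send(subprocess.run(argv).returncode)
    conn.close()


def start_server():
    """Start command_server in a process created via the fork server."""
    ctx = multiprocessing.get_context("forkserver")  # POSIX only
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=command_server, args=(child_conn,))
    proc.start()
    child_conn.close()  # the parent keeps only its own end of the pipe
    return proc, parent_conn


# Usage, from a script guarded by ``if __name__ == "__main__":``:
#     proc, conn = start_server()   # call early, before the app grows
#     conn.send(["echo", "hello"])
#     exit_code = conn.recv()
#     conn.send(None)
#     proc.join()
```

The fork for each command then happens in the helper rather than in the large parent. Forwarding stdin/stdout/stderr the way `subprocess.Popen` does is left out of this sketch.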
msg332020 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-17 20:50
I'm not sure I understand the proposed solution. Do you mean you would replace this:

  Parent -> forkserver -> fork child then exec

with:

  Parent -> forkserver -> posix_spawn child?
msg332021 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-17 20:57
At any rate, given the constraints you're working with (thousands of child processes, memory conservation issues), I suggest you abandon the idea of using multiprocessing and write your own subprocess-server instead.

I would suggest doing so using asyncio, which should allow you to control as many subprocesses as you want without spawning countless threads or intermediate processes:

https://docs.python.org/3/library/asyncio-subprocess.html
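
A minimal sketch of what controlling many subprocesses from a single asyncio event loop could look like. The helper names `run_one` and `run_all` and the `limit` parameter are assumptions made for this example; `asyncio.create_subprocess_exec`, `asyncio.Semaphore`, and `asyncio.gather` are the documented standard-library calls.

```python
import asyncio


async def run_one(argv, semaphore):
    """Run one command and return its exit code, holding a slot meanwhile."""
    async with semaphore:
        proc = await asyncio.create_subprocess_exec(*argv)
        return await proc.wait()


async def run_all(commands, limit):
    """Run every command, with at most `limit` subprocesses alive at once."""
    semaphore = asyncio.Semaphore(limit)  # created inside the running loop
    tasks = (run_one(argv, semaphore) for argv in commands)
    return await asyncio.gather(*tasks)  # exit codes, in input order


# Usage:
#     exit_codes = asyncio.run(run_all([["echo", "a"], ["echo", "b"]], limit=2))
```

The event loop only removes the need for one thread or intermediate process per child. Each child is still forked from the process running the loop, so that process would have to be a small one for the fork to stay cheap.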
msg332023 - Author: Oscar Esteban (oesteban) * Date: 2018-12-17 21:12
Thanks for your response.

The idea would be to enable ``subprocess.Popen`` to use an existing fork server in its fork_exec.

The rationale: I can start a pool of n workers very early in the execution flow. Each worker has a memory footprint of ~350MB at the beginning and is reset to that every ``maxtasksperchild`` tasks. So this is basically the amount of VM allocated (doubled) when forking. Pretty small.

Currently, since the fork is done from a process with the whole Python stack of the app loaded in memory (1.7GB in our case), an additional 1.7GB of VM is allocated on each fork. This could be avoided if the fork were done from the forkserver pool.
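
On Linux, the two quantities involved here (the overcommit policy and the virtual size of the process that forks) can be inspected with a sketch like the one below. The function names are hypothetical. `/proc/sys/vm/overcommit_memory` and the `VmSize` field of `/proc/self/status` are standard Linux interfaces, but `VmSize` is only a rough proxy for what the kernel actually has to reserve at fork time.

```python
def parse_vm_size_kb(status_text):
    """Return the VmSize value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSize:"):
            return int(line.split()[1])
    return None


def current_vm_size_kb():
    """Return this process's VmSize in kB, or None where /proc is missing."""
    try:
        with open("/proc/self/status") as handle:
            return parse_vm_size_kb(handle.read())
    except OSError:
        return None


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return 0 (heuristic), 1 (always) or 2 (never), or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


# Usage: compare the value in a freshly started worker with the value
# in the fully loaded application process.
#     print(read_overcommit_mode(), current_vm_size_kb())
```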

As you mention, we have been considering such a "shell" server on top of asyncio, so your response just confirms our intuition.

I'll close this issue for now, since I agree that any investment in this problem should be directed to the asyncio solution.

Please note that the idea proposed would work for Python < 3 (as opposed to anything based on asyncio).
msg332024 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-17 21:16
By the way you could open an issue so that subprocess uses posix_spawn() where possible.

(or you could ask to reopen issue31814, which is basically that request but for a different reason than yours)
msg332028 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-17 21:43
> By the way you could open an issue so that subprocess uses posix_spawn() where possible.

FYI I'm working on an implementation of this ;-)
msg332030 - Author: Oscar Esteban (oesteban) * Date: 2018-12-17 21:56
Hi Victor,

That would be great. However, we played a bit with an alternative implementation of posix_spawn (one I got from a related bpo issue), and it didn't seem to make any difference in terms of memory allocation.

Then we found out that posix_spawn uses fork by default (in the Linux implementation), so the large memory allocations still happen. One can set the vfork option, but that is apparently a very bad idea, from what we have read.

Is that correct?
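
For reference, a minimal sketch of calling `os.posix_spawn` directly, as exposed by Python 3.8 and later on platforms that provide it. The wrapper name `spawn_and_wait` is hypothetical. Whether the C library implements the call with fork, vfork, or something else depends on the platform and libc version, and this sketch does not control that.

```python
import os


def spawn_and_wait(argv, env=None):
    """Start argv[0] with os.posix_spawn, wait for it, return its exit code."""
    env = dict(os.environ) if env is None else env
    # os.posix_spawn does not search PATH, so argv[0] must be a usable path.
    pid = os.posix_spawn(argv[0], argv, env)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: mirror subprocess and return the negative number.
    return -os.WTERMSIG(status)


# Usage:
#     exit_code = spawn_and_wait(["/bin/echo", "hello"])
```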
msg332033 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-18 00:47
See bpo-34663 for posix_spawn() & vfork.
History
Date User Action Args
2018-12-18 00:47:08 vstinner set messages: + msg332033
2018-12-17 21:56:47 oesteban set messages: + msg332030
2018-12-17 21:43:16 vstinner set nosy: + vstinner; messages: + msg332028
2018-12-17 21:16:15 pitrou set messages: + msg332024
2018-12-17 21:12:24 oesteban set status: open -> closed; resolution: not a bug; messages: + msg332023; stage: resolved
2018-12-17 20:57:15 pitrou set messages: + msg332021
2018-12-17 20:50:15 pitrou set nosy: + pitrou; messages: + msg332020
2018-12-16 06:00:48 4-launchpad-kalvdans-no-ip-org set nosy: + 4-launchpad-kalvdans-no-ip-org
2018-11-13 21:29:55 oesteban create