classification
Title: multiprocessing's "spawn" doesn't actually use spawn
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: jakirkham, vstinner
Priority: normal
Keywords:

Created on 2022-01-13 19:20 by jakirkham, last changed 2022-01-15 00:10 by vstinner.

Messages (2)
msg410512 - Author: (jakirkham) Date: 2022-01-13 19:20
Reporting an issue recently encountered by a colleague.

It appears that `multiprocessing`'s "spawn" mode doesn't actually use POSIX spawn, but instead uses fork+exec[1]. While this is certainly a useful feature in its own right, it is not quite what one would expect from something described as spawn. AFAICT the documentation doesn't point this out.

This is important because some libraries are not fork-safe, and even fork+exec is not sufficient to protect them. It would be helpful if "spawn" did use POSIX spawn and the current behavior were covered under a clearer name (like "forkexec").


Ref:
1. https://github.com/python/cpython/blob/af6b4068859a5d0c8afd696f3c0c0155660211a4/Lib/multiprocessing/util.py#L448-L458
msg410612 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-15 00:10
> It appears that `multiprocessing`'s "spawn" mode doesn't actually use POSIX spawn, but instead uses fork+exec[1].

The documentation doesn't claim to use posix_spawn(). It only says: "starts a fresh python interpreter process".
https://docs.python.org/dev/library/multiprocessing.html#contexts-and-start-methods

I suggest closing the issue as "not a bug". I don't see anything wrong with the current documentation.

--

posix_spawn() is a function of the C library. It is implemented as fork+exec on most operating systems. macOS is the only platform I'm aware of that has a dedicated syscall. Still, a posix_spawn() implementation is usually faster thanks to some optimizations.

Python has had os.posix_spawn() since Python 3.8.

The subprocess module can use os.posix_spawn() on Linux under some conditions:
https://docs.python.org/dev/whatsnew/3.8.html#optimizations

Sadly, it's not used by default, since close_fds=True remains the subprocess.Popen() default.
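
A minimal sketch of a call shaped to meet the conditions listed in the What's New entry linked above: close_fds=False, an executable path that contains a directory, and no preexec_fn, pass_fds, cwd or start_new_session. Whether subprocess actually takes the posix_spawn() path is an internal detail that depends on the platform and C library version, so treat this as an assumption rather than a guarantee; the helper name `run_python_snippet` is hypothetical.

```python
import subprocess
import sys


def run_python_snippet(code):
    """Run a Python snippet in a child interpreter and return its stdout.

    Hypothetical helper; the arguments are chosen so that subprocess is
    *allowed* to use os.posix_spawn() internally where it supports that.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],  # sys.executable is normally a full path
        close_fds=False,               # the non-default setting discussed above
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_python_snippet("print('hello from the child')"), end="")
```

Note that close_fds=False lets the child inherit any inheritable file descriptors open in the parent, which is one reason it is not the default.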

I'm open to using it on more platforms. os.posix_spawn() can only be used if it properly reports errors to the parent process, among other requirements and known bugs. It's a complex function!

--

Oh, about multiprocessing. Well, someone has to propose a patch! I don't know why multiprocessing uses _posixsubprocess.fork_exec() directly rather than the subprocess module. It's also a complex module with many specific constraints.

posix_spawn() looks nice, but it cannot be used in many cases :-(
History
Date                 User         Action  Args
2022-01-15 00:10:58  vstinner     set     messages: + msg410612
2022-01-14 18:44:43  iritkatriel  set     nosy: + vstinner; versions: - Python 3.7, Python 3.8
2022-01-13 19:20:59  jakirkham    create