Author itamarst
Recipients itamarst
Date 2020-04-24.18:22:22
Message-id <1587752543.41.0.119768714545.issue40379@roundup.psfhosted.org>
In-reply-to
Content
By default, multiprocessing uses fork() without exec() on POSIX. For a variety of reasons this can leave subprocesses in an inconsistent state: module-level globals are copied, which can break logging; threads don't survive fork(); and so on.

The results vary, but quite often the outcome is a silent lockup.

In real-world usage, this means users get mysterious hangs that they do not have the knowledge to debug.

The fix for these users is to make "spawn" the default, as it already is on Windows.
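For anyone hitting this today, here is a minimal sketch of opting in to "spawn" explicitly on any platform. The names `square` and `run_with_spawn` are made up for illustration; `multiprocessing.get_context()` is the standard-library call that returns a context bound to one start method.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical worker. It is defined at module level so that a
    # "spawn" child, which starts a fresh interpreter and re-imports
    # this module, can find it by name.
    return x * x


def run_with_spawn(values):
    # get_context() returns a context bound to the "spawn" start method
    # without changing the process-wide default.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard is required with "spawn": children re-import this module,
    # and unguarded code would try to start a pool again in each child.
    print(run_with_spawn([1, 2, 3]))
```

Calling `multiprocessing.set_start_method("spawn")` once, inside the `if __name__ == "__main__":` block, is the alternative when the whole program should use "spawn". The context object is the less intrusive choice for library code, since it does not affect other users of multiprocessing in the same process.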

Just a small sample:

1. Today I talked to a scientist who spent two weeks stuck, until she found my article on the subject (https://codewithoutrules.com/2018/09/04/python-multiprocessing/). Basically multiprocessing locked up, doing nothing forever. Switching to "spawn" fixed it.
2. https://github.com/dask/dask/issues/3759#issuecomment-476743555 is someone who had issues fixed by "spawn".
3. https://github.com/numpy/numpy/issues/15973 is a NumPy issue which apparently impacted scikit-learn.


I suggest changing the default on POSIX to match Windows.
History
Date                 User      Action  Args
2020-04-24 18:22:23  itamarst  set     recipients: + itamarst
2020-04-24 18:22:23  itamarst  set     messageid: <1587752543.41.0.119768714545.issue40379@roundup.psfhosted.org>
2020-04-24 18:22:23  itamarst  link    issue40379 messages
2020-04-24 18:22:22  itamarst  create