classification
Title: multiprocessing needs option to eschew fork() under Linux
Type: enhancement Stage: needs patch
Components: Library (Lib) Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: asksol, brandon-rhodes, cool-RR, dholth, jnoller, mrmekon, ned.deily, neologix, numbernine, pitrou, rcoyner, sbt, vsekhar
Priority: normal Keywords: patch

Created on 2010-05-14 16:13 by brandon-rhodes, last changed 2012-01-24 07:58 by neologix.

Files
File name Uploaded Description Edit
mp_fork_exec.patch sbt, 2011-09-13 13:50
mp_fork_exec.patch sbt, 2012-01-23 14:36 review
mp_split_tests.patch sbt, 2012-01-23 14:42
Messages (19)
msg105719 - (view) Author: Brandon Craig Rhodes (brandon-rhodes) * Date: 2010-05-14 16:13
The "multiprocessing" module uses a bare fork() to create child processes under Linux, so the children get a copy of the entire state of the parent process.  But under Windows, child processes are freshly spun-up Python interpreters with none of the data structures or open connections of the parent process available.  This means that code that tests fine under Linux, because it is depending on residual parent state in a way that the programmer has not noticed, can fail spectacularly under Windows.

Therefore, the "multiprocessing" module should offer an option under Linux that ignores the advantage of being able to do a bare fork() and instead spins up a new interpreter instance just like Windows does.  Some developers will just use this for testing under Linux, so their test results are valid for Windows too; and some developers might even use this in production, preferring to give up a bit of efficiency under Linux in return for an application that will show the same behavior on both platforms.  Either way, an option that lets the developer subvert the simple "sys.platform != 'win32'" check in "forking.py" would go a long way towards helping us write platform-agnostic Python programs.
msg105724 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-05-14 17:15
This is on my wish list; but I have not had time to do it. Patch welcome.
msg105730 - (view) Author: Brandon Craig Rhodes (brandon-rhodes) * Date: 2010-05-14 18:05
Jesse, it's great to learn it's on your wish list too!

Should I design the patch so that (a) there is some global in the module that needs tweaking to choose the child creation technique, or (b) that an argument to the Process() constructor forces a full interpreter exec to make all platforms match, or (c) that a process object once created has an attribute (like ".daemon") that you set before starting it off?  Or (d) should there be a subclass of Process that, if specifically used, has the fork/exec behavior instead of just doing the fork?

My vote would probably be for (b), but you have a much better feel for the library and its style than I do.
msg105734 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-05-14 18:27
I pretty much agree with (b) an argument - your gut instinct is correct - there's a long standing thread in python-dev which pretty much solidified my thinking about whether or not we need this (we do). 

Any patch has to be backwards compatible by the way, it can not alter the current default behavior, also, it has to target python3 as 2.7 is nearing final, and this is a behavioral change.
msg105936 - (view) Author: Ram Rachum (cool-RR) Date: 2010-05-17 20:18
+1 for this issue; I've also wished for this feature in the past.
msg126748 - (view) Author: Daniel Holth (dholth) Date: 2011-01-21 15:31
+1
msg142915 - (view) Author: Ask Solem (asksol) (Python committer) Date: 2011-08-24 20:30
I have suspected that this may be necessary, not just merely useful, for some time, and issue6721 seems to verify that.  In addition to adding the keyword arg to Process, it should also be added to Pool and Manager.

Is anyone working on a patch? If not I will work on a patch asap.
msg142918 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-08-24 20:41
No one is currently working on a patch AFAIK
msg143966 - (view) Author: (sbt) Date: 2011-09-13 13:06
Here is a patch which adds the following functions:

  forking_disable()
  forking_enable()
  forking_is_enabled()
  set_semaphore_prefix()
  get_semaphore_prefix()

To create child processes using fork+exec on Unix, call
forking_disable() at the beginning of the program.

I have tested the patch on Linux (by adding forking_disable() to
test_multiprocessing), and it seems to work.  However, the patch does
not modify test_multiprocessing, and I am not sure of the best way to
do so.  (See below.)

There are some issues with named semaphores.  When forking is
disabled, the name of the semaphore must be left unlinked so that
child processes can use sem_open() on the name.  The patch therefore
delays unlinking the name (only when forking is disabled) until the
original SemLock object is garbage collected or the process which
created it exits.

But if a process is killed without exiting cleanly then the name may
be left unlinked.  This happens, for instance, if I run
test_multiprocessing and then keep hitting ^C until all the processes
exit.  On Linux this leaves files with names like

  /dev/shm/sem.mp-fa012c80-4019-2

which represent leaked semaphores.  These won't be destroyed until the
computer reboots or the semaphores are manually removed (by using
sem_unlink() or by unlinking the entry from the file system).

If some form of this patch is accepted, then the problem of leaked
semaphores needs to be addressed, otherwise the buildbots are likely
run out of named semaphores.  But I am not sure how best to do this in
a platform agnostic way.  (Maybe a forked process could collect names
of all semaphores created, via a pipe.  Then it could try to
sem_unlink() all those names when all write-ends of the pipe are
closed.)
msg143969 - (view) Author: (sbt) Date: 2011-09-13 13:50
Small fix to patch.
msg149996 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-21 15:04
Thanks for the patch sbt.
I think this is indeed useful, but I'm tempted to go further and say we should make this the default - and only - behavior. This will probably break existing code that accidentaly relied the fact that the implementation uses a bare fork(), but i'd say it's worth it:
- it's cleaner
- it will make it possible to remove all the ad-hoc handlers called after fork()
- it will remove the only place in the whole stdlib where fork() isn't followed by exec(): people who get bitten by issue #6721 will thus only be people calling explicitely fork(), in which case they're the sole responsibles for their misery ;-)

Another - although less common - advantage over the current implementation is that now one can run out of memory pretty easily if the operating system doesn't do overcommitting if you work with a large dataset. If fork() is followed by an exec, no problem.

Thoughts?
msg149998 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-21 15:28
> Thanks for the patch sbt.
> I think this is indeed useful, but I'm tempted to go further and say
> we should make this the default - and only - behavior. This will
> probably break existing code that accidentaly relied the fact that the
> implementation uses a bare fork(),

There is probably lots of such code:
- code that passes non-pickleable function object / function args to
execute in the child process
- code that executes code with side effects at module top level
- code that relies (willingly or not) on other stuff such as fds being
inherited
msg150001 - (view) Author: (sbt) Date: 2011-12-21 15:57
> I think this is indeed useful, but I'm tempted to go further and say we 
> should make this the default - and only - behavior. This will probably 
> break existing code that accidentaly relied the fact that the 
> implementation uses a bare fork(), but i'd say it's worth it:

I'm not convinced about making it the default behaviour, and certainly not the only one.

I have a working patch which ensures that leaked semaphores get cleaned up on exit.  However, I think to add proper tests for the patch, test_multiprocessing needs to be refactored.  Maybe we could end up with

multiprocessing_common.py
test_multiprocessing_processes_fork.py
test_multiprocessing_processes_nofork.py
test_multiprocessing_manager_fork.py
test_multiprocessing_manager_nofork.py
test_multiprocessing_threads.py
test_multiprocessing_others.py

The actual unittests would be in multiprocessing_common.py and test_multiprocessing_others.py.  The other files would run the unittests in multiprocessing_common.py using different configurations.

Thoughts?
msg150004 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-21 16:05
> There is probably lots of such code:

> I'm not convinced about making it the default behaviour, and 
> certainly not the only one.

Then I'm not convinced that this patch is useful.
Having three different implentations and code paths doesn't sound like a good idea to me.
fork() vs fork() + exec() is an implementation detail, and exposing such tweakables to the user will only make confusion worse.
msg150030 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-12-21 18:26
On Wednesday, December 21, 2011 at 10:04 AM, Charles-François Natali wrote:

While I would tend to agree with you in theory - I don't think we should make it the default - at least not without a LOT of lead time. There's a surprising amount of code relying on the current behavior that I think the best course is to enable this option, and change the docs to steer users in this direction.

For users jumping from 2.x into 3.x, I think the less surprises they have the better, and changing the default behavior of the stdlib module in this was would qualify as surprising.
msg150582 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2012-01-04 03:37
See also consolidated Issue13558 for additional justification for processes option on OS X.
msg151818 - (view) Author: (sbt) Date: 2012-01-23 14:36
Attached is an updated version of the mp_fork_exec.patch.  This one is able to reliably clean up any unlinked semaphores if the program exits abnormally.
msg151819 - (view) Author: (sbt) Date: 2012-01-23 14:42
mp_split_tests.patch splits up the test_multiprocessing.py:

test_multiprocessing_misc.py
  miscellaneous tests which need not be run with multiple configurations

mp_common.py
  testcases which should be run with multiple configurations

test_multiprocessing_fork.py
test_multiprocessing_nofork.py
test_multiprocessing_manager_fork.py
test_multiprocessing_manager_nofork.py
test_multiprocessing_threads.py
  run the testcases in mp_common.py with various configurations
msg151882 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-01-24 07:58
I don't know what the others think, but I'm still -1 on this patch.
Not that I don't like the principle - it's actually the contrary: in a
perfect world, I think this should be made the default -and only -
behavior on POSIX. But since it may break existing code, we'd have to
keep both implementations for POSIX systems, with - at least to me -
little benefit.
Having three different implementations, with different codepaths, will
increase the cognitive burden, make the code less readable, and
debugging harder:
- user: I'm getting this error with multiprocessing
- dev: On Windows or on Unix?
- user: On Unix
- dev: Do you use the fork()+exec() version or the bare fork() version?
- user: what's fork() and exec()?
History
Date User Action Args
2012-01-24 07:58:55neologixsetmessages: + msg151882
2012-01-23 17:35:34santa4ntsetnosy: - santa4nt
2012-01-23 14:42:57sbtsetfiles: + mp_split_tests.patch

messages: + msg151819
2012-01-23 14:36:07sbtsetfiles: + mp_fork_exec.patch

messages: + msg151818
2012-01-04 03:37:47ned.deilysetnosy: + ned.deily, mrmekon
messages: + msg150582
2012-01-04 03:34:27ned.deilylinkissue13558 superseder
2011-12-21 18:26:29jnollersetmessages: + msg150030
2011-12-21 16:05:50neologixsetmessages: + msg150004
2011-12-21 15:57:24sbtsetmessages: + msg150001
2011-12-21 15:28:03pitrousetmessages: + msg149998
2011-12-21 15:04:28neologixsetnosy: + pitrou, neologix
messages: + msg149996
2011-09-13 13:50:29sbtsetfiles: + mp_fork_exec.patch

messages: + msg143969
2011-09-13 13:48:57sbtsetfiles: - mp_fork_exec.patch
2011-09-13 13:06:48sbtsetfiles: + mp_fork_exec.patch

nosy: + sbt
messages: + msg143966

keywords: + patch
2011-08-24 20:41:42jnollersetmessages: + msg142918
2011-08-24 20:30:27asksolsetmessages: + msg142915
2011-08-20 15:02:40vsekharsetnosy: + vsekhar
2011-08-10 17:46:12numberninesetnosy: + numbernine
2011-01-21 15:43:35pitrousetnosy: + asksol
stage: needs patch

versions: + Python 3.3, - Python 3.2
2011-01-21 15:31:15dholthsetnosy: + dholth
messages: + msg126748
2010-08-07 18:09:37terry.reedysetversions: - Python 2.6, Python 3.1, Python 2.7, Python 3.3
2010-05-28 15:57:02rcoynersetnosy: + rcoyner
2010-05-17 20:18:26cool-RRsetnosy: + cool-RR
messages: + msg105936
2010-05-15 04:49:25santa4ntsetnosy: + santa4nt
2010-05-14 18:27:41jnollersetmessages: + msg105734
2010-05-14 18:05:46brandon-rhodessetmessages: + msg105730
2010-05-14 17:15:28jnollersetmessages: + msg105724
2010-05-14 17:01:17r.david.murraysetnosy: + jnoller
2010-05-14 16:13:09brandon-rhodescreate