classification
Title: Clarify how to share multiprocessing primitives
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.6

process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: davin, docs@python, max
Priority: normal
Keywords:

Created on 2017-03-11 19:25 by max, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (5)
msg289454 - Author: Max (max) Date: 2017-03-11 19:25
It seems that both I and many other people (judging from SO questions) are confused about whether it's OK to write this:

from multiprocessing import Process, Queue
q = Queue()

def f():
    q.put([42, None, 'hello'])

def main():
    p = Process(target=f)
    p.start()
    print(q.get())    # prints "[42, None, 'hello']"
    p.join()

if __name__ == '__main__':
    main()

It's not OK: it doesn't work on Windows, presumably because the connection between the global queues in the two processes is lost when the queue is pickled. It works on Linux because, I guess, fork keeps more information than pickle does, so the connection is maintained.

I thought it would be good to clarify in the docs that all the Queue() and Manager().* and other similar objects should be passed as parameters, not just defined as globals.
msg289456 - Author: Davin Potts (davin) (Python committer) Date: 2017-03-11 20:46
On Windows, because that OS does not support fork, multiprocessing uses spawn to create new processes by default.  Note that in Python 3, multiprocessing provides the user with a choice of how to create new processes (i.e. fork, spawn, forkserver).

When fork is used, the 'q = Queue()' in this example is executed once by the parent process before the fork takes place. The resulting child process continues execution from the same point as the parent when it triggered the fork, and thus both parent and child processes see the same multiprocessing.Queue. When spawn is used, a new process is spawned and the whole of this example script (apart from the code under the `if __name__ == '__main__'` guard) is executed again from scratch by the child process, so the child (spawned) process creates a new Queue object of its own with no connection to the parent's.
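
The difference can be observed with a minimal sketch like the following. It rests on some assumptions: the names `CREATED_IN_PID`, `report`, and `child_sees_parent_global` are hypothetical, and a module-level PID stands in for the global `q = Queue()`. The child sends back the global as it sees it. With fork that should be the parent's PID; with spawn the module-level assignment runs again in the child, so it records the child's own PID.

```python
import multiprocessing as mp
import os

# Module-level state, standing in for the global `q = Queue()` in the report.
# It records which process executed this line.
CREATED_IN_PID = os.getpid()


def report(q):
    # Send back the global exactly as this (child) process sees it.
    q.put(CREATED_IN_PID)


def child_sees_parent_global(start_method=None):
    # start_method=None selects the platform's default start method.
    ctx = mp.get_context(start_method)
    q = ctx.Queue()  # passed explicitly, so it reaches the child either way
    p = ctx.Process(target=report, args=(q,))
    p.start()
    seen = q.get(timeout=30)  # raise queue.Empty rather than block forever
    p.join()
    return seen == os.getpid()


if __name__ == '__main__':
    method = mp.get_start_method()
    print(method, child_sees_parent_global(method))
```

Calling `child_sees_parent_global("fork")` and `child_sees_parent_global("spawn")` on a platform that offers both is one way to compare the two start methods side by side.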


Would you be up for proposing replacement text to improve the documentation?  Getting the documentation just right so that everyone understands it is worth spending time on.
msg289459 - Author: Max (max) Date: 2017-03-12 00:46
How about inserting this text somewhere:

Note that sharing and synchronization objects (such as `Queue()`, `Pipe()`, `Manager()`, `Lock()`, `Semaphore()`) should be made available to a new process by passing them as arguments to the `target` function invoked by the `run()` method. Making these objects visible through global variables will only work when the process was started using `fork` (and as such sacrifices portability for no special benefit).
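
For illustration, the example from the first message could be rearranged along these lines so that the queue is passed as an argument. This is a sketch rather than proposed documentation text: the `start_method` parameter and the `timeout` value are additions for experimenting and are not part of the original example.

```python
import multiprocessing as mp


def f(q):
    # The queue arrives as an argument instead of being looked up as a global.
    q.put([42, None, 'hello'])


def main(start_method=None):
    # start_method=None selects the platform's default start method.
    ctx = mp.get_context(start_method)
    q = ctx.Queue()
    p = ctx.Process(target=f, args=(q,))
    p.start()
    item = q.get(timeout=30)  # raise queue.Empty rather than block forever
    p.join()
    return item


if __name__ == '__main__':
    print(main())
```

Because the queue is handed to the child as part of starting the process, the same code is expected to behave identically under fork, spawn, and forkserver.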
msg289505 - Author: Max (max) Date: 2017-03-12 19:22
Somewhat related is this statement from Programming Guidelines:

> When using the spawn or forkserver start methods many types from multiprocessing need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

Since on Windows even "inheritance" is really the same pickle + pipe mechanism executed inside CPython, I assume the entire paragraph is intended for UNIX platforms only (might be worth clarifying, btw).
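
One way to see the restriction the guideline refers to is to pickle a queue outside of process start-up. The sketch below assumes current CPython behaviour, where `multiprocessing.Queue` refuses to be pickled unless a child process is being started; the helper name `plain_pickle_error` is hypothetical.

```python
import multiprocessing as mp
import pickle


def plain_pickle_error():
    q = mp.Queue()
    try:
        # This pickling is not part of starting a child process,
        # so the queue is expected to refuse it.
        pickle.dumps(q)
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == '__main__':
    print(plain_pickle_error())
```

The error message mentions inheritance, which matches the wording of the quoted paragraph: the object is pickled only as part of handing it to a new child, whether it arrives through `args` or, under fork, through memory copied from the parent.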

On Linux, "inheritance" works faster, and can deal with more complex objects compared to pickle with pipe/queue -- but it's equally true whether it's inheritance through global variables or through arguments to the target function. There's no reason 

So the text I proposed earlier wouldn't conflict with this one. It would just encourage programmers to use function arguments instead of global variables, because it doesn't matter on Linux but makes the code portable to Windows.
msg289522 - Author: Max (max) Date: 2017-03-13 02:25
Actually, never mind, I think one of the paragraphs in the Programming Guidelines ("Explicitly pass resources to child processes") basically explains everything already. I just didn't notice it until @noxdafox pointed it out to me on SO.

Close please.
History
Date                 User   Action  Args
2022-04-11 14:58:44  admin  set     github: 73981
2017-03-13 03:42:17  davin  set     status: open -> closed; resolution: works for me; stage: needs patch -> resolved
2017-03-13 02:25:52  max    set     messages: + msg289522
2017-03-12 19:22:21  max    set     messages: + msg289505
2017-03-12 00:46:36  max    set     messages: + msg289459
2017-03-11 20:46:23  davin  set     nosy: + davin; messages: + msg289456; type: behavior -> enhancement; stage: needs patch
2017-03-11 19:25:49  max    create