classification
Title: Pool methods can only be used by parent process.
Type: Stage: resolved
Components: Documentation Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: abn, docs@python, jnoller, python-dev, sbt, terry.reedy
Priority: normal Keywords:

Created on 2013-02-22 06:20 by abn, last changed 2013-07-02 11:45 by sbt. This issue is now closed.

Files
File name Uploaded Description
pool_forking.py abn, 2013-02-22 06:20 Example script highlighting the issue
Messages (8)
msg182647 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-22 06:20
The task/worker handler threads in the multiprocessing.pool.Pool class are (in accordance with POSIX standards) not copied over when the process containing the pool is forked.

This leads to a situation where the Pool keeps receiving tasks but the tasks never get handled. This could potentially lead to deadlocks if AsyncResult.wait() is called.

I am not sure whether this should be considered a bug or an invalid use case. However, it becomes a problem when an imported module uses a pool and the main code uses multiprocessing too.

[BAD] Workaround:
Reassigning Pool._task_handler to a new instance of threading.Thread after the fork seems to work in the case highlighted in the example.

Environment:
Fedora 18 
Linux 3.7.8-202.fc18.x86_64 #1 SMP Fri Feb 15 17:33:07 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
python3-3.3.0-1.fc18.x86_64

An example of this issue is shown below:

from multiprocessing import Pool, Process

def t2():
	# We expect the pool to handle this
	print('t2: Hello!')

pool = Pool()
def t1():
	# We assign a task to the pool
	pool.apply_async(t2)
	print('t1: Hello!')

if __name__ == '__main__':
	# Process() forks the main process containing the pool
	Process(target=t1).start()
msg182656 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-22 07:35
I should have mentioned this too:

[GOOD] Workaround:
Probably the 'correct' way to achieve what is required in the example is to use a managed pool.

pool = multiprocessing.Manager().Pool()
msg182659 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-22 09:50
A pool should only be used by the process that created it (unless you use a managed pool).

If you are creating long-lived processes then you could create a new pool on demand. For example (untested):

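    import os
    from multiprocessing import Pool

    # (pool, pid of the process that created it)
    pool_pid = (None, None)

    def get_pool():
        global pool_pid
        # After a fork the pid differs, so create a fresh pool for this process.
        if os.getpid() != pool_pid[1]:
            pool_pid = (Pool(), os.getpid())
        return pool_pid[0]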
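
The same pool-per-process idea can also be wrapped in a small helper instead of a module-level global. This is only a sketch: the class name `PerProcessPool` and its `factory` parameter are made up for illustration and are not part of multiprocessing. The factory defaults to `multiprocessing.Pool` and is injectable so the caching logic can be exercised without starting worker processes.

```python
import os
from multiprocessing import Pool


class PerProcessPool:
    """Hypothetical helper: lazily creates one pool per process.

    The pool is remembered together with the pid of the process that
    created it. If get() is called from a different process (for
    example, a forked child), a new pool is created instead of reusing
    the inherited one, whose handler threads do not exist in the child.
    """

    def __init__(self, factory=Pool):
        # 'factory' is an assumption of this sketch; any zero-argument
        # callable that returns a pool-like object will do.
        self._factory = factory
        self._pool = None
        self._owner_pid = None

    def get(self):
        pid = os.getpid()
        if self._owner_pid != pid:
            # First call in this process: build a fresh pool.
            self._pool = self._factory()
            self._owner_pid = pid
        return self._pool
```

A module would then keep a single `PerProcessPool()` instance and call `.get().apply_async(...)` wherever it previously used a module-level pool directly. As with the function above, the pool inherited from the parent is simply abandoned in the child rather than repaired.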
msg182697 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-02-22 20:23
Arun, to call this a bug, you need to demonstrate a conflict between behavior and doc, and I do not see that you have.

Richard, are you suggesting that we close this, or do you see an actionable issue? (a plausible patch to the repository?)
msg182701 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-22 21:45
> Richard, are you suggesting that we close this, or do you see an 
> actionable issue? (a plausible patch to the repository?)

I skimmed the documentation and could not see that this restriction has been documented.

So I think a documentation patch would be a good idea.
msg182707 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-02-23 01:43
Arun, can you suggest a sentence to add and where to add it?
msg182852 - (view) Author: Arun Babu Neelicattu (abn) * Date: 2013-02-24 04:24
Terry, I think the best place to make a note of this would be at [1,2].

As for what should be noted, something along the lines of what Richard mentioned should suffice.

"A pool should only be used by the process that created it (unless you use a managed pool)."

I am not certain what the best way to phrase this would be, but it would also be helpful to note that this will cause unexpected behavior if a script imports a module that uses a Pool and then forks (i.e., uses Process() or another Pool()). This is how I bumped into this issue.

Hope this helps.

[1] http://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers
[2] http://docs.python.org/3/library/multiprocessing.html#using-a-pool-of-workers
msg192187 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-02 11:42
New changeset 389788ba6bcb by Richard Oudkerk in branch '2.7':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/389788ba6bcb

New changeset 57fe80fda9be by Richard Oudkerk in branch '3.3':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/57fe80fda9be

New changeset 7ccf3d36ad13 by Richard Oudkerk in branch 'default':
Issue #17273: Clarify that pool methods can only be used by parent process.
http://hg.python.org/cpython/rev/7ccf3d36ad13
History
Date User Action Args
2013-07-02 11:45:39 sbt set status: open -> closed
versions: + Python 2.7, Python 3.4
type: behavior ->
title: multiprocessing.pool.Pool task/worker handlers are not fork safe -> Pool methods can only be used by parent process.
resolution: fixed
stage: resolved
2013-07-02 11:42:32 python-dev set nosy: + python-dev
messages: + msg192187
2013-02-24 04:24:56 abn set messages: + msg182852
2013-02-23 01:43:34 terry.reedy set nosy: + docs@python
messages: + msg182707
assignee: docs@python
components: + Documentation, - Library (Lib)
2013-02-22 21:45:09 sbt set messages: + msg182701
2013-02-22 20:23:20 terry.reedy set nosy: + terry.reedy
messages: + msg182697
2013-02-22 09:50:12 sbt set messages: + msg182659
2013-02-22 07:35:46 abn set messages: + msg182656
2013-02-22 07:03:11 abn set nosy: + jnoller, sbt
2013-02-22 06:20:53 abn create