Message 143174 - Python tracker


Author	sbt
Recipients	Giovanni.Bajo, avian, bobbyi, gregory.p.smith, neologix, nirai, pitrou, sbt, sdaoden, vstinner
Date	2011-08-29.19:04:00
Message-id	<1314644641.94.0.400299242844.issue6721@psf.upfronthosting.co.za>

Content

multiprocessing.util already has register_after_fork() which it uses for cleaning up certain things when a new process (launched by multiprocessing) is starting. This is very similar to the proposed atfork mechanism.
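As a rough illustration of how register_after_fork() is used, here is a minimal sketch. The ForkAwareCache class and its _reset method are hypothetical names invented for this example; only multiprocessing.util.register_after_fork(obj, func) is the real call, and the callback is only expected to run in child processes launched by multiprocessing.

```python
from multiprocessing import util


class ForkAwareCache:
    """Hypothetical object whose state should not leak into child processes."""

    def __init__(self):
        self.entries = {}
        # Ask multiprocessing to call ForkAwareCache._reset(self) in any
        # child process it starts. The registry holds the object weakly,
        # so registering does not keep the cache alive.
        util.register_after_fork(self, ForkAwareCache._reset)

    def _reset(self):
        # Runs in the child: discard state inherited from the parent.
        self.entries.clear()
```

The callback is passed as a plain function taking the object as its only argument, which is the calling convention register_after_fork() expects.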

Multiprocessing assumes that it is always safe to delete lock objects. If reinit_locks.diff is committed then I guess this won't be a problem.

I will try to go through multiprocessing's use of threads:

Queue
-----

Queues have a feeder thread which pushes objects into the underlying pipe as soon as possible. The only state which this thread can modify is a threading.Condition object and a collections.deque buffer. Both of these are replaced by fresh copies by the after-fork mechanism.

However, because objects in the buffer may have __del__ methods or weakref callbacks associated, arbitrary code may be run by the background thread if the reference count falls to zero.

Simply pickling the argument of put() before adding it to the buffer fixes that problem -- see the patch for Issue 10886. With this patch I think Queue's use of threads is fork-safe.
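The idea of pickling inside put() can be sketched as follows. This is not the patch from Issue 10886, just a simplified stand-in: PicklingBuffer and pop_bytes are hypothetical names, and the real Queue writes the bytes to a pipe from its feeder thread rather than returning them.

```python
import collections
import pickle
import threading


class PicklingBuffer:
    """Hypothetical stand-in for the buffer inside a multiprocessing Queue."""

    def __init__(self):
        self._notempty = threading.Condition(threading.Lock())
        self._buffer = collections.deque()

    def put(self, obj):
        # Serialize in the calling thread. The buffer then holds only bytes,
        # so the background thread never drops the last reference to a user
        # object and never triggers __del__ or weakref callbacks.
        data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        with self._notempty:
            self._buffer.append(data)
            self._notempty.notify()

    def pop_bytes(self):
        # What a feeder thread would do before writing to the pipe.
        with self._notempty:
            while not self._buffer:
                self._notempty.wait()
            return self._buffer.popleft()
```

A side effect of this design is that pickling errors surface in the caller of put() instead of in the background thread.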

Pool
----

If a fork occurs while a pool is running then a forked process will get a copy of the pool object in an inconsistent state -- but that does not matter since trying to use a pool from a forked process will *never* work.

Also, some of a pool's methods support callbacks which can execute arbitrary code in a background thread. This can create inconsistent state in a forked process.

As with Queue.put, pool methods should pickle immediately for similar reasons.

I would suggest documenting clearly that a pool should only ever be used or deleted by the process which created it. We can use register_after_fork to make all of a pool's methods raise an error after a fork.
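One way the register_after_fork suggestion might look, as a sketch only: GuardedPool, submit and _mark_forked are hypothetical names, not part of multiprocessing.Pool, and the "work" is run inline just to keep the example self-contained.

```python
from multiprocessing import util


class GuardedPool:
    """Hypothetical pool-like object that refuses to work after a fork."""

    def __init__(self):
        self._forked = False
        # In a child process started by multiprocessing, flip the flag so
        # that every public method raises instead of using stale state.
        util.register_after_fork(self, GuardedPool._mark_forked)

    def _mark_forked(self):
        self._forked = True

    def _check_owner(self):
        if self._forked:
            raise RuntimeError(
                "pool objects can only be used by the process that created them"
            )

    def submit(self, func, args=()):
        self._check_owner()
        # Placeholder for handing the task to worker processes.
        return func(*args)
```

Every public method would start with the same _check_owner() call, so a forked copy fails loudly rather than hanging or corrupting shared state.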

We should also document that callbacks should only be used if no more processes will be forked.

allow_connection_pickling
-------------------------

Currently multiprocessing.allow_connection_pickling() does not work because types are registered with ForkingPickler instead of copyreg -- see Issue 4892. However, the code in multiprocessing.reduction uses a background thread to support the transfer of sockets/connections between processes.

If this code is ever resurrected I think the use of register_after_fork makes this safe.

Managers
--------

A manager uses a threaded server process. This is not a problem unless you create a user-defined manager which forks new processes. The documentation should just say "Don't Do That".

I think multiprocessing's threading issues are all fixable.

History
Date	User	Action	Args
2011-08-29 19:04:02	sbt	set	recipients: + sbt, gregory.p.smith, pitrou, vstinner, nirai, bobbyi, neologix, Giovanni.Bajo, sdaoden, avian
2011-08-29 19:04:01	sbt	set	messageid: <1314644641.94.0.400299242844.issue6721@psf.upfronthosting.co.za>
2011-08-29 19:04:01	sbt	link	issue6721 messages
2011-08-29 19:04:00	sbt	create