Issue16500
Created on 2012-11-18 15:20 by christian.heimes, last changed 2013-01-14 16:06 by pitrou.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| pure-python-atfork.patch | sbt, 2012-11-19 22:43 | review | ||
| Messages (19) | |||
|---|---|---|---|
| msg175878 - (view) | Author: Christian Heimes (christian.heimes) * ![]() |
Date: 2012-11-18 15:20 | |
I propose the addition of an 'afterfork' module. The module shall fulfill a similar task to the 'atexit' module, except that it handles process forks instead of process shutdown.
The 'afterfork' module shall allow libraries to register callbacks that are executed on fork() inside the child process, as soon as possible. Python already has a function that must be called by C code: PyOS_AfterFork(). The 'afterfork' callbacks are called as the last step in PyOS_AfterFork().
Use case example: The tempfile module has a specialized RNG that re-initializes the RNG after fork() by comparing os.getpid() to an instance variable every time the RNG is accessed. The check can be replaced with an afterfork callback.
Open questions: How should the afterfork module handle exceptions that are raised by callbacks?
Implementation: I'm going to use as much code from atexitmodule.c as possible. I'm going to copy common code to a template file and include the template from atexitmodule.c and afterforkmodule.c with some preprocessor tricks. |
|||
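A minimal sketch of the PID check described in the use case above, for readers who have not seen the pattern. The class name `PidCheckedRNG` and the injectable `getpid` parameter are hypothetical and exist only so the behaviour can be shown without a real fork; this is not the tempfile implementation itself.

```python
import os
import random


class PidCheckedRNG:
    """Hypothetical helper: lazily rebuilds its RNG when the PID changes."""

    def __init__(self, getpid=os.getpid):
        # getpid is injectable only so the demo below can fake a fork.
        self._getpid = getpid
        self._pid = None
        self._rng = None

    @property
    def rng(self):
        pid = self._getpid()
        if pid != self._pid:
            # First access, or first access after a fork: start a fresh
            # generator seeded by the operating system.
            self._rng = random.Random()
            self._pid = pid
        return self._rng


# Simulate a fork by changing the value the PID getter reports.
fake_pid = [100]
seq = PidCheckedRNG(getpid=lambda: fake_pid[0])
before = seq.rng
same = seq.rng       # same PID: the generator is reused
fake_pid[0] = 101    # pretend we are now in a forked child
after = seq.rng      # PID changed: a new generator is created
```

The cost of this approach is one `getpid()` comparison per access; an after-fork callback would move that work to the moment of the fork instead.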
| msg175892 - (view) | Author: Richard Oudkerk (sbt) * ![]() |
Date: 2012-11-18 17:34 | |
pthread_atfork() allows the registering of three types of callbacks:
1) prepare callbacks which are called before the fork,
2) parent callbacks which are called in the parent after the fork,
3) child callbacks which are called in the child after the fork.
I think all three should be supported.
I also think that a recursive "fork lock" should be introduced which is held during the fork. This can be acquired around critical sections during which forks must not occur.
This is more or less a duplicate of #6923. See also #6721. |
|||
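The recursive "fork lock" idea in the message above could look roughly like the following. This is only a sketch under assumptions: the names `fork_lock`, `guarded_fork` and `probe_from_other_thread` are hypothetical, and a stand-in function replaces `os.fork()` so the example runs without creating a process.

```python
import os
import threading

# Hypothetical module-level lock. It is recursive so that a thread already
# inside a critical section may itself fork.
fork_lock = threading.RLock()


def guarded_fork(fork=os.fork):
    """Call fork() while holding fork_lock."""
    with fork_lock:
        return fork()


def probe_from_other_thread():
    """Report whether a different thread could take fork_lock right now."""
    result = []

    def try_acquire():
        got = fork_lock.acquire(blocking=False)
        if got:
            fork_lock.release()
        result.append(got)

    worker = threading.Thread(target=try_acquire)
    worker.start()
    worker.join()
    return result[0]


observed = []


def fake_fork():
    # Stand-in for os.fork(): records whether other threads are locked out.
    observed.append(probe_from_other_thread())
    return 4321


with fork_lock:                    # critical section in the current thread
    pid = guarded_fork(fake_fork)  # re-entrant acquire succeeds
```

Code that must not be interrupted by a fork in another thread would wrap itself in `with fork_lock:`, which blocks until any fork in progress has finished.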
| msg175967 - (view) | Author: Christian Heimes (christian.heimes) * ![]() |
Date: 2012-11-19 20:55 | |
Thanks Richard!
My first reaction was YAGNI, but after I read the two tickets I now understand the need for three different hooks. I suggest that we implement our own hooks like the http://linux.die.net/man/3/pthread_atfork function, especially the order of function calls:
The parent and child fork handlers shall be called in the order in which they were established by calls to pthread_atfork(). The prepare fork handlers shall be called in the opposite order.
I'd like to focus on three hooks + the Python API and leave the usage of the hooks to other developers.
Proposal:
* Introduce a new module called atfork (Modules/atforkmodule.c) that is built into the core.
* Move PyOS_AfterFork to Modules/atforkmodule.c.
* Add PyOS_BeforeFork() (or PyOS_PrepareFork()?) and PyOS_AfterForkParent().
* Call the two new methods around the calls to fork() in the stdlib.
I'm not yet sure how to implement the Python API. I could either implement six methods:
atfork.register_before_fork(callable, *args, **kwargs)
atfork.register_after_fork_child(callable, *args, **kwargs)
atfork.register_after_fork_parent(callable, *args, **kwargs)
atfork.unregister_before_fork(callable)
atfork.unregister_after_fork_child(callable)
atfork.unregister_after_fork_parent(callable)
or two:
atfork.register(prepare=None, parent=None, child=None, *args, **kwargs)
atfork.unregister(prepare=None, parent=None, child=None) |
|||
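To make the two-function variant and the calling order above concrete, here is a small pure-Python sketch. It is an illustration only: the module-level lists and the `before_fork`, `after_fork_parent` and `after_fork_child` helpers are hypothetical stand-ins for the proposed C functions, and the `*args, **kwargs` forwarding from the proposal is left out for brevity.

```python
_prepare_hooks = []
_parent_hooks = []
_child_hooks = []


def register(prepare=None, parent=None, child=None):
    """Register up to three callables; None means 'no hook of this kind'."""
    for hooks, func in ((_prepare_hooks, prepare),
                        (_parent_hooks, parent),
                        (_child_hooks, child)):
        if func is not None:
            hooks.append(func)


def unregister(prepare=None, parent=None, child=None):
    """Remove previously registered callables, ignoring unknown ones."""
    for hooks, func in ((_prepare_hooks, prepare),
                        (_parent_hooks, parent),
                        (_child_hooks, child)):
        if func is not None and func in hooks:
            hooks.remove(func)


def before_fork():
    # Prepare hooks run in the opposite order of registration.
    for hook in reversed(list(_prepare_hooks)):
        hook()


def after_fork_parent():
    # Parent hooks run in registration order.
    for hook in list(_parent_hooks):
        hook()


def after_fork_child():
    # Child hooks run in registration order.
    for hook in list(_child_hooks):
        hook()


log = []
register(prepare=lambda: log.append("prepare1"),
         parent=lambda: log.append("parent1"))
register(prepare=lambda: log.append("prepare2"),
         parent=lambda: log.append("parent2"))
before_fork()
after_fork_parent()
```

After the two calls at the end, `log` holds the prepare hooks in reverse order followed by the parent hooks in registration order, which mirrors the pthread_atfork() rule quoted above.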
| msg175972 - (view) | Author: Richard Oudkerk (sbt) * ![]() |
Date: 2012-11-19 22:43 | |
Note that Gregory P. Smith has written
http://code.google.com/p/python-atfork/
I also started a pure python patch but did not get round to posting it. (It also implements the fork lock idea.) I'll attach it here.
How do you intend to handle the propagation of exceptions? I decided that after
atfork.atfork(prepare1, parent1, child1)
atfork.atfork(prepare2, parent2, child2)
...
atfork.atfork(prepareN, parentN, childN)
calling "pid = os.fork()" should be equivalent to
pid = None
prepareN()
try:
    ...
        prepare2()
        try:
            prepare1()
            try:
                pid = posix.fork()
            finally:
                parent1() if pid != 0 else child1()
        finally:
            parent2() if pid != 0 else child2()
    ...
finally:
    parentN() if pid != 0 else childN()
|
|||
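The nested try/finally equivalence above can be written for an arbitrary number of handlers with a small recursive helper. This is a sketch of the semantics as described in the message, not the code in pure-python-atfork.patch; `fork_with_hooks` and the injectable `fork` argument are hypothetical names, and the demo uses fake fork functions so no process is created.

```python
import os


def fork_with_hooks(handlers, fork=os.fork):
    """handlers: list of (prepare, parent, child) in registration order."""
    pid = None

    def run(i):
        nonlocal pid
        if i < 0:
            pid = fork()
            return
        prepare, parent, child = handlers[i]
        prepare()  # if this raises, handlers[i] gets no parent/child call
        try:
            run(i - 1)
        finally:
            # pid stays None when the fork never happened or failed, so the
            # parent hook runs in that case, as in the pseudocode.
            if pid != 0:
                parent()
            else:
                child()

    run(len(handlers) - 1)
    return pid


log = []


def hook(name):
    return lambda: log.append(name)


handlers = [
    (hook("prepare1"), hook("parent1"), hook("child1")),
    (hook("prepare2"), hook("parent2"), hook("child2")),
]

pid = fork_with_hooks(handlers, fork=lambda: 1234)  # pretend to be the parent
parent_log = list(log)

log.clear()
fork_with_hooks(handlers, fork=lambda: 0)           # pretend to be the child
child_log = list(log)
```

With this structure an exception raised by a prepare hook propagates to the caller, and only the handlers whose prepare hook already succeeded get their parent hook called.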
| msg175973 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2012-11-19 23:59 | |
I would not allow exceptions to propagate. No caller is expecting them. |
|||
| msg175974 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2012-11-20 00:13 | |
pthread_atfork() cannot be used to implement this. Another non-Python thread started by a C extension module, or by the C application that is embedding Python, is always free to call fork() on its own with zero knowledge that Python even exists at all. It's guaranteed that fork will be called while the Python GIL is held in this situation, which would cause any pre-fork thing registered by Python to deadlock.
At best, this can be implemented manually as we do with some of the before and after fork stuff today, but it must come with the caveat that it cannot guarantee that these things are actually called before and after fork() other than for direct os.fork() calls from Python code or from extremely Python-aware C extension modules that may call fork() (very rare; most C & C++ libraries an extension module may be using assume that they've got the run of the house).
i.e. this problem is unsolvable unless you control 100% of the code being used by your entire user application. |
|||
| msg175975 - (view) | Author: Christian Heimes (christian.heimes) * ![]() |
Date: 2012-11-20 00:52 | |
Meh! Exception handling takes all the fun out of the API and is going to make it MUCH more complicated. pthread_atfork() ignores error handling for a good reason. It's going to be hard to get it right. :/
IFF we are going to walk the hard and rocky road of exception handling, then we are going to need at least four hooks and a register function that takes four callables as arguments: register(prepare, error, parent, child). Each prepare() call pushes an error handler onto a stack. In case of an exception in a prepare handler, the error stack is popped until all error handlers are called. This approach allows a prepare handler to actually prevent a fork() call from succeeding.
The parent and child hooks are always called no matter what. Exceptions are recorded and a warning is emitted when at least one hook fails. We might raise an exception, but it has to be a special exception that carries information on whether fork() has succeeded, whether the code runs in the child or the parent, and the child's PID.
I fear it's going to be *really* hard to get everything right.
Gregory made a good point, too. We can't rely on pthread_atfork() as we are unable to predict how third party code is using fork(): "Take cover, dead locks ahead!" :) A cooperative design of the C API with three functions is my preferred way, too. PyOS_AfterForkParent() should take an argument to signal a failed fork() call. |
|||
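One possible reading of the four-hook scheme above, as a sketch. Several details are assumptions that the message leaves open: the error handler is pushed only after its prepare hook succeeded, the original exception is re-raised after the error stack is unwound, and a failed fork() is not handled at all. The name `fork_with_error_hooks` is hypothetical and the demo passes a fake fork function.

```python
import warnings


def fork_with_error_hooks(handlers, fork):
    """handlers: (prepare, error, parent, child) tuples, registration order."""
    error_stack = []
    for prepare, error, _parent, _child in reversed(handlers):
        try:
            prepare()
        except Exception:
            # Unwind: notify every handler whose prepare() already succeeded,
            # most recent first, then give up without forking.
            while error_stack:
                error_stack.pop()()
            raise
        error_stack.append(error)

    pid = fork()

    # After the fork, every parent (or child) hook runs; failures are only
    # recorded and reported through a warning.
    failures = []
    for _prepare, _error, parent, child in handlers:
        hook = child if pid == 0 else parent
        try:
            hook()
        except Exception as exc:
            failures.append(exc)
    if failures:
        warnings.warn(f"{len(failures)} fork hook(s) failed", RuntimeWarning)
    return pid


log = []


def note(name):
    return lambda: log.append(name)


def failing_prepare():
    log.append("prepare_a")
    raise RuntimeError("not ready to fork")


handlers = [
    (failing_prepare, note("error_a"), note("parent_a"), note("child_a")),
    (note("prepare_b"), note("error_b"), note("parent_b"), note("child_b")),
]

try:
    fork_with_error_hooks(handlers, fork=note("fork"))
except RuntimeError:
    vetoed = True
else:
    vetoed = False
```

In the demo the second handler prepares successfully, the first one fails, the second handler's error hook is called, and the fake fork is never reached.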
| msg175980 - (view) | Author: Amaury Forgeot d'Arc (Amaury.Forgeot.d'Arc) * | Date: 2012-11-20 09:11 | |
2012/11/20 Christian Heimes <report@bugs.python.org>
> IFF we are going to walk the hard and rocky road of exception handling,
> then we are going to need at least four hooks and a register function that
> takes four callables as arguments: register(prepare, error, parent,
> child). Each prepare() call pushes an error handling onto a stack. In case
> of an exception in a prepare handler, the error stack is popped until all
> error handlers are called. This approach allows a prepare handler to
> actually prevent a fork() call from succeeding.
FWIW, PyPy already has a notion of fork hooks:
https://bitbucket.org/pypy/pypy/src/b4e4017909bac6c102fbc883ac8d2e42fa41553b/pypy/module/posix/interp_posix.py?at=default#cl-682
Various subsystems (threads cleanup, import lock, threading.local...) register their hook functions.
You may want to experiment from there :-) A new "atfork" module would be easy to implement. |
|||
| msg175997 - (view) | Author: Richard Oudkerk (sbt) * ![]() |
Date: 2012-11-20 15:44 | |
> IFF we are going to walk the hard and rocky road of exception handling,
> then we are going to need at least four hooks and a register function that
> takes four callables as arguments: register(prepare, error, parent,
> child). Each prepare() call pushes an error handling onto a stack. In case
> of an exception in a prepare handler, the error stack is popped until all
> error handlers are called. This approach allows a prepare handler to
> actually prevent a fork() call from succeeding.
I think there are two main options if a prepare callback fails:
1) The fork should not occur and the exception should be raised
2) The fork should occur and the exception should only be printed
I favour option 1 since, if they want, users can always wrap their prepare callbacks with
try:
    ...
except:
    sys.excepthook(*sys.exc_info())
With option 1 I don't see why error callbacks are necessary. Just unwind the stack of imaginary try...finally... clauses and let any exceptions propagate out using exception chaining if necessary. This is what pure-python-atfork.patch does. Note, however, that if the fork succeeds then any subsequent exception is only printed.
|
|||
| msg176002 - (view) | Author: Christian Heimes (christian.heimes) * ![]() |
Date: 2012-11-20 16:20 | |
Amaury:
PyPy doesn't handle exceptions in hooks. Is there a reason why PyPy goes for the simplistic approach?
Richard:
An error callback has the benefit that the API can notify the hooks that some error has occurred. We may not need it, though.
I can think of six exception scenarios that must be handled:
(1) exception in a prepare hook -> don't call the remaining prepare hooks, run all related parent hooks in FILO order, prevent fork() call
(2) exception in parent hook during the handling of (1) -> print exception, continue with next parent hook
(3) exception in fork() call -> run parent hooks in FILO order
(4) exception in parent hook during the handling of (3) -> print exception, continue with next parent hook
(5) exception in parent hook when fork() has succeeded -> print exception, continue with next parent hook
(6) exception in child hook when fork() has succeeded -> print exception, continue with next child hook
Do you agree? |
|||
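The six scenarios above can be folded into one small driver. The sketch below is an interpretation, not an agreed design: it assumes that "related parent hooks" means the parent hooks whose prepare hook already succeeded, and that the original exception from the prepare hook or from fork() is re-raised after the parent hooks have run. The names `fork_scenarios` and `_run_printing` are hypothetical.

```python
import sys


def _run_printing(hooks):
    """Run every hook; print exceptions instead of propagating them."""
    for hook in hooks:
        try:
            hook()
        except Exception:
            sys.excepthook(*sys.exc_info())


def fork_scenarios(handlers, fork):
    """handlers: (prepare, parent, child) tuples in registration order."""
    prepared = []  # parent hooks whose prepare hook has succeeded so far
    try:
        for prepare, parent, _child in reversed(handlers):
            prepare()
            prepared.append(parent)
        pid = fork()
    except Exception:
        # Scenarios (1) to (4): no child exists. Unwind in FILO order,
        # print any secondary failure, then report the original error.
        _run_printing(reversed(prepared))
        raise
    if pid == 0:
        _run_printing(child for _, _, child in handlers)    # scenario (6)
    else:
        _run_printing(parent for _, parent, _ in handlers)  # scenario (5)
    return pid


log = []
printed = []


def note(name):
    return lambda: log.append(name)


def bad_parent():
    log.append("parent1")
    raise ValueError("parent hook failed")


handlers = [
    (note("prepare1"), bad_parent, note("child1")),
    (note("prepare2"), note("parent2"), note("child2")),
]

# Capture what would be printed, so the demo stays quiet.
sys.excepthook = lambda exc_type, exc, tb: printed.append(exc_type)
try:
    pid = fork_scenarios(handlers, fork=lambda: 99)
finally:
    sys.excepthook = sys.__excepthook__
```

In the demo the first parent hook fails after a successful (fake) fork, which is scenario (5): the error is reported through `sys.excepthook` and the second parent hook still runs.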
| msg176004 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * ![]() |
Date: 2012-11-20 16:29 | |
> PyPy doesn't handle exceptions in hooks.
> Is there a reason why PyPy goes for the simplistic approach?
Probably because nobody thought about it.
At the moment, there is only one 'before', one 'parent' hook (so the FILO order is simple), and three 'child' hooks. And if the _PyImport_ReleaseLock call fails, you'd better not ignore the error... |
|||
| msg176019 - (view) | Author: Gregory P. Smith (gregory.p.smith) * ![]() |
Date: 2012-11-20 19:33 | |
I think you are solving a non-problem if you want to expose exceptions from such hooks. Nobody needs it. |
|||
| msg176020 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2012-11-20 19:49 | |
> I think you are solving a non-problem if you want to expose exceptions from
> such hooks. Nobody needs it.
Agreed. |
|||
| msg176022 - (view) | Author: Christian Heimes (christian.heimes) * ![]() |
Date: 2012-11-20 20:04 | |
Your suggestion is that the hooks are called as:
for hook in hooks:
    try:
        hook()
    except:
        try:
            sys.excepthook(*sys.exc_info())
        except:
            pass
That makes the implementation much easier. :)
|
|||
| msg179838 - (view) | Author: STINNER Victor (haypo) * ![]() |
Date: 2013-01-12 23:37 | |
"The tempfile module has a specialized RNG that re-initializes the RNG after fork() by comparing os.getpid() to an instance variable every time the RNG is accessed. The check can be replaced with an afterfork callback."
By the way, OpenSSL expects its PRNG to be reseeded somehow (by calling RAND_add) after a fork. I wrote a patch for OpenSSL, but I don't remember if I sent it to OpenSSL.
https://bitbucket.org/haypo/hasard/src/4a1be69a47eb1b2ec7ca95a341d4ca953a77f8c6/patches/openssl_rand_fork.patch?at=default
Reseeding the tempfile PRNG is useless (and may cost CPU/memory, or hang until we have enough entropy?) if tempfile is not used after fork. I like the current approach.
--
I'm not saying that a new atfork module would not help, just that the specific case of tempfile should be discussed :-) I like the idea of a generic module to call code after fork. |
|||
| msg179888 - (view) | Author: Georg Brandl (georg.brandl) * ![]() |
Date: 2013-01-13 19:21 | |
Might make sense to put this in atexit.atfork() to avoid small-module inflation? |
|||
| msg179927 - (view) | Author: STINNER Victor (haypo) * ![]() |
Date: 2013-01-14 09:13 | |
> Might make sense to put this in atexit.atfork() to avoid small-module inflation?
It sounds strange to mix "at exit" and "at fork" in the same module. Both are very different. |
|||
| msg179945 - (view) | Author: Marc-Andre Lemburg (lemburg) * ![]() |
Date: 2013-01-14 15:49 | |
On 13.01.2013 00:37, STINNER Victor wrote:
> By the way, OpenSSL expects its PRNG to be reseeded somehow (by calling RAND_add) after a fork. I wrote a patch for OpenSSL, but I don't remember if I sent it to OpenSSL.
> https://bitbucket.org/haypo/hasard/src/4a1be69a47eb1b2ec7ca95a341d4ca953a77f8c6/patches/openssl_rand_fork.patch?at=default
Apparently not, and according to this thread, they don't think this is an OpenSSL problem to solve:
http://openssl.6102.n7.nabble.com/recycled-pids-causes-PRNG-to-repeat-td41669.html
Note that you don't have to reseed the RNG; just make sure that the two forks use different sequences. Simply adding some extra data in each process will suffice, e.g. by adding the PID of the new process to the RNG pool. This is certainly doable without any major CPU overhead :-) |
|||
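The point above, that mixing the PID into the inherited state is enough to make parent and child diverge, can be illustrated in pure Python. This is only a toy model under assumptions: `per_process_rng` is a hypothetical helper, `random.Random` is not a cryptographic generator, and the PIDs are made-up values rather than the result of a real fork.

```python
import hashlib
import random


def per_process_rng(inherited_state: bytes, pid: int) -> random.Random:
    """Derive a generator from state shared across the fork plus the PID."""
    digest = hashlib.sha256(inherited_state + pid.to_bytes(8, "big")).digest()
    return random.Random(int.from_bytes(digest, "big"))


# Both processes start from the same inherited state after a fork.
inherited = b"pool contents inherited from the parent"

parent_rng = per_process_rng(inherited, 1000)
child_rng = per_process_rng(inherited, 1001)

parent_draws = [parent_rng.random() for _ in range(3)]
child_draws = [child_rng.random() for _ in range(3)]
```

No new entropy is gathered here, which is why the approach is cheap; it only ensures that two processes with different PIDs do not replay the same sequence.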
| msg179949 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2013-01-14 16:06 | |
> It sounds strange to mix "at exit" and "at fork" in the same module.
> Both are very different.
That's true. The sys module would probably be the right place for both functionalities. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2013-01-14 16:06:08 | pitrou | set | messages: + msg179949 |
| 2013-01-14 15:49:37 | lemburg | set | nosy: + lemburg; messages: + msg179945 |
| 2013-01-14 09:13:28 | haypo | set | messages: + msg179927 |
| 2013-01-13 21:43:00 | Arfrever | set | nosy: + Arfrever |
| 2013-01-13 19:21:22 | georg.brandl | set | nosy: + georg.brandl; messages: + msg179888 |
| 2013-01-12 23:37:35 | haypo | set | nosy: + haypo; messages: + msg179838; title: Add an 'afterfork' module -> Add an 'atfork' module |
| 2012-11-27 05:56:18 | grahamd | set | nosy: + grahamd |
| 2012-11-24 00:35:44 | jcea | set | nosy: + jcea |
| 2012-11-20 20:04:30 | christian.heimes | set | messages: + msg176022 |
| 2012-11-20 19:49:23 | pitrou | set | nosy: + pitrou; messages: + msg176020 |
| 2012-11-20 19:33:38 | gregory.p.smith | set | messages: + msg176019 |
| 2012-11-20 16:29:10 | amaury.forgeotdarc | set | messages: + msg176004 |
| 2012-11-20 16:20:42 | christian.heimes | set | messages: + msg176002 |
| 2012-11-20 15:44:34 | sbt | set | messages: + msg175997 |
| 2012-11-20 14:59:07 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc, - Amaury.Forgeot.d'Arc |
| 2012-11-20 14:35:50 | asvetlov | set | nosy: + asvetlov |
| 2012-11-20 09:11:52 | Amaury.Forgeot.d'Arc | set | nosy: + Amaury.Forgeot.d'Arc; messages: + msg175980 |
| 2012-11-20 00:52:11 | christian.heimes | set | messages: + msg175975 |
| 2012-11-20 00:13:31 | gregory.p.smith | set | messages: + msg175974 |
| 2012-11-19 23:59:25 | gregory.p.smith | set | messages: + msg175973 |
| 2012-11-19 22:43:50 | sbt | set | files: + pure-python-atfork.patch; keywords: + patch; messages: + msg175972 |
| 2012-11-19 20:55:20 | christian.heimes | set | nosy: + twouters, gregory.p.smith; messages: + msg175967 |
| 2012-11-18 17:34:43 | sbt | set | messages: + msg175892 |
| 2012-11-18 17:08:15 | pitrou | set | nosy: + sbt |
| 2012-11-18 15:20:05 | christian.heimes | create | |
