classification
Title: atexit handlers are not executed when using multiprocessing.Pool.map.
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: davin, juj, sbt
Priority: normal
Keywords:

Created on 2015-02-20 11:48 by juj, last changed 2015-02-20 23:59 by davin. This issue is now closed.

Files
File name Uploaded Description
task_spawn.py juj, 2015-02-20 11:48
Messages (6)
msg236273 - Author: juj (juj) Date: 2015-02-20 11:48
When multiprocessing.Pool.map is used in a script that registers atexit handlers, the atexit handlers are not executed when the pool's worker processes quit.

Steps to reproduce:

1. Run attached file in Python 2.7 with 'python task_spawn.py'
2. Observe the printed output.

Observed:

Console prints:

CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_qef8r_
CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_axi9tt
CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_vx6fmu
task1
task2
ATEXIT: REMOVING TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_qef8r_

Expected:

Console should print:

CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_qef8r_
CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_axi9tt
CREATED TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_vx6fmu
task1
task2
ATEXIT: REMOVING TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_vx6fmu
ATEXIT: REMOVING TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_axi9tt
ATEXIT: REMOVING TEMP DIRECTORY c:\users\clb\appdata\local\temp\temp_qef8r_
msg236274 - Author: juj (juj) Date: 2015-02-20 11:50
This was tested on Python 2.7.9 64-bit on Windows 8.1, however I believe that it occurs equally on OSX and Linux, since I am running servers with those OSes that also exhibit temp file leaking issues (although I did not specifically confirm if the root cause is the same as this).
msg236329 - Author: Davin Potts (davin) (Python committer) Date: 2015-02-20 21:05
There are at least two issues at play here.


Running the attached file on OS X produces starkly different results -- console prints:
CREATED TEMP DIRECTORY /var/folders/s4/tc1y5rjx25vfknpzvnfh1b140000gn/T/temp_z6I0BA
task1
task2
ATEXIT: REMOVING TEMP DIRECTORY /var/folders/s4/tc1y5rjx25vfknpzvnfh1b140000gn/T/temp_z6I0BA


The reason only one temp directory is created on OS X (or on other unix-y platforms) and more than one is created on Windows is described in more detail here:
https://docs.python.org/2/library/multiprocessing.html#windows

In short, on Windows 8.1, the processes you spawn via multiprocessing must import your main module ("task_spawn" in this case), and in so doing each one executes both the line creating a temp directory and the lines following it (this is part of how import works). I suspect you want instead to put these lines inside an "if __name__ == '__main__'" clause. Doing so will ensure only one temp dir is created, and it will be properly cleaned up when the interpreter exits cleanly. You will have consistent behavior across Windows and unix-y platforms this way too, not to mention your code will more clearly convey that you only want the main process to create a temp dir. (Specifically, see the section "Safe importing of main module" in the docs at the above link.)
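
A minimal sketch of what that guard could look like. The attached task_spawn.py is not reproduced in this thread, so the names run_task and main, and the overall shape of the script, are assumptions made for illustration only:

```python
import atexit
import multiprocessing
import shutil
import tempfile


def run_task(number):
    # Hypothetical task body; it runs in a worker process and
    # only returns a label for the parent to print.
    return "task%d" % number


def main():
    # Only the parent process reaches this point, so exactly one
    # temp directory is created, whether workers are forked or spawned.
    temp_dir = tempfile.mkdtemp(prefix="temp_")
    print("CREATED TEMP DIRECTORY " + temp_dir)
    # Registered in the parent, so it runs at normal interpreter exit.
    atexit.register(shutil.rmtree, temp_dir, ignore_errors=True)

    pool = multiprocessing.Pool(processes=2)
    try:
        for label in pool.map(run_task, [1, 2]):
            print(label)
    finally:
        # Let the workers finish and exit before the parent continues.
        pool.close()
        pool.join()


if __name__ == "__main__":
    main()
```

Because the module-level code now only defines functions, re-importing the module in a worker process has no side effects.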


That was the first issue -- on to the second.


The registering of functions with atexit means they'll be executed upon "normal interpreter termination".  Lifting a snippet from the atexit docs' introduction section (https://docs.python.org/2/library/atexit.html):

  Note: The functions registered via this module are not called when the program is killed by a signal not handled by Python, when a Python fatal internal error is detected, or when os._exit() is called.

When the processes created and managed via multiprocessing reach termination, that is quite different from "normal interpreter termination".  You are observing that when the interpreter's (main) process is done, it executes the function you registered with atexit -- that is how it should be.  Registering functions with atexit inside distinct processes will not cause them to be automagically registered with atexit in the parent interpreter process.
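
To make the os._exit() case from the quoted note concrete, here is a small sketch that runs a child interpreter twice, once exiting normally and once through os._exit(). It does not use multiprocessing at all and makes no claim about how multiprocessing shuts down its workers; the CHILD program and the run_child helper are hypothetical names used only for this illustration:

```python
import subprocess
import sys

# Child program: registers an exit handler, then leaves either normally
# or through os._exit(), depending on its first argument.
CHILD = """
import atexit, os, sys

def handler():
    print("ATEXIT: handler ran")

atexit.register(handler)
print("child running")
sys.stdout.flush()
if sys.argv[1] == "hard":
    os._exit(0)
"""


def run_child(mode):
    # Run the child in a fresh interpreter and return what it printed.
    out = subprocess.check_output([sys.executable, "-c", CHILD, mode])
    return out.decode()


if __name__ == "__main__":
    print(run_child("normal"))
    print(run_child("hard"))
```

Only the "normal" run should include the handler's line, since os._exit() ends the process without running registered exit functions.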



Hopefully with the above explanation in hand it will be possible to make the necessary changes to correct your code without breaking a sweat.
msg236330 - Author: Davin Potts (davin) (Python committer) Date: 2015-02-20 21:12
I should have added in my prior comments:

juj:  thank you very much for providing the info about the platform you tested on and even an example piece of code that triggered the problem.  I wish all issues came with the level of info you provided.
msg236335 - Author: juj (juj) Date: 2015-02-20 21:52
While the test case can be 'fixed' by changing the code to use "if __name__ == '__main__'", and I'm ok to do it in my code to work around the problem, I would argue the following:

1) Calling this not a bug (or solving it only at the documentation level) does not feel like it correctly reflects the situation, since the provided test case silently fails and does the unexpected. If atexit() does not work at all when invoked as a result of importing from multiprocessing.Pool.map(), then at minimum calling atexit() in such a scenario should throw a "not available" exception rather than silently discarding the operation.

2) Why couldn't the atexit handlers be executed even on Windows when the multiprocessing processes quit, even if special code is required in python multiprocessing libraries to handle it? The explanation you are giving sounds like a lazy excuse. There should not be any technical obstacle why the cleanup handlers could not be tracked and honored here?

3) That this should not work the way the (existing) documentation implies is not at all obvious to the reader. I could not find it documented that processes that exit from multiprocessing would be somehow special, and the note that you pasted does not obviously connect to this case, since a) I was not using signals, b) there was no internal error occurring, and c) I was not calling os._exit(). The documentation does not reflect that it is undefined whether atexit() handlers are executed or not when multiprocessing is used.

4) I would even argue that it is a bug that there is different cross platform observable behavior in terms of multiprocessing and script importing, but that is probably a different topic.

Overall, leaving this as a silent failure, neither raising an exception nor implementing the support on Windows, does not feel mature, since it leaves a hole of C/C++-style undefined behavior in the libraries. For maturity, I would recommend that something be done, in descending order of preference:

I) Fix multiprocessing importing on windows so that it is not a special case compared to other OSes.

II) If the above is not possible, fix the atexit() handlers so that they are executed when the processes quit on Windows.

III) If the above is not possible, make the atexit() function raise an exception if invoked from a script that has been spawned from multiprocessing, when it is known at atexit() call time that the script was spawned as a result of multiprocessing and the atexit() handlers will never be run.

If none of those are possible due to real technical reasons, then as a last resort, explicitly document in both the docs for atexit() and the docs for multiprocessing that the atexit() handlers are not executed on Windows when the two are used in conjunction.

Disregarding this kind of silent failure behavior, especially when cross-platform behavior is involved, with a shrug and a NotABug label is not good practice!
msg236344 - Author: Davin Potts (davin) (Python committer) Date: 2015-02-20 23:59
You make an overall valid point that despite reading the documentation, the resulting behavior of your code was not what you expected -- I take that specific complaint very seriously anytime anyone makes it.


Regarding your recommendations:

I) Unfortunately this is not a trivial topic; it has been discussed extensively elsewhere.  As you point out, this is a different topic.

II) Please do note the atexit documentation does not suggest it is a tool for triggering actions when a _process_ exits.  I don't think I'm a pedant but that kinda makes me sound like one.



Here are some suggestions on potential next steps:

1. If the documentation for atexit is inadequate in getting across its true nature and limitations, would you please open a new issue against atexit's documentation?  In it, please recommend what would have made it much clearer.

2. Detecting that atexit functionality has been invoked inside a process created using multiprocessing does not cover the full range of possibilities where atexit functionality is impacted and thus the originally-intended/desired behavior will not occur.  Ignoring the larger set of possible scenarios for the moment, I think a case could be made to add an atexit-like feature to multiprocessing that would give you specific control over what happens when a process created by multiprocessing is done and terminates.  If that appeals to you too, would you consider opening a new issue proposing this feature request and, given your use cases to date, please suggest things that would make it especially valuable?
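
Absent such a feature, one possible workaround is to scope cleanup to each task rather than to process exit. The sketch below is not the feature proposed above and is not taken from task_spawn.py; run_task, the "out.txt" file name, and the scratch-directory layout are all assumptions for illustration:

```python
import multiprocessing
import os
import shutil
import tempfile


def run_task(name):
    # Each task owns its scratch directory and removes it before returning,
    # so nothing depends on exit handlers firing in the worker process.
    scratch = tempfile.mkdtemp(prefix="temp_")
    try:
        with open(os.path.join(scratch, "out.txt"), "w") as f:
            f.write(name)
        return name, scratch
    finally:
        shutil.rmtree(scratch, ignore_errors=True)


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=2)
    try:
        for name, scratch in pool.map(run_task, ["task1", "task2"]):
            # The path is returned only so the parent can confirm removal.
            print("%s cleaned up: %s" % (name, not os.path.exists(scratch)))
    finally:
        pool.close()
        pool.join()
```

The try/finally block covers normal returns and exceptions raised inside the task; it cannot help if the worker process is killed outright.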



Apologies if any choice of phrasing on my part added in any way to your frustration -- it was not my intention.  I do hope you'll be able to contribute something more, possibly along the lines I suggest.
History
Date User Action Args
2015-02-20 23:59:21 davin set messages: + msg236344
2015-02-20 21:52:56 juj set messages: + msg236335
2015-02-20 21:12:40 davin set messages: + msg236330
2015-02-20 21:05:32 davin set status: open -> closed; resolution: not a bug; messages: + msg236329; stage: resolved
2015-02-20 14:15:34 berker.peksag set nosy: + sbt, davin
2015-02-20 11:50:28 juj set messages: + msg236274
2015-02-20 11:48:28 juj create