multiprocessing occasionally spits out exception during shutdown (_handle_workers) #53453
Comments
On Ubuntu 10.04, using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time: """ My tracebacks are of the form:
"""
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/local/lib/python2.7/threading.py", line 530, in __bootstrap_inner
File "/usr/local/lib/python2.7/threading.py", line 483, in run
File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
"""
This information was originally posted to http://bugs.python.org/issue4106. |
Thanks greg; so this affects 2.6 as well (not using the backport at all) |
That's likely a mistake on my part. I'm not observing this using the stock version of multiprocessing on my Ubuntu machine (after running O(100) times). I do, however, observe it when using either python2.7 or python2.6 with multiprocessing-from-trunk, if that's interesting. I'm not really sure what the convention is here; should this be filed just under Python 2.7? Thanks. |
Oh, you mean the backport from google code? The person who stepped up to maintain that has not refreshed that in some time. I need to decide what to do with it long term. I'm pretty sure it's badly out of date. |
No, I'm not using the Google code backport. To be clear, I've tried testing this with two versions of multiprocessing:
Out of curiosity, I did just try this with the processing library (version 0.52) on a 64-bit Debian Lenny box, and did not observe these exceptions. Hope that's useful! |
Wait - so, you are pulling svn trunk, compiling and running your test with the built python executable? I'm not following the "multiprocessing-from-trunk" distinction unless you're picking the module out of the tree / compiling it and then moving it into some other install. I might be being overly dense. You're running your test with cd src/tree/ && ./python <your thing> - right? Also, what, if any, compile flags are you passing to the python build? |
> You're running your test with cd src/tree/ && ./python <your thing> - right?
What... is src/tree? If it's what you're asking, I am running the freshly-compiled python interpreter, and it does seem to be using the relevant modules out of trunk:
>>> import threading; threading.__file__
'/usr/local/lib/python2.7/threading.pyc'
>>> import multiprocessing; multiprocessing.__file__
'/usr/local/lib/python2.7/multiprocessing/__init__.pyc'
>>> import _multiprocessing; _multiprocessing.__file__
'/usr/local/lib/python2.7/lib-dynload/_multiprocessing.so'
When running with 2.6, all modules are whatever's available for 10.04 except for the multiprocessing that I took from trunk:
>>> import threading; threading.__file__
'/usr/lib/python2.6/threading.pyc'
>>> import multiprocessing; multiprocessing.__file__
'multiprocessing/__init__.pyc'
>>> import _multiprocessing; _multiprocessing.__file__
'/usr/lib/python2.6/lib-dynload/_multiprocessing.so'
Sorry about the confusion--let me know if you'd like additional information. I can test on other platforms/with other configurations if it would be useful. |
Alright, I'm fighting ubuntu 64 bit in my vmware install right now, I'll see if I can get it up and running. |
I can confirm with a clean ubuntu 64 install, with a clean checkout of release27 that it explodes with that exception, while the stock 2.6.5 does not. |
It does not seem to appear on OS/X 10.6.4 - so the only question is does this show up on Ubuntu 32bit |
Correction; it can and does happen on OS/X. So, this is not a platform specific bug. |
Yeah, I've just taken a checkout from trunk, ran './configure && make && make install', and reproduced on:
|
Greg, can you comment out line 272 in Lib/multiprocessing/pool.py and tell me if you can reproduce? |
With the line commented out, I no longer see any exceptions. Although, if I understand what's going on, there's still a (much rarer) possibility of an exception, right? I guess in the common case, the worker_handler is in the sleep when shutdown begins. But if it happens to be in the _maintain_pool step, would you still get these exceptions? |
I'm not sure if there would still be the possibility; the thing which worries me is the debug() function vanishing on us - something not good is happening on interpreter shutdown. |
Think http://www.mail-archive.com/python-list@python.org/msg282114.html is relevant? |
Greg - yeah. it's the same problem. |
Talking with Brett; the fix should be as simple as keeping a reference to the debug function which we have in the imports. During interpreter shutdown, sys.modules is iterated and each module's globals are replaced with None. Since the _handle_workers thread persists slightly past the point of the parent (and it can, since it's a daemon thread), debug is vanishing on us. We can go with switching this to a classmethod, and keeping a reference on the class, passing debug directly into the _handle_workers thread (testing this last night fixed it 100% of the time). |
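To illustrate the idea of keeping a reference to debug, here is a minimal sketch. It is not the pool.py code: fake_util, fragile_handler, robust_handler and simulate_teardown are hypothetical names made up for this example, and the teardown is only imitated by setting a module's names to None by hand.

```python
import types

# Hypothetical stand-in for a helper module such as multiprocessing.util.
fake_util = types.ModuleType("fake_util")
fake_util.debug = lambda msg: "debug: " + msg


def fragile_handler():
    # Looks up fake_util.debug at call time, so it breaks once the
    # module's names have been replaced with None.
    return fake_util.debug("worker handler exiting")


def robust_handler(debug=fake_util.debug):
    # The default argument holds its own reference to the function,
    # captured when robust_handler was defined.
    return debug("worker handler exiting")


def simulate_teardown(module):
    # Assumption: imitate shutdown by setting every public name to None.
    for name in list(vars(module)):
        if not name.startswith("__"):
            setattr(module, name, None)


simulate_teardown(fake_util)

try:
    fragile_handler()
    fragile_result = "no error"
except TypeError as exc:
    fragile_result = str(exc)

robust_result = robust_handler()
```

After the simulated teardown, fragile_handler fails with the same TypeError message seen in the tracebacks, while robust_handler still works because it never goes back to the module to find debug.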
With pool.py:272 commented out, running about 50k iterations, I saw 4 tracebacks giving an exception on pool.py:152. So this seems to imply the race does exist (i.e. that the thread is in _maintain_pool rather than time.sleep when shutdown begins). It looks like the _maintain_pool run takes O(10^-4)s, so it's not surprising the error is so rare. That being said, the patch I submitted in bpo-9205 should handle this case as well. |
Thank you for doing that footwork Greg, it means a lot to me. I'm leaning towards the patch to swallow the errors - I just wanted to ponder it just a tiny bit longer before I pull the trigger. |
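For readers wondering what "swallowing the errors" could look like, here is a rough sketch under stated assumptions. It is not the patch from bpo-9205; handle_workers, steps and debug are hypothetical names, and treating every TypeError as a shutdown artifact is a simplification made for the example.

```python
def handle_workers(steps, debug):
    """Hypothetical handler loop: run each step, exit quietly if a helper vanished."""
    completed = 0
    try:
        for step in steps:
            step()
            completed += 1
        debug("worker handler exiting")
    except TypeError:
        # Assumption: a TypeError here means a helper became None during
        # interpreter shutdown, so there is nothing useful left to do.
        pass
    return completed


# A step that has "vanished" (None) stops the loop without a traceback.
partial = handle_workers([lambda: None, None, lambda: None], debug=lambda msg: None)

# A vanished debug function is tolerated as well.
no_debug = handle_workers([lambda: None], debug=None)
```

The trade-off is that a broad except like this can also hide a genuine TypeError raised during normal operation, which is presumably part of why the choice deserved some thought.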
It looks to me as if this issue has been pretty much sorted out already. Maybe all it lacks is to be officially closed, but just in case, I wanted to add that I too saw this bug (stock python 2.7, Ubuntu 10.04 64 bit). My example code was:
#!/usr/bin/env python
import multiprocessing
import os
import time
def f(i):
    print "I am process number",os.getpid(),": i =",i
    time.sleep(10)
    return i*i
pool = multiprocessing.Pool(maxtasksperchild=1)
print pool.map(f, range(10)) |
Oh, and the stack trace was identical to Greg's:
$ ./test.py
I am process number 10378 : i = 0
[...]
I am process number 10390 : i = 9
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/gel/usr/mawal32/system/lib/python2.7/threading.py", line 530, in __bootstrap_inner
File "/gel/usr/mawal32/system/lib/python2.7/threading.py", line 483, in run
File "/gel/usr/mawal32/system/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
<type 'exceptions.TypeError'>: 'NoneType' object is not callable |
Closing this issue after having verified that the issue can no longer be reproduced on the systems mentioned (Ubuntu 10.04 or OSX). Related issues such as bpo-9205 have been addressed elsewhere and other possibly related issues such as bpo-22393 are being tracked separately. It appears Jesse's patches in 2010 (though not explicitly added to this issue here) have indeed addressed this issue.
Using OS X 10.10: Using Matt Walker's supplied test script (from a much later post here), I ran his script 100 times in tight succession but was unable to provoke the issue on OS X 10.10 using either the Anaconda build of Python 2.7.9 or a local build from the default (3.5.0a1+) branch. Side note: running Matt's test script kinda takes a while ("time.sleep(10)" seriously?), especially at 100 iterations.
Using Ubuntu 10.04 (64-bit):
Even if it's a little late in saying: thanks goes to the reporters, Greg and Matt, and to Jesse for implementing the fix. |
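Since the bug only shows up on some runs, several comments above rely on running a script many times and counting failures. A small helper along these lines can automate that. It is only a sketch: count_shutdown_errors is a hypothetical name, and matching on the "Exception in thread" text from the tracebacks above is an assumption about what a failing run prints to stderr.

```python
import subprocess
import sys


def count_shutdown_errors(argv, runs):
    """Run argv `runs` times; count runs whose stderr reports a thread exception."""
    marker = b"Exception in thread"
    failures = 0
    for _ in range(runs):
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        _, err = proc.communicate()
        if marker in err:
            failures += 1
    return failures


# Example (not executed here): count failures over 100 runs of a test script.
# failures = count_shutdown_errors([sys.executable, "test.py"], 100)
```

Dividing the returned count by the number of runs gives a failure rate comparable to the "about 30% of the time" and "4 tracebacks in about 50k iterations" figures quoted earlier.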