classification
Title: multiprocessing occasionally spits out exception during shutdown (_handle_workers)
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7, Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: jnoller Nosy List: Albert.Strasheim, asksol, gdb, gkcn, jnoller, mattgwwalker, pnasrat
Priority: normal Keywords:

Created on 2010-07-08 20:05 by gdb, last changed 2012-01-30 06:02 by pnasrat.

Messages (22)
msg109588 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 20:05
On Ubuntu 10.04, using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time:

"""
import multiprocessing, time           
def foo(x):                            
 time.sleep(3)                         
multiprocessing.Pool(1).apply(foo, [1])
"""

My tracebacks are of the form:
"""
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/threading.py", line 530, in __bootstrap_inner
  File "/usr/local/lib/python2.7/threading.py", line 483, in run
  File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
"""

This information was originally posted to http://bugs.python.org/issue4106.
msg109591 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 20:26
Thanks greg; so this affects 2.6 as well (not using the backport at all)
msg109598 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 20:42
That's likely a mistake on my part.  I'm not observing this using the stock version of multiprocessing on my Ubuntu machine(after running O(100) times).  I do, however, observe it when using either python2.7 or python2.6 with multiprocessing-from-trunk, if that's interesting.

I'm not really sure what the convention is here; should this be filed just under Python 2.7?  Thanks.
msg109601 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 20:58
Oh, you mean the backport from google code?

The person who stepped up to maintain that has not refreshed that in some time. I need to decide what to do with it long term. I'm pretty sure it's badly out of date.
msg109615 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 21:33
No, I'm not using the Google code backport.

To be clear, I've tried testing this with two versions of multiprocessing:
- multiprocessing-from-trunk (r82645): I get these exceptions with ~40% frequency
- multiprocessing from Ubuntu 10.04 (version 0.70a1): No such exceptions observed

Out of curiosity, I did just try this with the processing library (version 0.52) on a 64-bit Debian Lenny box, and did not observe these exceptions.

Hope that's useful!
msg109632 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 22:41
Wait - so, you are pulling svn trunk, compiling and running your test with the built python executable? I'm not following the "multiprocessing-from-trunk" distinction unless you're picking the module out of the tree / compiling it and then moving it into some other install. I might be being overly dense.

You're running your test with cd src/tree/ && ./python <your thing> - right?

Also, what, if any, compile flags are you passing to the python build?
msg109643 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 23:12
> Wait - so, you are pulling svn trunk, compiling and running your test 
> with the built python executable?
Yes.  I initially observed this issue while using 10.04's Python (2.6.5), but wanted to make sure it wasn't fixed by using a newer interpreter.

> I'm not following the "multiprocessing-from-trunk" distinction unless > you're picking the module out of the tree / compiling it and then 
> moving it into some other install. I might be being overly dense.
Initially I was doing exactly that.  (Some context: I was working on a patch to fix a different multiprocessing issue, and figured I may as well write my patch against the most recent version of the library.)  Note that I was using Lucid's _multiprocessing, so there was no compilation involved.

> You're running your test with cd src/tree/ && ./python <your thing> - > right?
What... is src/tree?  If it's what you're asking, I am running the freshly-compiled python interpreter, and it does seem to be using the relevant modules out of trunk:
>>> import threading; threading.__file__
'/usr/local/lib/python2.7/threading.pyc'
>>> import multiprocessing; multiprocessing.__file__
'/usr/local/lib/python2.7/multiprocessing/__init__.pyc'
>>> import _multiprocessing; _multiprocessing.__file__
'/usr/local/lib/python2.7/lib-dynload/_multiprocessing.so'

When running with 2.6, all modules are whatever's available for 10.04 except for the multiprocessing that I took from trunk:
>>> import threading; threading.__file__
'/usr/lib/python2.6/threading.pyc'
>>> import multiprocessing; multiprocessing.__file__
'multiprocessing/__init__.pyc'
>>> import _multiprocessing; _multiprocessing.__file__
'/usr/lib/python2.6/lib-dynload/_multiprocessing.so'

> Also, what, if any, compile flags are you passing to the python build?
I just ran ./configure && make && make install

Sorry about the confusion--let me know if you'd like additional information.  I can test on other platforms/with other configurations if it would be useful.
msg109645 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 23:27
Alright, I'm fighting ubuntu 64 bit in my vmware install right now, I'll see if I can get it up and running.
msg109663 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 01:58
I can confirm with a clean ubuntu 64 install, with a clean checkout of release27 that it explodes with that exception, while the stock 2.6.5 does not.
msg109665 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 02:03
It does not seem to appear on OS/X 10.6.4 - so the only question is does this show up on Ubuntu 32bit
msg109666 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 02:12
Correction; it can and does happen on OS/X. So, this is not a platform specific bug.
msg109667 - (view) Author: Greg Brockman (gdb) Date: 2010-07-09 02:26
Yeah, I've just taken a checkout from trunk, ran './configure && make && make install', and reproduced on:

- Ubuntu 10.04 32-bit
- Ubuntu 9.04 32-bit
msg109668 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 02:31
Greg, can you comment out line 272 in Lib/multiprocessing/pool.py and tell me if you can reproduce?
msg109669 - (view) Author: Greg Brockman (gdb) Date: 2010-07-09 02:53
With the line commented out, I no longer see any exceptions.

Although, if I understand what's going on, there still a (much rarer) possibility of an exception, right?  I guess in the common case, the worker_handler is in the sleep when shutdown begins.  But if it happens to be in the in the _maintain_pool step, would you still get these exceptions?
msg109711 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 11:13
I'm not sure if there would still be the possibility; the thing which worries me is the debug() function vanishing on us - something not good is happening on interpreter shutdown.
msg109736 - (view) Author: Greg Brockman (gdb) Date: 2010-07-09 13:41
Think http://www.mail-archive.com/python-list@python.org/msg282114.html is relevant?
msg109737 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-09 13:47
Greg - yeah. it's the same problem.
msg109866 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-10 12:46
Talking with Brett; the fix should be as simple as keeping a reference to the debug function which we have in the imports. During interpreter shutdown, the sys.modules is iterated and each module replaced with None. Since the _handle_workers thread persists slightly past the point of the parent (and can, it's a daemon thread) debug is vanishing on us.

We can go with switching this to a classmethod, and keeping a reference on the class, passing debug directly into the _handle_workers thread (testing this last night fixed it 100% of the time)
msg110115 - (view) Author: Greg Brockman (gdb) Date: 2010-07-12 17:45
With pool.py:272 commented out, running about 50k iterations, I saw 4 tracebacks giving an exception on pool.py:152.  So this seems to imply the race does exist (i.e. that the thread is in _maintain_pool rather than time.sleep when shutdown begins).  It looks like the _maintain_pool run takes O(10^-4)s, so it's not surprising the error is so rare.

That being said, the patch I submitted in issue 9205 should handle this case as well.
msg110116 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-12 18:02
Thank you for doing that footwork Greg, it means a lot to me. I'm leaning towards the patch to swallow the errors - I just wanted to ponder it just a tiny bit longer before I pull the trigger.
msg127929 - (view) Author: Matthew Walker (mattgwwalker) Date: 2011-02-04 19:57
It looks to me as if this issue has already been pretty much sorted out already.  Maybe all it lacks is to be officially closed, but just in case I just wanted to add that I too saw this bug (stock python 2.7, Ubuntu 10.04 64 bit).

My example code was:

#!/usr/bin/env python

import multiprocessing
import os
import time

def f(i):
    print "I am process number",os.getpid(),": i =",i
    time.sleep(10)
    return i*i

pool = multiprocessing.Pool(maxtasksperchild=1)

print pool.map(f, range(10))
msg127930 - (view) Author: Matthew Walker (mattgwwalker) Date: 2011-02-04 20:01
Oh, and the stack trace was identical to Greg's:

$ ./test.py 
I am process number 10378 : i = 0
[...]
I am process number 10390 : i = 9
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/gel/usr/mawal32/system/lib/python2.7/threading.py", line 530, in __bootstrap_inner
  File "/gel/usr/mawal32/system/lib/python2.7/threading.py", line 483, in run
  File "/gel/usr/mawal32/system/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
History
Date User Action Args
2012-01-30 06:02:38pnasratsetnosy: + pnasrat
2011-04-27 14:19:41gkcnsetnosy: + gkcn
2011-02-04 20:01:16mattgwwalkersetnosy: jnoller, asksol, gdb, Albert.Strasheim, mattgwwalker
messages: + msg127930
2011-02-04 19:57:13mattgwwalkersetnosy: + mattgwwalker
messages: + msg127929
2010-08-27 05:13:58Albert.Strasheimsetnosy: + Albert.Strasheim
2010-07-12 18:02:21jnollersetmessages: + msg110116
2010-07-12 17:45:12gdbsetmessages: + msg110115
2010-07-10 12:46:53jnollersetmessages: + msg109866
2010-07-09 14:08:56asksolsetnosy: + asksol
2010-07-09 13:47:20jnollersetmessages: + msg109737
2010-07-09 13:41:47gdbsetmessages: + msg109736
2010-07-09 11:13:45jnollersetmessages: + msg109711
2010-07-09 02:53:00gdbsetmessages: + msg109669
2010-07-09 02:31:46jnollersetmessages: + msg109668
2010-07-09 02:26:13gdbsetmessages: + msg109667
2010-07-09 02:12:05jnollersetmessages: + msg109666
2010-07-09 02:03:54jnollersetmessages: + msg109665
2010-07-09 01:58:36jnollersetmessages: + msg109663
2010-07-08 23:27:50jnollersetmessages: + msg109645
2010-07-08 23:12:22gdbsetmessages: + msg109643
2010-07-08 22:41:48jnollersetmessages: + msg109632
2010-07-08 21:33:16gdbsetmessages: + msg109615
2010-07-08 20:58:15jnollersetmessages: + msg109601
2010-07-08 20:42:51gdbsetmessages: + msg109598
2010-07-08 20:32:18jnollersettitle: multiprocessing occasionally spits out exception during shutdown -> multiprocessing occasionally spits out exception during shutdown (_handle_workers)
2010-07-08 20:26:29jnollersetassignee: jnoller

messages: + msg109591
nosy: + jnoller
2010-07-08 20:05:32gdbcreate