test_multiprocessing hangs intermittently on POSIX platforms #47338
Comments
For me, test_multiprocessing hangs consistently on OS X 10.5.3. It |
I can confirm this on Leopard too. |
It passes for me on Leopard. Can you post the test running in verbose |
On python-3000 trunk, _multiprocessing doesn't even compile: /Users/jesse/open_source/subversion/python- |
On Jun 12, 2008, at 9:02 AM, Benjamin Peterson wrote:
It never hangs when run standalone, though it crashes about half the
The hang occurs during 'make test', and it's always the second run |
I did a make clean && ./configure && make and it started compiling for me |
If it's only failing during the second run of "make test", typically
If it works for some folks and not for others, on the same platform,
Finding this is usually a painful process of bisecting the set of tests
FWIW, when I tried (on Leopard) "make test
======================================================================
Traceback (most recent call last):
  File "/Users/guido/p/Lib/test/test_multiprocessing.py", line 1167, in test_remote
    queue = manager2.get_queue()
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 650, in temp
    authkey=self._authkey, exposed=exp
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 902, in AutoProxy
    incref=incref)
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 711, in __init__
    self._incref()
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 758, in _incref
    dispatch(conn, None, 'incref', (self._id,))
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 94, in dispatch
    raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 196, in handle_request
    result = func(c, *args, **kwds)
  File "/Users/guido/p/Lib/multiprocessing/managers.py", line 412, in incref
    self.id_to_refcount[ident] += 1
KeyError: '5f2828' |
I should add this was in the trunk (2.6). |
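The "bisecting the set of tests" idea mentioned above can be sketched roughly as follows. This is only an illustration: `find_interfering_test` and `fails_after` are hypothetical names, and the sketch assumes exactly one earlier test is responsible and that the failure reproduces deterministically (which is not the case for the intermittent hang in this issue, where each probe would have to be repeated several times).

```python
def find_interfering_test(earlier_tests, fails_after):
    """Return the one earlier test whose prior execution makes the target fail.

    earlier_tests: ordered list of test names that ran before the target.
    fails_after:   hypothetical callable; given a subset of earlier tests,
                   runs them followed by the target and returns True if the
                   target fails.
    Assumes a single culprit and a deterministic failure.
    """
    candidates = list(earlier_tests)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still reproduces the failure.
        candidates = first if fails_after(first) else second
    return candidates[0] if candidates else None
```

Each round halves the candidate set, so a list of a few hundred tests needs fewer than ten probes, provided the assumptions hold.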
I think I'm having a similar lockup on Fedora Core 4 (SMP machine). This |
After a few more runs with -v and redirecting output to a file it seems |
It's taking me longer to get to this than I planned, any help is |
I can get an intermittent (1 every 15 or so runs) lock in:
Executed like this:
When I control-c it the stack looks like this:
I'm not seeing frequent locks/failures when run with regrtest, but I am
I've attached full output. Still trying to figure it out |
I made a copy of test_multiprocessing.py (to test_mp.py) and basically
9:57|paul@tabu:~/c/py3k-svn> make test TESTOPTS="-v test_mp"
Failed to find the necessary bits to build these modules:
find ./Lib -name '*.py[co]' -print | xargs rm -f
----------------------------------------------------------------------
OK
1 test OK.
CAUTION: stdout isn't compared in verbose mode:
a test that passes in verbose mode may fail without it.
./python -E -bb ./Lib/test/regrtest.py -v test_mp
test_mp
test_notify_all (test.test_mp.WithProcessesTestCondition) ... ok
test_notify_all (test.test_mp.WithThreadsTestCondition) ... ok
test_notify_all (test.test_mp.WithManagerTestCondition) ... Exception in thread Thread-28:
Traceback (most recent call last):
  File "/home/paul/c/py3k-svn/Lib/threading.py", line 492, in _bootstrap_inner
    self.run()
  File "/home/paul/c/py3k-svn/Lib/threading.py", line 447, in run
    self._target(*self._args, **self._kwargs)
  File "/home/paul/c/py3k-svn/Lib/test/test_mp.py", line 208, in f
    cond.wait(timeout)
  File "/home/paul/c/py3k-svn/Lib/multiprocessing/managers.py", line 973, in wait
    return self._callmethod('wait', (timeout,))
  File "/home/paul/c/py3k-svn/Lib/multiprocessing/managers.py", line 748, in _callmethod
    raise convert_to_error(kind, result)
RuntimeError: cannot wait on un-aquired lock |
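For readers unfamiliar with the RuntimeError above: a condition variable's wait() requires that the caller already holds the underlying lock. The sketch below shows that rule with a plain `threading.Condition` rather than the manager proxy from the traceback, so it illustrates the error class only and is not a reproduction of the race in the test. The two function names are made up for the example.

```python
import threading

def wait_without_acquire():
    """Call wait() without holding the lock; return the error text."""
    cond = threading.Condition()
    try:
        cond.wait(0.1)
    except RuntimeError as exc:
        return str(exc)
    return None

def wait_with_acquire():
    """Call wait() while holding the lock; nobody notifies, so it times out."""
    cond = threading.Condition()
    with cond:
        return cond.wait(0.1)  # False when the timeout expires
```

In the manager case the wait() is forwarded to a server process, so the same error can surface if the server side believes the lock is not held by the calling thread.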
FWIW: In order to boost the logging level within the test(s) do the
Search for LOG_LEVEL, set it to:
And then in the main() replace: |
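The exact edits to the test file are cut off in the comment above, so the following is not a reconstruction of them. As a rough alternative, multiprocessing's own logger can be turned up from any script using the public `multiprocessing.log_to_stderr()` helper and the `SUBDEBUG` level from `multiprocessing.util`. The wrapper name `enable_mp_debug_logging` is made up for this sketch.

```python
import multiprocessing
from multiprocessing import util

def enable_mp_debug_logging(level=util.SUBDEBUG):
    """Send multiprocessing's internal log messages to stderr.

    SUBDEBUG is more verbose than logging.DEBUG and includes
    messages about finalizers and process shutdown.
    """
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(level)
    return logger
```

Calling `enable_mp_debug_logging()` before creating any processes or managers should make it easier to see which step the test is stuck in.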
I also isolated the test(s) like Paul did, and it looks like a semi-
This is running only the test_event test. The racquire traces back to |
Seems to work fine for me now with latest py3k branch. |
I suspect the problems with WithManagerTestCondition.notify_all() may |
After talking with Richard, I think the best way to attack this issue
I removed the more unreliable test cases while keeping the core ones and
I ran the same thing on trunk and py3k just to make sure I could not get |
Here is the loop I ran the tests with:
#!/bin/sh
for (( i=1;i<=100;i+=1 )); do |
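The body of the shell loop above is truncated. A comparable loop in Python, with a timeout so that a hung run is recorded instead of blocking forever, might look like the sketch below. The function name, the default timeout, and the outcome labels are assumptions made for the example.

```python
import subprocess
import sys

def run_repeatedly(cmd, runs=100, timeout=600):
    """Run cmd up to `runs` times and report the runs that did not pass.

    Returns a list of (run_number, outcome) where outcome is "hang"
    if the timeout expired or "exit N" for a non-zero exit status.
    """
    problems = []
    for i in range(1, runs + 1):
        try:
            proc = subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            problems.append((i, "hang"))
            continue
        if proc.returncode != 0:
            problems.append((i, f"exit {proc.returncode}"))
    return problems
```

For this issue the command would be something like the invocation shown earlier in the thread, for example `run_repeatedly(["./python", "-E", "-bb", "./Lib/test/regrtest.py", "-v", "test_multiprocessing"])`, run from the build directory.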
I don't have commit rights, so I can't apply the test_multiprocessing_reduced.diff myself. Anyone willing? I think this |
I'm going to knock this one down to critical since it's working for me |
Still hangs for me on the 2.6 trunk on Ubuntu 8.04 |
Where exactly does it hang, Miki? |
Jesse, I just ran "make test", it runs until test_multiprocessing and then |
test_multiprocessing is also still hanging for me, perhaps 30% of the
When running the test by itself it seems to pass much more often, but
Macintosh-3:trunk dickinsm$ ./python.exe |
I think I narrowed the problem to a race condition in *subclasses* of |
On Wed, Jul 2, 2008 at 5:08 PM, Mark Dickinson <report@bugs.python.org> wrote:
Are you sure that's right? That traceback has no mention of |
I just ran "make test" and it never moves past test_multiprocessing. Maybe it's my machine, which is dual-CPU quad-core (8 cores total)? |
Doubtful, Miki - I do the work on the module on an 8 Core Gentoo, 8 Core |
Not at all. :-)
The date and time on the core file look right (Jul 2, 23:52 GMT+1), and |
Here's a new traceback (a different error again, this time: a negative |
That looks better. It crashed while deleting an exception, whose args
Things to try: |
Also, make sure you do a "make clean" since you last updated the tree or |
Barring the segfaults Mark is seeing, I went through and removed all of |
The two tracebacks provided by Mark seem to correspond to the following
Lib/test/test_multiprocessing.py, line 1005, in _test_map_unordered |
The test hung for me on the first try but worked fine on the second test, |
On a Linux system (FC4) with r64686 of the Py3k branch I also still get
The following may or may not be related. Some time ago I decided to give
Valgrind reports as its first error:
==9719== Thread #1: Bug in libpthread: sem_wait succeeded on semaphore
I've been hesitant to report this as the claim that libpthread is broken
Could it be that the multiprocessing tests are exposing one or more bugs
[1] http://thread.gmane.org/gmane.comp.debugging.valgrind/8345 |
So now I think that the traceback was right. There was no mention of
I've attached another traceback, showing all the threads, and applying 'tb |
I managed to hang on Ubuntu, here is the backtrace that I got with CTRL-C:
Process PoolWorker-5:1:
Traceback (most recent call last):
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 232, in _bootstrap
test_bsddb test_bsddb3 test_cProfile test_kqueue test_lib2to3
2 skips unexpected on linux2:
test_bsddb3 test_bsddb
Process PoolWorker-5:3:
Traceback (most recent call last):
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/pool.py", line 57, in worker
    self.run()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/queues.py", line 339, in get
    task = get()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/queues.py", line 337, in get
    return recv()
  File "/home/cartman/Sources/py3k/Lib/pickle.py", line 1327, in loads
    racquire()
KeyboardInterrupt
Process PoolWorker-5:2:
Traceback (most recent call last):
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/pool.py", line 57, in worker
    def loads(s, *, encoding="ASCII", errors="strict"):
KeyboardInterrupt
Process PoolWorker-5:4:
Traceback (most recent call last):
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/queues.py", line 337, in get
    racquire()
KeyboardInterrupt
    task = get()
  File "/home/cartman/Sources/py3k/Lib/multiprocessing/queues.py", line 337, in get
    racquire()
KeyboardInterrupt
^CError in atexit._run_exitfuncs:
make: *** [testall] Segmentation fault |
I found that on my Debian64, running test_multiprocessing under gdb
And it appears that the problem is described in bpo-874900: "threading |
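bpo-874900 concerns the threading module deadlocking after fork(). The sketch below does not reproduce that bug. It uses an ordinary user-level lock to show the general mechanism, under the assumption of a POSIX platform with `os.fork`: a lock held by another thread at fork time is copied into the child in its locked state, while the thread that would release it does not exist there. The function name is made up for the example, and the child uses a non-blocking acquire so the demonstration itself cannot hang.

```python
import os
import threading
import warnings

def child_sees_lock_held():
    """Fork while another thread holds a lock; report what the child sees.

    Returns True if the child found the lock already held.
    """
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            holding.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    holding.wait()  # the holder thread now owns the lock

    with warnings.catch_warnings():
        # Newer Pythons warn about fork() in a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: only the forking thread survives, so nobody can
        # ever release the copied lock. A blocking acquire would hang.
        got_it = lock.acquire(False)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.WEXITSTATUS(status) == 1
```

If the child had used a plain blocking `lock.acquire()`, it would wait forever, which is the same shape of hang reported throughout this thread.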
Thanks Amaury - I've been working through the tests and identifying |
I'm still seeing intermittent lockups on Ubuntu 7.10 - traceback on
Since Jesse seems to be on top of this, I'll stick to using -x |
Bumping back to release blocker for beta 2 (Barry may choose to defer it |
Updated issue title to more accurately reflect scope of the problem. |
I forgot to mention that I am seeing the intermittent hangs on the trunk |
Sadly _multiprocessing apparently doesn't even build on my Ubuntu 8.04 |
On Jul 15, 2008, at 8:38 PM, "Barry A. Warsaw"
There is no reason it shouldn't compile on ubuntu - without the patch |
Barry - can you email the compile errors? |
Something's very strange. The first make after configure fails to build |
Here's the 'make' output. What's strange is that moving building '_multiprocessing' extension |
bpo-874900's patch seems to have resolved the hangs. I am closing this |
I confirm this is solved for me in beta 2 |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.