
Author paulmelis
Recipients Rhamphoryncus, amaury.forgeotdarc, barry, benjamin.peterson, donmez, gvanrossum, jnoller, mark.dickinson, paulmelis, roudkerk, tebeka
Date 2008-07-03.08:14:36
Message-id <1215072879.46.0.296280487748.issue3088@psf.upfronthosting.co.za>
Content
On a Linux system (FC4) with r64686 of the Py3k branch I also still get
occasional hangs (with ./python -E -bb ./Lib/test/regrtest.py -v
test_multiprocessing). Mostly this seems to occur with the very first
test executed, i.e. before any of the "test_... " lines have been printed.
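
Since the hang is only occasional, one way to get a rough count is to run the
test command repeatedly under a timeout. The following is a minimal sketch,
not something that was used for the report above: the helper name count_hangs,
the number of runs and the timeout are all assumptions, and a run is simply
treated as a hang if it does not finish in time.

```python
import subprocess


def count_hangs(cmd, runs=5, timeout=60.0):
    """Run cmd `runs` times and count the runs that exceed `timeout` seconds.

    Hypothetical helper: a run that times out is assumed to be a hang.
    subprocess.run() kills the child process when the timeout expires.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


# Example call with the command from this report (paths are relative to a
# Py3k checkout, so this only works from the top of the source tree):
#
# count_hangs(["./python", "-E", "-bb", "./Lib/test/regrtest.py",
#              "-v", "test_multiprocessing"], runs=20, timeout=600.0)
```

The timeout has to be comfortably longer than a normal test run, otherwise
slow runs are miscounted as hangs.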

The following may or may not be related. Some time ago I decided to give
valgrind a try to see if it could detect anything strange going on with
the multiprocessing tests, specifically using the 'helgrind'
thread-debugging tool that comes with it. 

Valgrind reports as its first error:

==9719== Thread #1: Bug in libpthread: sem_wait succeeded on semaphore without prior sem_post
==9719==    at 0x4007FFF: sem_wait_WRK (hg_intercepts.c:1057)
==9719==    by 0x4008094: sem_wait@* (hg_intercepts.c:1073)
==9719==    by 0x46A0087: semlock_acquire (semaphore.c:310)
==9719==    by 0x808C121: PyEval_EvalFrameEx (ceval.c:3371)
==9719==    by 0x808D0FE: PyEval_EvalCodeEx (ceval.c:2808)
==9719==    by 0x808B9D0: PyEval_EvalFrameEx (ceval.c:3469)
==9719==    by 0x808D0FE: PyEval_EvalCodeEx (ceval.c:2808)
==9719==    by 0x80F4B65: function_call (funcobject.c:628)
==9719==    by 0x80D1207: PyObject_Call (abstract.c:2178)
==9719==    by 0x80890EC: PyEval_EvalFrameEx (ceval.c:3672)
==9719==    by 0x808C1A9: PyEval_EvalFrameEx (ceval.c:3459)
==9719==    by 0x808C1A9: PyEval_EvalFrameEx (ceval.c:3459)
==9716== Thread #1 is the program's root thread

I've been hesitant to report this, as the claim that libpthread is broken
is a pretty bold one. I contacted the valgrind devs about this, see [1].
More recently, someone on the valgrind list reported problems that do
seem to indicate there are broken libpthreads out there (see [2]), as
this individual reports a semaphore wait not blocking where it should.

Could it be that the multiprocessing tests are exposing one or more bugs
in libpthread?
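
A quick sanity check for the "wait succeeds without a prior post" symptom can
be done from Python itself. This is only a sketch under assumptions: the
function names are made up for illustration, and because a timeout is passed,
the acquire most likely goes through a timed wait rather than the plain
sem_wait seen in the trace above, so it does not exercise exactly the same
code path.

```python
import multiprocessing


def acquire_blocks_when_empty(timeout=0.5):
    """Return True if acquiring a semaphore with value 0 times out.

    On a correctly working system nobody has released the semaphore,
    so the acquire must fail after `timeout` seconds.
    """
    sem = multiprocessing.Semaphore(0)
    acquired = sem.acquire(True, timeout)
    return not acquired


def acquire_succeeds_after_release(timeout=0.5):
    """Return True if an acquire succeeds once the semaphore was released."""
    sem = multiprocessing.Semaphore(0)
    sem.release()
    return sem.acquire(True, timeout)
```

If acquire_blocks_when_empty() ever returned False, that would match the
behaviour helgrind complains about, though a single passing run says little
about an intermittent problem.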

[1] http://thread.gmane.org/gmane.comp.debugging.valgrind/8345
[2] http://thread.gmane.org/gmane.comp.debugging.valgrind/8384