classification
Title: test_multiprocessing fails consistently with 'signal 12' on FreeBSD 6.2 buildbot.
Type: behavior
Stage: patch review
Components: Tests
Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: db3l, jnoller, mark.dickinson
Priority: normal
Keywords: buildbot, patch

Created on 2009-11-06 15:57 by mark.dickinson, last changed 2009-11-28 21:07 by db3l. This issue is now closed.

Files
File name Uploaded Description
freebsd_multiprocessing.patch mark.dickinson, 2009-11-20 16:18
Messages (19)
msg94977 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-06 15:57
The x86 FreeBSD buildslave is consistently aborting the test run with a 
'Signal 12' failure in test_multiprocessing.  See e.g.,

http://www.python.org/dev/buildbot/builders/x86%20FreeBSD%20trunk/builds/2
756/steps/test/logs/stdio

(scroll all the way to the bottom to see the failure).

The failure occurs on 2.7, 3.1 and 3.2.  On 2.6, it looks as though 
test_multiprocessing is disabled:  the message gives a reference to issue 
3770.

Note that the maintainer of the buildbot (David Bolen) has offered to 
arrange ssh access for anyone wanting to look into this.  (See

http://mail.python.org/pipermail/python-dev/2009-November/093857.html

.)
msg94978 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-06 15:58
I mangled the buildbot results URL.  Here it is again.

http://www.python.org/dev/buildbot/builders/x86%20FreeBSD%20trunk/builds/2756/steps/test/logs/stdio
msg94979 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-06 16:05
I'm not sure whether this is relevant, but the configure output for the 
FreeBSD trunk build includes the line:

checking for broken sem_getvalue... yes
msg95545 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-20 14:01
I had an opportunity to play with a FreeBSD 7.2 box recently.

The diagnosis is simple.  The solution may be less so...

Diagnosis:  FreeBSD still considers POSIX semaphores (sem_open,
sem_close, etc.) to be experimental, so they're not enabled by default
on a standard install.  So the very first call to sem_open, from
SEM_CREATE in semaphore.c (around line 439), produces the Signal 12.

Enabling POSIX semaphores (assuming that they've been built into the
kernel, which they seem to have been by default) is as simple as
executing 'kldload sem' (as root) at a shell prompt.  After I did this,
test_multiprocessing ran and all tests passed.

So the question is what multiprocessing (and test_multiprocessing)
should do when POSIX semaphores aren't available.  I guess the options
are: (1) fail loudly, with an error message telling the user to enable
the POSIX semaphores, or (2) try to use SysV semaphores (which are
supported out of the box) instead.

In the immediate future we should probably at least detect when POSIX
semaphores don't exist, and skip test_multiprocessing in that case.
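
A minimal sketch of such a detection: because the failure described above is a fatal signal rather than a Python exception (signal 12 is SIGSYS on FreeBSD), one option is to create a semaphore in a throwaway child interpreter and inspect its exit status. This assumes a modern Python 3 (`subprocess.run`); the names `PROBE`, `classify` and `posix_semaphores_status` are hypothetical and are not part of the patch discussed in this issue.

```python
import signal
import subprocess
import sys

# Hypothetical probe: create one semaphore in a throwaway interpreter, so a
# fatal signal kills the child instead of the calling process.
PROBE = "import multiprocessing; multiprocessing.Semaphore(1)"


def classify(returncode):
    """Describe a child's exit status in words."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return "killed by signal %d (%s)" % (signum, name)
    # A positive status usually means an uncaught exception such as ImportError.
    return "failed with exit status %d" % returncode


def posix_semaphores_status():
    """Run the probe in a child interpreter and classify the outcome."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return classify(result.returncode)
```

On a system with working semaphores, `posix_semaphores_status()` should return "ok". On the FreeBSD setup described above one would expect a "killed by signal 12" result instead, though that has not been verified here.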
msg95548 - Author: Jesse Noller (jnoller) (Python committer) Date: 2009-11-20 14:23
Thanks, Mark. If POSIX semaphores aren't available, we're largely
dead-on-arrival. I thought FreeBSD was at the point of enabling them by
default; boo on me. See also: http://bugs.python.org/issue3770#msg73958 and
http://bugs.python.org/issue3770#msg83495 - with Martin's help at the last
PyCon I moved a lot of the "HAVE_BROKEN..." values to autoconf checks in
configure.in.

I think the right way (however unpleasant) is to check to see if we have 
POSIX semaphores, period - if we don't, _multiprocessing should just not 
compile.
msg95553 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-20 16:18
Here's a patch that makes FreeBSD behave exactly as though HAVE_SEM_OPEN
is not defined, when semaphores aren't available.  On my FreeBSD 7.2
test system, it results in the multiprocessing module being built
(without the contents of semaphore.c), but the test is skipped, along
with a message indicating that semaphores don't work on that system.

I added an autoconf test for the sem_open failure, and reused the
already existing pyconfig.h variable HAVE_BROKEN_POSIX_SEMAPHORES.
msg95555 - Author: Jesse Noller (jnoller) (Python committer) Date: 2009-11-20 16:58
Looks good so far to me - I'll apply and run the tests locally (but I
don't have a FreeBSD box, so I'm just checking for regressions). One
question: what's with all the

-rm -f -r conftest*
+rm -f conftest*

lines in there?
msg95556 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-20 17:05
About the rm -f -r conftest stuff:

My guess is that the last person to update configure used Apple's 
version of autoconf:  Apple seems to have silently 'fixed' autoconf 
version 2.61 to remove some (fairly benign) warnings that appear when 
running the configure script.  Which is fine, but it would be nice if 
the version string for their autoconf gave some indication that it had 
been fixed.

In any case, rm -f is what comes from standard autoconf 2.61, so I think 
that's what should be there.
msg95557 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-20 17:08
One other thought:  if this is applied, would it make sense to ask the 
FreeBSD buildbot maintainer to then enable the POSIX semaphores for 
FreeBSD 7.2 (but probably not for 6.4)?  It looks like FreeBSD 8.0 is just 
around the corner, and the rumours are that it'll have these semaphores 
enabled by default;  it would be nice to know about any multiprocessing 
failures on FreeBSD, so that we can be reasonably sure it'll work on 8.0.
msg95558 - Author: Jesse Noller (jnoller) (Python committer) Date: 2009-11-20 17:11
On Fri, Nov 20, 2009 at 12:08 PM, Mark Dickinson <report@bugs.python.org> wrote:
> One other thought:  if this is applied, would it make sense to ask the
> FreeBSD buildbot maintainer to then enable the POSIX semaphores for
> FreeBSD 7.2 (but probably not for 6.4)?  It looks like FreeBSD 8.0 is just
> around the corner, and the rumours are that it'll have these semaphores
> enabled by default;  it would be nice to know about any multiprocessing
> failures on FreeBSD, so that we can be reasonably sure it'll work on 8.0.

Agreed
msg95559 - Author: Jesse Noller (jnoller) (Python committer) Date: 2009-11-20 17:49
Mark, the patch looks OK on OS X and Fedora Core 12. Nothing jumps out at
me as incorrect. I'm for committing and watching the BSD buildbot.
msg95561 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-20 19:42
Thanks, Jesse.  Applied in r76432 (trunk).  Unfortunately the buildbots 
are all purple right now;  once they're building again, and the FreeBSD 
buildbots have gone green (/me crosses fingers) I'll merge to py3k.
msg95573 - Author: David Bolen (db3l) Date: 2009-11-21 00:15
Looks like some sort of master-side global rebuild was initiated, but
without the proper SVN information.  But I see a rebuild on 7.2 with
this patch revision that looks like it worked (it still failed, but for
a different reason).

I'm not that familiar with the test harness, but would it be possible to
get test_multiprocessing to log an error when it has to be skipped (like
other tests that fail to find supporting modules and whatnot), so as to
highlight it in the log?  It might even tell someone running the tests
what to do to fix the behavior.

In any event though, I'm fine with enabling the support on the 7.2
buildbot (I'll stick it in loader.conf so I don't have to remember after
a reboot) if we're past any point of wanting to check how the test
behaves without them.  Or I guess the 6.4 buildbot can continue to serve
that purpose, right?
msg95581 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-21 14:00
> I'm not that familiar with the test harness, but would it be possible to
> get test_multiprocessing to log an error when it has to be skipped.

Well, there should be a skip message next to the test_multiprocessing line 
in the results.  I'm not sure whether that's the sort of thing you mean.
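
A skip with an explanatory message might look like the following sketch. It relies on `multiprocessing.synchronize` raising ImportError when no working sem_open is available; the helper `semaphore_skip_reason`, the test class, and the wording of the hint are hypothetical and are not taken from test_multiprocessing.

```python
import unittest


def semaphore_skip_reason():
    """Return None if semaphores are importable, else a hint for the log."""
    try:
        import multiprocessing.synchronize  # noqa: F401
    except ImportError as exc:
        return ("multiprocessing semaphores unavailable (%s); "
                "on FreeBSD, 'kldload sem' as root may enable them" % exc)
    return None


SKIP_REASON = semaphore_skip_reason()


# When SKIP_REASON is set, the runner reports the class as skipped together
# with the hint, instead of failing or silently passing.
@unittest.skipIf(SKIP_REASON is not None, SKIP_REASON or "")
class SemaphoreSmokeTest(unittest.TestCase):
    def test_acquire_release(self):
        import multiprocessing
        sem = multiprocessing.Semaphore(1)
        self.assertTrue(sem.acquire(False))  # non-blocking acquire succeeds
        sem.release()
```

Putting the remedy in the skip reason means it shows up in the test log next to the skipped test, which is roughly what was asked for above.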

Unfortunately it looks like both runs (6.4 and 7.2) were prematurely 
terminated by test_curses before they even got as far as 
test_multiprocessing.  I might try running them again and hope for a 
different random test ordering.

I'm also seeing warnings about HAVE_BROKEN_POSIX_SEMAPHORES being 
redefined, in Python/thread_pthread.h;  I'll take another look at this 
next week (no access to the FreeBSD machine at the moment).
msg95582 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-21 14:01
> In any event though, I'm fine with enabling the support on the 7.2
> buildbot (I'll stick it in loader.conf so I don't have to remember
> after a reboot)

That would be great---thank you!

> if we're past any point of wanting to check how the test
> behaves without them.

I'm not sure we are, just yet.  But soon... :)
msg95785 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-28 10:27
Hmm.  It seems that hijacking the existing HAVE_BROKEN_POSIX_SEMAPHORES 
wasn't a good idea.  I was surprised to find that OS X defines 
_POSIX_SEMAPHORES to -1, indicating a lack of POSIX semaphore support.
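
The compile-time macro itself is only visible from C, but a rough runtime analogue can be queried through `os.sysconf`. The sketch below illustrates that idea only; it is not what thread_pthread.h or configure checks, the function name is hypothetical, and the runtime answer need not match the header value.

```python
import os


def posix_semaphores_option():
    """Return sysconf's SC_SEMAPHORES value, or None if it cannot be queried."""
    name = "SC_SEMAPHORES"
    # os.sysconf_names lists the option names this platform knows about.
    if not hasattr(os, "sysconf") or name not in os.sysconf_names:
        return None
    try:
        return os.sysconf(name)
    except (OSError, ValueError):
        return None
```

A result of -1 would mean the option is reported as unsupported. A positive value is typically a POSIX revision number, but that interpretation should be checked against the platform's documentation.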
msg95786 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-28 10:45
Variable clash should be fixed in r76558.
msg95787 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2009-11-28 12:54
Merged to py3k, release31-maint in r76566, r76567.

David, I think we're ready to enable POSIX semaphore support on the 
FreeBSD 7.2 buildbot now, if you get the chance.
msg95799 - Author: David Bolen (db3l) Date: 2009-11-28 21:07
> David, I think we're ready to enable POSIX semaphore support on the 
> FreeBSD 7.2 buildbot now, if you get the chance.

Done.  I'll double check that the module remains loaded across restarts
when there's some idle time for a restart.
History
Date                 User            Action  Args
2009-11-28 21:07:20  db3l            set     messages: + msg95799
2009-11-28 12:54:25  mark.dickinson  set     status: open -> closed
                                             resolution: fixed
                                             messages: + msg95787
2009-11-28 10:45:03  mark.dickinson  set     messages: + msg95786
2009-11-28 10:27:31  mark.dickinson  set     messages: + msg95785
2009-11-21 14:01:37  mark.dickinson  set     messages: + msg95582
2009-11-21 14:00:06  mark.dickinson  set     assignee: mark.dickinson
                                             messages: + msg95581
2009-11-21 00:15:45  db3l            set     messages: + msg95573
2009-11-20 19:42:26  mark.dickinson  set     messages: + msg95561
2009-11-20 17:49:40  jnoller         set     messages: + msg95559
2009-11-20 17:11:47  jnoller         set     messages: + msg95558
2009-11-20 17:08:41  mark.dickinson  set     messages: + msg95557
2009-11-20 17:05:16  mark.dickinson  set     messages: + msg95556
2009-11-20 16:58:27  jnoller         set     messages: + msg95555
2009-11-20 16:18:53  mark.dickinson  set     stage: patch review
2009-11-20 16:18:38  mark.dickinson  set     files: + freebsd_multiprocessing.patch
                                             keywords: + patch
                                             messages: + msg95553
2009-11-20 14:23:03  jnoller         set     messages: + msg95548
2009-11-20 14:01:58  mark.dickinson  set     messages: + msg95545
2009-11-06 16:05:06  mark.dickinson  set     messages: + msg94979
2009-11-06 15:58:48  mark.dickinson  set     messages: + msg94978
2009-11-06 15:57:12  mark.dickinson  create