classification
Title: expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module
Type: enhancement Stage:
Components: Extension Modules, Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: exarkun Nosy List: benjamin.peterson, exarkun, giampaolo.rodola, gregory.p.smith, haypo, jcea, loewis, marcin.bachry, neologix, pitrou, python-dev, schmichael, skrah, smarnach, spiv
Priority: normal Keywords: needs review, patch

Created on 2010-04-15 05:00 by exarkun, last changed 2012-03-30 23:24 by smarnach. This issue is now closed.

Files
File name Uploaded Description Edit
signal_pthread_sigmask-2.patch haypo, 2011-04-20 00:49 review
signalfd-3.patch haypo, 2011-05-02 15:20 review
pending_signals-3.patch haypo, 2011-05-07 10:00 review
wakeup_signum.patch haypo, 2011-05-07 10:07 review
signalfd-4.patch haypo, 2011-05-08 00:31 review
sig_wakeup_race.diff neologix, 2011-05-24 16:15 patch for wakeup FD race review
sigwait_gil.diff neologix, 2011-06-09 17:17
test_sigwait.diff neologix, 2011-06-10 09:30
Messages (68)
msg103182 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-04-15 05:00
Linux offers the signalfd syscall since 2.6.22 (with minor changes afterwards).  This call allows signals to be handled as bytes read out of a file descriptor, rather than as interruptions to the flow of a program.  Quite usefully, this file descriptor can be select()'d on (or poll()'d, epoll()'d, etc) alongside other "normal" file descriptors.

In order to effectively use signalfd(), the signals in question must be blocked, though.  So it makes sense to expose sigprocmask(2) at the same time, in order to allow this blocking to be set up.
msg103183 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-04-15 05:07
I've started on an implementation of this in /branches/signalfd-issue8407.
msg103184 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-15 05:16
Notice that 2.7 has seen its first beta release, so no new features are allowed for it. I think it's better to target this feature for 3.2.
msg103210 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-15 13:52
I'm with Martin, better to target 3.2 IMO.
Does signalfd really bring something compared to set_wakeup_fd()?
The one big difference I can see is that set_wakeup_fd() doesn't transmit the signal number, but this could be fixed if desired (instead of sending '\0', send a byte containing the signal number).
msg103326 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-04-16 13:52
> The one big difference I can see is that set_wakeup_fd() doesn't transmit the signal number, but this could be fixed if desired (instead of sending '\0', send a byte containing the signal number).

There's a lot more information than the signal number available as well.  The signalfd_siginfo struct has 16 fields in it now and may have more in the future.

Another advantage is that this approach allows all asynchronous preemption to be disabled.  This side-steps the possibility of hitting any signal bugs in CPython itself.  Of course, any such bugs that are found should be fixed, but fixing them is often quite challenging and involves lots of complex domain-specific knowledge.  In comparison, the signalfd and sigprocmask extensions are quite straight-forward and self-contained.

It's also possible to have several signalfds, each with a different signal mask.  set_wakeup_fd is limited to a single fd per-process.

sigprocmask has other uses all by itself, of course, like delaying the delivery of signals while some sensitive code runs.
msg104725 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-05-01 15:22
One open question regarding interaction with threading.  sigprocmask's behavior in a multithreaded program is unspecified.  pthread_sigmask should be used instead.  I could either expose both of these and let the caller choose, or I could make signal.sigprocmask use pthread_sigmask if it's available, and fall back to sigprocmask.  I don't see any disadvantages of doing the latter, and the former seems like it would create an attractive nuisance.  However, I don't have much experience with sigprocmask; I'm only interested in it because of its use in combination with signalfd.
msg105099 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-05 22:27
> pthread_sigmask should be used instead.  I could either expose both of > these and let the caller choose, or I could make signal.sigprocmask use > pthread_sigmask if it's available, and fall back to sigprocmask.

Or perhaps you could disable the feature if pthread_sigmask isn't available. Apparently, FreeBSD and Mac OS X have it, as well as Linux.
msg105100 - (view) Author: Tres Seaver (tseaver) * Date: 2010-05-05 22:28
Trying pthread_sigmask first, and falling back, seems like the right strategy to me.
msg105123 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-05-06 03:55
I think this is ready for a first review.  See <http://codereview.appspot.com/1132041>.  If everyone agrees this is inappropriate for 2.7, then I'll port the changes to 3.x.  I don't expect there to be much difference in the 3.x version.
msg105134 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-06 10:45
> If everyone agrees this is inappropriate for 2.7

I think the decision is up to Benjamin.
msg105162 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-05-06 20:19
Let's leave it for 3.2.
msg128979 - (view) Author: Michael Schurter (schmichael) Date: 2011-02-21 18:48
Any hopes of getting this into Python 3.3?
msg133977 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-18 16:00
sigprocmask or (better) pthread_sigmask is required to fix #11859 bug.

---

Python has a test for "broken pthread_sigmask". Extract of configure.in:

      AC_MSG_CHECKING(if PTHREAD_SCOPE_SYSTEM is supported)
      AC_CACHE_VAL(ac_cv_pthread_system_supported,
      [AC_RUN_IFELSE([AC_LANG_SOURCE([[#include <pthread.h>
      void *foo(void *parm) {
        return NULL;
      }
      main() {
        pthread_attr_t attr;
        pthread_t id;
        if (pthread_attr_init(&attr)) exit(-1);
        if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) exit(-1);
        if (pthread_create(&id, &attr, foo, NULL)) exit(-1);
        exit(0);
      }]])],
      [ac_cv_pthread_system_supported=yes],
      [ac_cv_pthread_system_supported=no],
      [ac_cv_pthread_system_supported=no])
      ])
      AC_MSG_RESULT($ac_cv_pthread_system_supported)
      if test "$ac_cv_pthread_system_supported" = "yes"; then
        AC_DEFINE(PTHREAD_SYSTEM_SCHED_SUPPORTED, 1, [Defined if PTHREAD_SCOPE_SYSTEM supported.])
      fi
      AC_CHECK_FUNCS(pthread_sigmask,
        [case $ac_sys_system in
        CYGWIN*)
          AC_DEFINE(HAVE_BROKEN_PTHREAD_SIGMASK, 1,
            [Define if pthread_sigmask() does not work on your system.])
            ;;
        esac])

Extract of Python/thread_pthread.h:

/* On platforms that don't use standard POSIX threads pthread_sigmask()
 * isn't present.  DEC threads uses sigprocmask() instead as do most
 * other UNIX International compliant systems that don't have the full
 * pthread implementation.
 */
#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK)
#  define SET_THREAD_SIGMASK pthread_sigmask
#else
#  define SET_THREAD_SIGMASK sigprocmask
#endif

---

Because today more and more programs rely on threads, it is maybe not a good idea to provide a binding of sigprocmask(). I would prefer to only add pthread_sigmask() which has a determistic behaviour with threads. So only compile signal.pthread_sigmask() if pthread API is present and pthread_sigmask "is not broken".

---

About the patch: the doc should explain that the signal masks are inherited for child processes (fork() and execve()). I don't know if this behaviour is specific to Linux or not.

If we only use pthread_sigmask(), the doc is wrong: "Set the signal mask for the process." It's not for the process but only for the current thread.

How does it work if I change the signal mask of the main thread and then I create a new thread: the signal mask is inherited, or a default mask is used instead?

---

The new faulthandler uses a thread to implement a timeout: the thread uses pthread_sigmask() or sigprocmask() to ignore all signals. If I don't set the signal mask, some tests fail: check that a system call (like reading from a pipe) can be interrupted by signal. The problem is that signal may be send to the faulthandler thread, instead of the main thread. Hum, while I'm writing this, I realize that I should maybe not fallback to sigprocmask() because it may mask signals for the whole process (all threads)!
msg134111 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-20 00:22
signal_pthread_sigmask.patch:
 - add signal.pthread_sigmask() function with doc and tests
 - add SIG_BLOCK, SIG_UNBLOCK, SIG_SETMASK constants
 - fix #11859: fix tests of test_io using threads and an alarm: use pthread_sigmask() to ensure that only the main thread receives the SIGALRM signal

The code is based on the last version of python-signalfd:
https://code.launchpad.net/~exarkun/python-signalfd/trunk

Changes between python-signalfd and my patch:
 - rename "sigprocmask" function to "pthread_sigmask"
 - I added an unit test and the doc
 - catch PyIter_Next() error
 - set signum variable (the result of PyLong_AsLong) type to long (instead of int) and check its value (0 < signum < NSIG)
 - I adapted the code to my coding style :-)

I will work on a similar patch for signalfd() after the pthread_sigmask() patch is accepted.
msg134113 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-20 00:49
Oh, I forget to read again http://codereview.appspot.com/1132041.

Here is an updated patch reusing some tests and with a better documentation.
msg134861 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-30 13:22
New changeset 28b9702a83d1 by Victor Stinner in branch 'default':
Issue #8407, issue #11859: Add signal.pthread_sigmask() function to fetch
http://hg.python.org/cpython/rev/28b9702a83d1
msg134865 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-04-30 14:40
signalfd.patch: Add signal.signalfd(), signal.SFD_CLOEXEC and signal.SFD_NONLOCK.

The patch is based on http://codereview.appspot.com/1132041 and the last version (bzr) of python-signalfd.

The patch uses also PyModule_AddIntMacro() for the 3 constants added in my last (pthread_sigmask), and changes pthread_sigmask() to raise an OSError instead of a RuntimeError.

Note: python-signalfd has a bug: it doesn't pass the fd argument to signalfd(), it always pass -1.
msg134973 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-02 13:56
test_signal.PthreadSigmaskTests fails on Mac OS X.

http://www.python.org/dev/buildbot/all/builders/PPC Leopard 3.x/builds/1785/steps/test/logs/stdio
http://www.python.org/dev/buildbot/all/builders/PPC Tiger 3.x/builds/1748/steps/test/logs/stdio
---
[186/354] test_signal
make: *** [buildbottest] User defined signal 1
----

http://www.python.org/dev/buildbot/all/builders/x86 Tiger 3.x/builds/2429/steps/test/logs/stdio
---
Re-running test 'test_signal' in verbose mode
...
test_arguments (test.test_signal.PthreadSigmaskTests) ... ok
test_block_unlock (test.test_signal.PthreadSigmaskTests) ... make: *** [buildbottest] User defined signal 1
program finished with exit code 2
---
msg134978 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-02 14:59
Update signalfd patch.
msg134981 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-02 15:14
> Update signalfd patch.
> 
> ----------
> Added file: http://bugs.python.org/file21856/signalfd-2.patch

- In the tests, you don't need sys.exc_info(), just
"except XXXError as e".
- In the doc, you wrote "file description" instead of "file descriptor"
- In test_out_of_range_signal, you should use assertRaises
msg134982 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-02 15:20
Fixed: updated patch (version 3).
msg135011 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-03 00:08
> test_signal.PthreadSigmaskTests fails on Mac OS X.

The problem is that sometimes SIG_UNBLOCK does not immediatly call the pending signal, but it calls it "later". The problem is that I don't know exactly when. I tried to wait the pending signal using signal.pause(), but I got a bus error!?

Example of the problem:

pthread_sigmask(SIG_BLOCK, [SIGUSR1])
os.kill(os.getpid(), SIGUSR1)
pthread_sigmask(SIG_UNBLOCK, [SIGUSR1])
msg135029 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-03 12:11
New changeset d003ce770ba1 by Victor Stinner in branch 'default':
Issue #8407: Fix pthread_sigmask() tests on Mac OS X
http://hg.python.org/cpython/rev/d003ce770ba1
msg135032 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-03 12:57
New changeset c9207c6ce24a by Victor Stinner in branch 'default':
Issue #8407: pthread_sigmask() checks immediatly if signal handlers have been
http://hg.python.org/cpython/rev/c9207c6ce24a
msg135038 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-03 13:29
Since the commit c9207c6ce24a, test_signal fails on OpenIndiana:

http://www.python.org/dev/buildbot/all/builders/x86%20OpenIndiana%203.x/builds/1179
======================================================================
ERROR: test_block_unlock (test.test_signal.PthreadSigmaskTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 539, in test_block_unlock
    self.assertEqual(set(old_mask) ^ set(blocked), {signum})
  File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 501, in handler
    1/0
ZeroDivisionError: division by zero
msg135041 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-03 15:20
New changeset 96a532eaa2d1 by Victor Stinner in branch 'default':
Issue #8407: disable faulthandler timeout thread on all platforms
http://hg.python.org/cpython/rev/96a532eaa2d1
msg135087 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-03 23:53
> Since the commit c9207c6ce24a, test_signal fails on OpenIndiana:
> 
> http://www.python.org/dev/buildbot/all/builders/x86%20OpenIndiana%203.x/builds/1179
> ======================================================================
> ERROR: test_block_unlock (test.test_signal.PthreadSigmaskTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 539, in test_block_unlock
>     self.assertEqual(set(old_mask) ^ set(blocked), {signum})
>   File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 501, in handler
>     1/0
> ZeroDivisionError: division by zero

This is because a thread different than the main thread receives the
signal and calls the signal handler. Antoine found that "python -m
test_pydoc test_signal" is enough to reproduce the problem (on any OS,
or at least on Linux). test_pydoc loads a lot (all?) of Python modules
including _tkinter, and _tkinter (libtcl) creates a C thread which waits
events using select().

I see 3 solutions:

a) Use pthread_kill() to send the signal directly to the right thread
(the main thread)

b) Destroy _tkinter: Tcl_Finalize() exits the thread, but this function
is never called. _tkinter.c contains the following code:
#if 0
    /* This was not a good idea; through <Destroy> bindings,
       Tcl_Finalize() may invoke Python code but at that point the
       interpreter and thread state have already been destroyed! */
    Py_AtExit(Tcl_Finalize);
#endif

c) Skip the test if the _tkinter thread is present. Check if the
_tkinter module is loaded *should* be enough to check if the Tcl thread
is running. Unload the _tkinter module is not possible: modules written
in C cannot be unloaded. But it is possible to remove all references
from the Python object space, so check >'_tkinter' in sys.modules< is
maybe not reliable.

I don't know if some platforms have pthread_sigmask() but not
pthread_kill().

I have a patch to expose pthread_kill(), sigpending() and sigwait(). I
will publish it tomorrow for a review.

--

test_signal doesn't fail on all platforms. Possible reasons:

 - the platform doesn't have pthread_sigmask(), and so the test is
skipped
 - the libtcl version is different, a newer version masks maybe signals?
 - (unlikely!) os.kill() always sends the signal to the main thread
msg135112 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-04 10:40
> c) Skip the test if the _tkinter thread is present (...)

I opened issue #11998 for the problem with test_signal and the _tkinter module. To get back green buildbots, I commited a workaround:

New changeset 88dca05ed468 by Victor Stinner in branch 'default':
Issue #11998, issue #8407: workaround _tkinter issue in test_signal
http://hg.python.org/cpython/rev/88dca05ed468

> a) Use pthread_kill() to send the signal directly
> to the right thread (...)

I'm still working on this solution to test blocked signals even if _tkinter is loaded.
msg135118 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-04 11:20
New changeset a5890ff5e3d5 by Victor Stinner in branch 'default':
Issue #8407: signal.pthread_sigmask() returns a set instead of a list
http://hg.python.org/cpython/rev/a5890ff5e3d5
msg135122 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-04 11:49
pending_signals.patch: add pthread_kill(), sigpending() and sigwait() functions with doc and tests.

I added many "See also" in the doc, e.g. os.kill() gives a link to signal.pthread_kill().

Note: the patch renames also BasicSignalTests to PosixTests, it's not related to the other changes.
msg135220 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-05 16:26
Quick review at http://bugs.python.org/review/8407/show
msg135252 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-05 22:53
> http://bugs.python.org/review/8407/show

Updated patch (version 2).

Note: sigpending() doesn't return an error code but -1 on error.
msg135266 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-06 08:47
Oops.
Victor, my mouse got stuck and I mistakenly removed your pending_signals-2 patch. I'm really sorry about this, could you re-post it?

To try to make up for this, a small comment:
In signal_sigwait, at the end of the function,  you do this:

    /* call the signal handler (if any) */
    if (PyErr_CheckSignals())
        return NULL;

I'm not sure I get this: sigwait is used to handle signals synchronously, and in the POSIX semantic, it's incompatible with signal handlers:

"""
sigwait suspends the calling thread until one of the signals in set is delivered to the calling thread. It then stores the number of the signal received in the location pointed to by sig and returns. The signals in set must be blocked and not ignored on entrance to sigwait. If the delivered signal has a signal handler function attached, that function is *not* called.
"""

and
"""
The effect of sigwait() on the signal actions for the signals in set is unspecified.
"""

So, if anything, you shouldn't check for a pending signal.
msg135436 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:00
> I mistakenly removed your pending_signals-2 patch
> I'm really sorry about this, could you re-post it?

No problem, anyway I worked on a new version in the train.

> So, if anything, you shouldn't check for a pending signal [in sigwait]

Right, fixed in the new patch.

--

pending_signals-3.patch:
 - don't check for pending signals in sigwait()
 - pthread_kill() doc: it is not a good idea to say that pthread_kill() with signum=0 can be used to check if a thread identifier is valid => such test does crash (SIGSEGV) on my Linux box. I changed the doc to say that it can be used to check if a thread is still running (which is different).
 - add a dedicated test for sigpending()
 - doc: explain how to get a thread identifier for pthread_kill()
 - don't compile pthread_kill() without threads: you cannot get a valid thread identifier without the _thread module

I think that the patch is ready to be commited. Anyone for a last review? (antoine, neologix?)
msg135437 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:02
Note: we might expose pth_raise() which is similar to pthread_kill(), but... pth support was deprecated by the PEP 11 and pth support will be removed from Python 3.3 source code.
msg135438 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:05
About signalfd(): this patch doesn't provide any information or tool to decode data written to the file descriptor. We should expose the signalfd_siginfo structure or you cannot handle more than one signal (how do you know which signal numbers were raised?). Example with ctypes:

class signalfd_siginfo(Structure):
    _fields_ = (
        ('ssi_signo', c_uint32),    # Signal number
        ('ssi_errno', c_int32),     # Error number (unused)
        ('ssi_code', c_int32),      # Signal code
        ('ssi_pid', c_uint32),      # PID of sender
        ('ssi_uid', c_uint32),      # Real UID of sender
        ('ssi_fd', c_int32),        # File descriptor (SIGIO)
        ('ssi_tid', c_uint32),      # Kernel timer ID (POSIX timers)
        ('ssi_band', c_uint32),     # Band event (SIGIO)
        ('ssi_overrun', c_uint32),  # POSIX timer overrun count
        ('ssi_trapno', c_uint32),   # Trap number that caused signal
        ('ssi_status', c_int32),    # Exit status or signal (SIGCHLD)
        ('ssi_int', c_int32),       # Integer sent by sigqueue(2)
        ('ssi_ptr', c_uint64),      # Pointer sent by sigqueue(2)
        ('ssi_utime', c_uint64),    # User CPU time consumed (SIGCHLD)
        ('ssi_stime', c_uint64),    # System CPU time consumed (SIGCHLD)
        ('ssi_addr', c_uint64),     # Address that generated signal
                                    # (for hardware-generated signals)
        ('_padding', c_char * 46),  # Pad size to 128 bytes (allow for
                                    # additional fields in the future)
    )
msg135439 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:07
wakeup_signum.patch: simple patch to write the signal number (as a single byte) instead of just b'\x00' into the wake up file descriptor. It gives the ability to watch more than one signal and be able to know which one was raised. Included tests demonstrate the feature. The doc explains how to decode data written to the file descriptor.
msg135440 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:19
pending_signals-3.patch: doc nit, the link to Thread.ident doesn't work. The doc should be replaced by something like:
              
   *thread_id* can be read from the :attr:`~threading.Thread.ident` attribute   
   of a :class:`threading.Thread` object.  For example,                         
   ``threading.current_thread().ident`` gives the identifier of the current     
   thread.

... but ident or Thread are not link, I don't know why.
msg135442 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-07 10:26
The threading has a function to get directly the identifier of the current thread: threading._get_ident() instead of threading.current_thread().ident. I think that threading._get_ident() is more reliable to threading.current_thread().ident because Thread.ident can be None in some cases. I created the issue #12028 to decide what to do with this function.
msg135499 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-07 19:38
> pending_signals-3.patch:
>  - don't check for pending signals in sigwait()
>  - pthread_kill() doc: it is not a good idea to say that pthread_kill() with signum=0 can be used to check if a thread identifier is valid => such test does crash (SIGSEGV) on my Linux box. I changed the doc to say that it can be used to check if a thread is still running (which is different).
>  - add a dedicated test for sigpending()
>  - doc: explain how to get a thread identifier for pthread_kill()
>  - don't compile pthread_kill() without threads: you cannot get a valid thread identifier without the _thread module
> 
> I think that the patch is ready to be commited. Anyone for a last
> review? (antoine, neologix?)

It looks good to me.
It's very nice to have all these extra functions :)
msg135509 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-07 23:48
New changeset 1d8a57deddc4 by Victor Stinner in branch 'default':
Issue #8407: Add pthread_kill(), sigpending() and sigwait() functions to the
http://hg.python.org/cpython/rev/1d8a57deddc4
msg135511 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-08 00:03
New changeset f8c49a930015 by Victor Stinner in branch 'default':
Issue #8407: The signal handler writes the signal number as a single byte
http://hg.python.org/cpython/rev/f8c49a930015
msg135512 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-08 00:10
New changeset e3cb2c99a5a9 by Victor Stinner in branch 'default':
Issue #8407: Remove debug code from test_signal
http://hg.python.org/cpython/rev/e3cb2c99a5a9
msg135514 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-08 00:31
Update the signalfd patch (version 4) against the default branch. Specify the minimum Linux version in signalfd() doc. The patch still lacks a structure to parse the bytes written into the file (see msg135438 for a ctypes example): a struct sequence should be fine.
msg135539 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-08 19:50
> New changeset f8c49a930015 by Victor Stinner in branch 'default':
> Issue #8407: The signal handler writes the signal number as a single byte

Wakeup test using two pending signals fails on FreeBSD 6.4 buildbot:

======================================================================
FAIL: test_signum (test.test_signal.WakeupSignalTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 266, in test_signum
    self.check_signum(signal.SIGUSR1, signal.SIGALRM)
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 232, in check_signum
    self.assertSequenceEqual(raised, signals)
AssertionError: Sequences differ: (14,) != (30, 14)

Wakeup file only contains SIGALRM, not (SIGUSR1, SIGALRM), whereas SIGUSR1 is raised before SIGALRM.

Code of the test:
                                                                                
    def check_signum(self, *signals):                                           
        data = os.read(self.read, len(signals)+1)                               
        raised = struct.unpack('%uB' % len(data), data)                         
        self.assertSequenceEqual(raised, signals)

    def test_signum(self):                                                      
        old_handler = signal.signal(signal.SIGUSR1, lambda x,y:None)            
        self.addCleanup(signal.signal, signal.SIGUSR1, old_handler)             
        os.kill(os.getpid(), signal.SIGUSR1)                                    
        os.kill(os.getpid(), signal.SIGALRM)                                    
        self.check_signum(signal.SIGUSR1, signal.SIGALRM)
msg135580 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-05-09 12:26
There are warnings on the FreeBSD and OSX buildbots, where pthread_t
is not a long.

http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%203.x/builds/237/steps/compile/logs/warnings%20%283%29

http://www.python.org/dev/buildbot/all/builders/AMD64%20Leopard%203.x/builds/1336/steps/compile/logs/warnings%20%2828%29
msg135581 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-09 12:46
New changeset 2e0d3092249b by Victor Stinner in branch 'default':
Issue #8407: Use an explicit cast for FreeBSD
http://hg.python.org/cpython/rev/2e0d3092249b
msg136761 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-24 16:15
> New changeset f8c49a930015 by Victor Stinner in branch 'default':
> Issue #8407: The signal handler writes the signal number as a single byte
> http://hg.python.org/cpython/rev/f8c49a930015

There's a race.
If a signal is received while is_tripped is set, the signal number won't be written to the wakeup FD.
Patch attached.
msg136816 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-25 00:38
New changeset 234021dcad93 by Victor Stinner in branch 'default':
Issue #8407: Fix the signal handler of the signal module: if it is called
http://hg.python.org/cpython/rev/234021dcad93
msg136817 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-25 00:42
> There's a race. If a signal is received while is_tripped is set, 
> the signal number won't be written to the wakeup FD.

Oh, nice catch. The "bug" is not new, Python behaves like that since Python 3.1. But in Python < 3.3, it doesn't mater because I don't think that wakeup was used to watch more than one signal. One trigger "something happened" was enough.

The wakeup fd now contains the number of each signal, and so the behaviour has to change. I applied your patch and I added a test.
msg137071 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-27 16:05
Oooooh, sigwait() doesn't accept a timeout! I would be nice to have also sigwaitinfo().. and maybe also its friend, sigwaitinfo() (if we implement the former, it's trivial to implement the latter). Python 3.3 adds optional timeout to subprocess.wait(), subprocess.communicate(), threading.Lock.acquire(), etc. And I love timeout! It was really useful to implement the thread of faulthandler.dump_tracebacks_later().
msg137382 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-31 18:48
> The wakeup fd now contains the number of each signal, and so the behaviour has
> to change. I applied your patch and I added a test.

Interesting. I suspected this would have an impact on the test_signal
failure on the FreeBSD 6.4 buidbot:

"""
======================================================================
FAIL: test_signum (test.test_signal.WakeupSignalTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
line 272, in test_signum
    self.check_signum(signal.SIGUSR1, signal.SIGALRM)
  File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
line 238, in check_signum
    self.assertEqual(raised, signals)
AssertionError: Tuples differ: (14, 30) != (30, 14)

First differing element 0:
14
30

- (14, 30)
+ (30, 14)
"""

This means that the signals are not delivered in order.
Normally, pending signals are checked upon return to user-space, so
trip_signal should be called when the kill syscall returns, so signal
numbers should be written in order to the wakeup FD (and here it looks
like the lowest-numbered signal is delivered first).
You could try adding a short sleep before the second kill (or just
pass unordered=True to check_signum, but in that case we don't check
the correct ordering).
msg137387 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-31 20:31
New changeset 29e08a98281d by Victor Stinner in branch 'default':
Issue #8407: test_signal doesn't check signal delivery order
http://hg.python.org/cpython/rev/29e08a98281d
msg137407 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-01 00:27
Cool, "x86 FreeBSD 6.4 3.x" is green for the first time since a long time, thanks to my commit 29e08a98281d (test_signal doesn't check signal delivery order).
msg137945 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-09 10:33
> Oooooh, sigwait() doesn't accept a timeout! I would be nice to have
> also sigwaitinfo().. and maybe also its friend, sigwaitinfo() 

Oops, I mean sigtimedwait() and sigwaitinfo().
msg138017 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-06-09 17:17
I just noticed something "funny": signal_sigwait doesn't release the
GIL, which means that it's pretty much useless :-)
Patch attached.
Also, test_sigwait doesn't block the signal before calling sigwait: it
happens to work because there's only one thread, but it's undefined
behaviour.
msg138018 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-06-09 17:20
This whole thread is becoming quite confusing.
It would be better to open a separate issue for any bug or feature request which is not related to "exposing signalfd(2) and pthread_sigmask".
msg138036 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-09 23:05
> This whole thread is becoming quite confusing.

You are complelty right, sorry to pollute this issue. Changes related to this issue:

 - expose pthread_sigmask(), pthread_kill(), sigpending(), sigwait()
 - wakeup fd now contains the file descriptor

I created issue #12303 for sigwaitinfo() and sigtimedwait(), and issue #12304 for signalfd.
msg138039 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-09 23:40
New changeset a5c8b6ebe895 by Victor Stinner in branch 'default':
Issue #8407: signal.sigwait() releases the GIL
http://hg.python.org/cpython/rev/a5c8b6ebe895
msg138042 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-09 23:54
> I just noticed something "funny": signal_sigwait doesn't release
> the GIL, which means that it's pretty much useless :-)

sigwait() is not useless: the signal is not necessary send by another thread. The signal can be send by alarm(), from a child process, by an user, by the kernel.

Releasing the GIL is a new feature. Because it is cheap and pause() does also release the GIL, I commited your patch.

> Also, test_sigwait doesn't block the signal before calling sigwait:
> it happens to work because there's only one thread, but it's
> undefined behaviour.

test_sigwait() test pass on all 3.x buildbots (some don't have sigwait(), the test is skipped). Sometimes, test_sigwait() is executed with 2 threads, the main thread and Tkinter event loop thread, and the signal is not blocked in any thread.

On Linux, it works well with more than one thread. I added a test using a thread, we will see if it works on buildbots. The signal is raised by a thread (whereas the signal is not blocked in any thread).

I wrote a can_test_blocked_signals() function in test_signal which has hardcoded tests to check for some known C threads: the faulthandler timeout thread and for the Tkinter event loop thread. can_test_blocked_signals() returns True if pthread_kill() is available.

I don't know how it works if a thread uses pthread_kill() to raise a signal into the main thread (waiting in sigwait()), whereas the signal is not blocked in any thread.

If it is not possible to get the list of all C/Python threads and/or block a signal in all threads, we can use a subprocess without threads (with only the main thread).

Would you like to work on a patch to avoid the undefined behaviour?
msg138057 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-10 08:56
> On Linux, it works well with more than one thread.
> I added a test using a thread, we will see if it works
> on buildbots.

The test hangs on FreeBSD 8.2:

[235/356] test_signal
Timeout (1:00:00)!
Thread 0x0000000800e041c0:
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_signal.py", line 625 in test_sigwait_thread
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 407 in _executeTestPart
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 462 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 514 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1166 in run
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1254 in _run_suite
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1280 in run_unittest
  File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_signal.py", line 687 in test_main
  File "./Lib/test/regrtest.py", line 1043 in runtest_inner
  File "./Lib/test/regrtest.py", line 841 in runtest
  File "./Lib/test/regrtest.py", line 668 in main
  File "./Lib/test/regrtest.py", line 1618 in <module>
*** Error code 1

http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%203.x/builds/519/steps/test/logs/stdio

(The test has not run yet on FreeBSD 6.4 buildbot)
msg138061 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-10 09:12
> The test hangs on FreeBSD 8.2: (...)

See also #12310 which may be related.
msg138066 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-06-10 09:30
>> I just noticed something "funny": signal_sigwait doesn't release
>> the GIL, which means that it's pretty much useless :-)
>
> sigwait() is not useless: the signal is not necessary send by another thread. The signal can be send by alarm(), from a child process, by an user, by the kernel.
>

If it doesn't release the GIL, it is useless.
The common usage pattern for sigwait() is to create a dedicated thread
for signals management. If this thread calls sigwait without releasing
the GIL, the whole process is suspended, which is definitely not what
you want.

>> Also, test_sigwait doesn't block the signal before calling sigwait:
>> it happens to work because there's only one thread, but it's
>> undefined behaviour.
>
> test_sigwait() test pass on all 3.x buildbots (some don't have sigwait(), the test is skipped). Sometimes, test_sigwait() is executed with 2 threads, the main thread and Tkinter event loop thread, and the signal is not blocked in any thread.
>

It's mere luck:

    def test_sigwait(self):
        old_handler = signal.signal(signal.SIGALRM, self.handler)
        self.addCleanup(signal.signal, signal.SIGALRM, old_handler)

        signal.alarm(1)
        self.assertEqual(signal.sigwait([signal.SIGALRM]), signal.SIGALRM)

Comment out the first two lines that change the signal handler, and
the test will be killed by SIGALRM's default handler ("Alarm clock").
I tested on Linux, and if a the signal isn't blocked before calling sigwait:
- if a custom signal handler is installed, then it is not called
- if the default signal handler is in place, then it's called (and
with SIGALRM the process gets killed)
This is a typical example of what POSIX calls "undefined behavior".
If pthread_sigmask is called before sigwait, then it works as expected.

If we really wanted to test this properly, the way to go would be to
fork() a child process (that way we're sure there's only one thread),
have it block and sigwait() SIGUSR1 without touching the handler, send
it SIGUSR1, and check its return code.
Patch attached.
msg138072 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-10 10:49
New changeset a17710e27ea2 by Victor Stinner in branch 'default':
Issue #8407: Make signal.sigwait() tests more reliable
http://hg.python.org/cpython/rev/a17710e27ea2
msg138073 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-10 10:59
neologix> Patch attached.

I wrote a different patch based on your patch. The main change is that I chose to keep signal.alarm(1) instead of sending the signal from the parent process to the child process, because I don't think that time.sleep(0.5) is a reliable synchronization code to wait until the child process is waiting in sigwait().

test_sigwait_thread() uses time.sleep(1) to wait until the main thread is waiting in sigwait(), but it is not mandatory because the signal is already blocked in the thread. I wrote test_sigwait_thread() to ensure that sigwait() releases the GIL, not to check that if a thread raises a signal while sigwait() is waiting, sigwait() does correctly receive it.

We may use time.sleep() in test_sigwait() if the signal is blocked in the parent process before the fork() (and unblocked in the parent process after the fork, but before sending the signal to the child process), but I think that signal.alarm() is more reliable and simple.

--

test_sigwait(_thread) was the last known bug of this issue, so let's close it. Reopen it if you see other bugs in pthread_sigmask(), pthread_kill(), sigpending() and sigwait() functions, or maybe better: open new issues because this issue has a too long history! See issue #12303 for sigwaitinfo() and sigtimedwait(), and issue #12304 for signalfd.
msg138076 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-10 11:53
New changeset 37a87b709403 by Victor Stinner in branch 'default':
Issue #8407: write error message on sigwait test failure
http://hg.python.org/cpython/rev/37a87b709403
msg138077 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-10 12:02
New changeset 60b1ab4d0cd4 by Victor Stinner in branch 'default':
Issue #8407: skip sigwait() tests if pthread_sigmask() is missing
http://hg.python.org/cpython/rev/60b1ab4d0cd4
msg157157 - (view) Author: Sven Marnach (smarnach) Date: 2012-03-30 23:24
The documentation has been left in a confusing state by this patch -- at least confusing to me.  I've created issue14456 with further details.
History
Date User Action Args
2012-03-30 23:24:33smarnachsetnosy: + smarnach
messages: + msg157157
2011-06-10 16:06:18tseaversetnosy: - tseaver
2011-06-10 12:02:40python-devsetmessages: + msg138077
2011-06-10 11:53:46python-devsetmessages: + msg138076
2011-06-10 10:59:18hayposetstatus: open -> closed
resolution: fixed
messages: + msg138073
2011-06-10 10:49:51python-devsetmessages: + msg138072
2011-06-10 09:30:10neologixsetfiles: + test_sigwait.diff

messages: + msg138066
2011-06-10 09:12:23hayposetmessages: + msg138061
2011-06-10 08:56:55hayposetmessages: + msg138057
2011-06-09 23:54:23hayposetmessages: + msg138042
2011-06-09 23:40:15python-devsetmessages: + msg138039
2011-06-09 23:05:39hayposetmessages: + msg138036
title: expose signalfd(2) and pthread_sigmask in the signal module -> expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module
2011-06-09 17:20:11giampaolo.rodolasetmessages: + msg138018
2011-06-09 17:17:20neologixsetfiles: + sigwait_gil.diff

messages: + msg138017
2011-06-09 10:33:42hayposetmessages: + msg137945
2011-06-08 19:02:34giampaolo.rodolasetnosy: + giampaolo.rodola
2011-06-01 00:27:33hayposetmessages: + msg137407
2011-05-31 20:31:47python-devsetmessages: + msg137387
2011-05-31 18:48:54neologixsetmessages: + msg137382
2011-05-27 16:05:20hayposetmessages: + msg137071
2011-05-25 00:42:10hayposetmessages: + msg136817
2011-05-25 00:38:29python-devsetmessages: + msg136816
2011-05-24 16:15:33neologixsetfiles: + sig_wakeup_race.diff

messages: + msg136761
2011-05-09 12:46:19python-devsetmessages: + msg135581
2011-05-09 12:26:17skrahsetnosy: + skrah
messages: + msg135580
2011-05-09 02:30:41jceasetnosy: + jcea
2011-05-08 19:50:43hayposetmessages: + msg135539
2011-05-08 00:31:32hayposetfiles: + signalfd-4.patch

messages: + msg135514
2011-05-08 00:10:49python-devsetmessages: + msg135512
2011-05-08 00:03:44python-devsetmessages: + msg135511
2011-05-07 23:48:30python-devsetmessages: + msg135509
2011-05-07 19:38:06pitrousetmessages: + msg135499
2011-05-07 10:26:33hayposetmessages: + msg135442
2011-05-07 10:19:11hayposetmessages: + msg135440
2011-05-07 10:11:04hayposetfiles: - pending_signals.patch
2011-05-07 10:07:17hayposetfiles: + wakeup_signum.patch

messages: + msg135439
2011-05-07 10:05:23hayposetmessages: + msg135438
2011-05-07 10:02:13hayposetmessages: + msg135437
2011-05-07 10:00:59hayposetfiles: + pending_signals-3.patch

messages: + msg135436
2011-05-06 08:47:04neologixsetnosy: + neologix
messages: + msg135266
2011-05-06 08:39:44neologixsetfiles: - pending_signals-2.patch
2011-05-05 22:53:02hayposetfiles: + pending_signals-2.patch

messages: + msg135252
2011-05-05 16:26:26pitrousetmessages: + msg135220
2011-05-04 11:49:27hayposetfiles: + pending_signals.patch

messages: + msg135122
2011-05-04 11:20:45python-devsetmessages: + msg135118
2011-05-04 10:40:42hayposetmessages: + msg135112
2011-05-03 23:53:06hayposetmessages: + msg135087
2011-05-03 15:20:59python-devsetmessages: + msg135041
2011-05-03 13:29:30hayposetmessages: + msg135038
2011-05-03 12:57:19python-devsetmessages: + msg135032
2011-05-03 12:11:51python-devsetmessages: + msg135029
2011-05-03 00:08:33hayposetmessages: + msg135011
2011-05-02 15:21:05hayposetfiles: - signalfd-2.patch
2011-05-02 15:20:58hayposetfiles: + signalfd-3.patch

messages: + msg134982
2011-05-02 15:14:12pitrousetmessages: + msg134981
2011-05-02 15:00:01hayposetfiles: - signalfd.patch
2011-05-02 14:59:56hayposetfiles: + signalfd-2.patch

messages: + msg134978
2011-05-02 13:56:56hayposetmessages: + msg134973
2011-05-01 22:47:51haypounlinkissue11859 dependencies
2011-04-30 14:40:54hayposetfiles: + signalfd.patch

messages: + msg134865
2011-04-30 13:22:08python-devsetnosy: + python-dev
messages: + msg134861
2011-04-22 22:29:46hayposettitle: expose signalfd(2) and sigprocmask(2) in the signal module -> expose signalfd(2) and pthread_sigmask in the signal module
2011-04-20 21:42:25hayposetfiles: - signal_pthread_sigmask.patch
2011-04-20 00:49:30hayposetfiles: + signal_pthread_sigmask-2.patch

messages: + msg134113
2011-04-20 00:22:56haypolinkissue11859 dependencies
2011-04-20 00:22:30hayposetfiles: + signal_pthread_sigmask.patch
keywords: + patch
messages: + msg134111
2011-04-18 16:00:15hayposetmessages: + msg133977
2011-04-18 15:43:46hayposetnosy: + haypo
2011-02-21 18:48:47schmichaelsetnosy: + schmichael

messages: + msg128979
versions: + Python 3.3, - Python 2.7
2010-05-06 20:19:54benjamin.petersonsetmessages: + msg105162
2010-05-06 10:45:49pitrousetnosy: + benjamin.peterson
messages: + msg105134
2010-05-06 03:55:50exarkunsetkeywords: + needs review

messages: + msg105123
2010-05-05 22:28:50tseaversetnosy: + tseaver
messages: + msg105100
2010-05-05 22:27:42pitrousetmessages: + msg105099
2010-05-01 15:36:03pitrousetnosy: + gregory.p.smith, - gps
2010-05-01 15:22:24exarkunsetnosy: + gps
2010-05-01 15:22:04exarkunsetmessages: + msg104725
2010-04-16 13:52:31exarkunsetmessages: + msg103326
2010-04-15 13:52:50pitrousetnosy: + pitrou
messages: + msg103210
2010-04-15 13:50:12spivsetnosy: + spiv
2010-04-15 06:34:02marcin.bachrysetnosy: + marcin.bachry
2010-04-15 05:16:52loewissetnosy: + loewis
messages: + msg103184
2010-04-15 05:07:54exarkunsetmessages: + msg103183
2010-04-15 05:00:36exarkuncreate