Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module #52654

Closed
exarkun mannequin opened this issue Apr 15, 2010 · 68 comments
Closed
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@exarkun
Copy link
Mannequin

exarkun mannequin commented Apr 15, 2010

BPO 8407
Nosy @loewis, @gpshead, @jcea, @pitrou, @vstinner, @giampaolo, @benjaminp, @skrah, @smarnach
Files
  • signal_pthread_sigmask-2.patch
  • signalfd-3.patch
  • pending_signals-3.patch
  • wakeup_signum.patch
  • signalfd-4.patch
  • sig_wakeup_race.diff: patch for wakeup FD race
  • sigwait_gil.diff
  • test_sigwait.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-06-10.10:59:18.072>
    created_at = <Date 2010-04-15.05:00:36.517>
    labels = ['extension-modules', 'type-feature', 'library']
    title = 'expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module'
    updated_at = <Date 2012-03-30.23:24:33.524>
    user = 'https://bugs.python.org/exarkun'

    bugs.python.org fields:

    activity = <Date 2012-03-30.23:24:33.524>
    actor = 'smarnach'
    assignee = 'exarkun'
    closed = True
    closed_date = <Date 2011-06-10.10:59:18.072>
    closer = 'vstinner'
    components = ['Extension Modules', 'Library (Lib)']
    creation = <Date 2010-04-15.05:00:36.517>
    creator = 'exarkun'
    dependencies = []
    files = ['21736', '21857', '21921', '21922', '21930', '22093', '22300', '22312']
    hgrepos = []
    issue_num = 8407
    keywords = ['patch', 'needs review']
    message_count = 68.0
    messages = ['103182', '103183', '103184', '103210', '103326', '104725', '105099', '105100', '105123', '105134', '105162', '128979', '133977', '134111', '134113', '134861', '134865', '134973', '134978', '134981', '134982', '135011', '135029', '135032', '135038', '135041', '135087', '135112', '135118', '135122', '135220', '135252', '135266', '135436', '135437', '135438', '135439', '135440', '135442', '135499', '135509', '135511', '135512', '135514', '135539', '135580', '135581', '136761', '136816', '136817', '137071', '137382', '137387', '137407', '137945', '138017', '138018', '138036', '138039', '138042', '138057', '138061', '138066', '138072', '138073', '138076', '138077', '157157']
    nosy_count = 15.0
    nosy_names = ['loewis', 'gregory.p.smith', 'jcea', 'spiv', 'exarkun', 'pitrou', 'vstinner', 'giampaolo.rodola', 'benjamin.peterson', 'marcin.bachry', 'schmichael', 'skrah', 'neologix', 'python-dev', 'smarnach']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue8407'
    versions = ['Python 3.3']

    @exarkun
    Copy link
    Mannequin Author

    exarkun mannequin commented Apr 15, 2010

    Linux offers the signalfd syscall since 2.6.22 (with minor changes afterwards). This call allows signals to be handled as bytes read out of a file descriptor, rather than as interruptions to the flow of a program. Quite usefully, this file descriptor can be select()'d on (or poll()'d, epoll()'d, etc) alongside other "normal" file descriptors.

    In order to effectively use signalfd(), the signals in question must be blocked, though. So it makes sense to expose sigprocmask(2) at the same time, in order to allow this blocking to be set up.

    @exarkun exarkun mannequin self-assigned this Apr 15, 2010
    @exarkun exarkun mannequin added extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Apr 15, 2010
    @exarkun
    Copy link
    Mannequin Author

    exarkun mannequin commented Apr 15, 2010

    I've started on an implementation of this in /branches/signalfd-issue8407.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Apr 15, 2010

    Notice that 2.7 has seen its first beta release, so no new features are allowed for it. I think it's better to target this feature for 3.2.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 15, 2010

    I'm with Martin, better to target 3.2 IMO.
    Does signalfd really bring something compared to set_wakeup_fd()?
    The one big difference I can see is that set_wakeup_fd() doesn't transmit the signal number, but this could be fixed if desired (instead of sending '\0', send a byte containing the signal number).

    @exarkun
    Copy link
    Mannequin Author

    exarkun mannequin commented Apr 16, 2010

    The one big difference I can see is that set_wakeup_fd() doesn't transmit the signal number, but this could be fixed if desired (instead of sending '\0', send a byte containing the signal number).

    There's a lot more information than the signal number available as well. The signalfd_siginfo struct has 16 fields in it now and may have more in the future.

    Another advantage is that this approach allows all asynchronous preemption to be disabled. This side-steps the possibility of hitting any signal bugs in CPython itself. Of course, any such bugs that are found should be fixed, but fixing them is often quite challenging and involves lots of complex domain-specific knowledge. In comparison, the signalfd and sigprocmask extensions are quite straight-forward and self-contained.

    It's also possible to have several signalfds, each with a different signal mask. set_wakeup_fd is limited to a single fd per-process.

    sigprocmask has other uses all by itself, of course, like delaying the delivery of signals while some sensitive code runs.

    @exarkun
    Copy link
    Mannequin Author

    exarkun mannequin commented May 1, 2010

    One open question regarding interaction with threading. sigprocmask's behavior in a multithreaded program is unspecified. pthread_sigmask should be used instead. I could either expose both of these and let the caller choose, or I could make signal.sigprocmask use pthread_sigmask if it's available, and fall back to sigprocmask. I don't see any disadvantages of doing the latter, and the former seems like it would create an attractive nuisance. However, I don't have much experience with sigprocmask; I'm only interested in it because of its use in combination with signalfd.

    @pitrou
    Copy link
    Member

    pitrou commented May 5, 2010

    pthread_sigmask should be used instead. I could either expose both of > these and let the caller choose, or I could make signal.sigprocmask use > pthread_sigmask if it's available, and fall back to sigprocmask.

    Or perhaps you could disable the feature if pthread_sigmask isn't available. Apparently, FreeBSD and Mac OS X have it, as well as Linux.

    @tseaver
    Copy link

    tseaver commented May 5, 2010

    Trying pthread_sigmask first, and falling back, seems like the right strategy to me.

    @exarkun
    Copy link
    Mannequin Author

    exarkun mannequin commented May 6, 2010

    I think this is ready for a first review. See <http://codereview.appspot.com/1132041\>. If everyone agrees this is inappropriate for 2.7, then I'll port the changes to 3.x. I don't expect there to be much difference in the 3.x version.

    @pitrou
    Copy link
    Member

    pitrou commented May 6, 2010

    If everyone agrees this is inappropriate for 2.7

    I think the decision is up to Benjamin.

    @benjaminp
    Copy link
    Contributor

    Let's leave it for 3.2.

    @schmichael
    Copy link
    Mannequin

    schmichael mannequin commented Feb 21, 2011

    Any hopes of getting this into Python 3.3?

    @vstinner
    Copy link
    Member

    sigprocmask or (better) pthread_sigmask is required to fix bpo-11859 bug.

    ---

    Python has a test for "broken pthread_sigmask". Extract of configure.in:

          AC_MSG_CHECKING(if PTHREAD_SCOPE_SYSTEM is supported)
          AC_CACHE_VAL(ac_cv_pthread_system_supported,
          [AC_RUN_IFELSE([AC_LANG_SOURCE([[#include <pthread.h>
          void *foo(void *parm) {
            return NULL;
          }
          main() {
            pthread_attr_t attr;
            pthread_t id;
            if (pthread_attr_init(&attr)) exit(-1);
            if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM)) exit(-1);
            if (pthread_create(&id, &attr, foo, NULL)) exit(-1);
            exit(0);
          }]])],
          [ac_cv_pthread_system_supported=yes],
          [ac_cv_pthread_system_supported=no],
          [ac_cv_pthread_system_supported=no])
          ])
          AC_MSG_RESULT($ac_cv_pthread_system_supported)
          if test "$ac_cv_pthread_system_supported" = "yes"; then
            AC_DEFINE(PTHREAD_SYSTEM_SCHED_SUPPORTED, 1, [Defined if PTHREAD_SCOPE_SYSTEM supported.])
          fi
          AC_CHECK_FUNCS(pthread_sigmask,
            [case $ac_sys_system in
            CYGWIN*)
              AC_DEFINE(HAVE_BROKEN_PTHREAD_SIGMASK, 1,
                [Define if pthread_sigmask() does not work on your system.])
                ;;
            esac])

    Extract of Python/thread_pthread.h:

    /* On platforms that don't use standard POSIX threads pthread_sigmask()
     * isn't present.  DEC threads uses sigprocmask() instead as do most
     * other UNIX International compliant systems that don't have the full
     * pthread implementation.
     */
    #if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK)
    #  define SET_THREAD_SIGMASK pthread_sigmask
    #else
    #  define SET_THREAD_SIGMASK sigprocmask
    #endif

    Because today more and more programs rely on threads, it is maybe not a good idea to provide a binding of sigprocmask(). I would prefer to only add pthread_sigmask() which has a determistic behaviour with threads. So only compile signal.pthread_sigmask() if pthread API is present and pthread_sigmask "is not broken".

    ---

    About the patch: the doc should explain that the signal masks are inherited for child processes (fork() and execve()). I don't know if this behaviour is specific to Linux or not.

    If we only use pthread_sigmask(), the doc is wrong: "Set the signal mask for the process." It's not for the process but only for the current thread.

    How does it work if I change the signal mask of the main thread and then I create a new thread: the signal mask is inherited, or a default mask is used instead?

    ---

    The new faulthandler uses a thread to implement a timeout: the thread uses pthread_sigmask() or sigprocmask() to ignore all signals. If I don't set the signal mask, some tests fail: check that a system call (like reading from a pipe) can be interrupted by signal. The problem is that signal may be send to the faulthandler thread, instead of the main thread. Hum, while I'm writing this, I realize that I should maybe not fallback to sigprocmask() because it may mask signals for the whole process (all threads)!

    @vstinner
    Copy link
    Member

    signal_pthread_sigmask.patch:

    • add signal.pthread_sigmask() function with doc and tests
    • add SIG_BLOCK, SIG_UNBLOCK, SIG_SETMASK constants
    • fix bpo-11859: fix tests of test_io using threads and an alarm: use pthread_sigmask() to ensure that only the main thread receives the SIGALRM signal

    The code is based on the last version of python-signalfd:
    https://code.launchpad.net/~exarkun/python-signalfd/trunk

    Changes between python-signalfd and my patch:

    • rename "sigprocmask" function to "pthread_sigmask"
    • I added an unit test and the doc
    • catch PyIter_Next() error
    • set signum variable (the result of PyLong_AsLong) type to long (instead of int) and check its value (0 < signum < NSIG)
    • I adapted the code to my coding style :-)

    I will work on a similar patch for signalfd() after the pthread_sigmask() patch is accepted.

    @vstinner
    Copy link
    Member

    Oh, I forget to read again http://codereview.appspot.com/1132041.

    Here is an updated patch reusing some tests and with a better documentation.

    @vstinner vstinner changed the title expose signalfd(2) and sigprocmask(2) in the signal module expose signalfd(2) and pthread_sigmask in the signal module Apr 22, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 30, 2011

    New changeset 28b9702a83d1 by Victor Stinner in branch 'default':
    Issue bpo-8407, issue bpo-11859: Add signal.pthread_sigmask() function to fetch
    http://hg.python.org/cpython/rev/28b9702a83d1

    @vstinner
    Copy link
    Member

    signalfd.patch: Add signal.signalfd(), signal.SFD_CLOEXEC and signal.SFD_NONLOCK.

    The patch is based on http://codereview.appspot.com/1132041 and the last version (bzr) of python-signalfd.

    The patch uses also PyModule_AddIntMacro() for the 3 constants added in my last (pthread_sigmask), and changes pthread_sigmask() to raise an OSError instead of a RuntimeError.

    Note: python-signalfd has a bug: it doesn't pass the fd argument to signalfd(), it always pass -1.

    @vstinner
    Copy link
    Member

    vstinner commented May 2, 2011

    test_signal.PthreadSigmaskTests fails on Mac OS X.

    http://www.python.org/dev/buildbot/all/builders/PPC Leopard 3.x/builds/1785/steps/test/logs/stdio
    http://www.python.org/dev/buildbot/all/builders/PPC Tiger 3.x/builds/1748/steps/test/logs/stdio
    ---
    [186/354] test_signal
    make: *** [buildbottest] User defined signal 1
    ----

    http://www.python.org/dev/buildbot/all/builders/x86 Tiger 3.x/builds/2429/steps/test/logs/stdio
    ---
    Re-running test 'test_signal' in verbose mode
    ...
    test_arguments (test.test_signal.PthreadSigmaskTests) ... ok
    test_block_unlock (test.test_signal.PthreadSigmaskTests) ... make: *** [buildbottest] User defined signal 1
    program finished with exit code 2
    ---

    @vstinner
    Copy link
    Member

    vstinner commented May 2, 2011

    Update signalfd patch.

    @pitrou
    Copy link
    Member

    pitrou commented May 2, 2011

    Update signalfd patch.

    ----------
    Added file: http://bugs.python.org/file21856/signalfd-2.patch

    • In the tests, you don't need sys.exc_info(), just
      "except XXXError as e".
    • In the doc, you wrote "file description" instead of "file descriptor"
    • In test_out_of_range_signal, you should use assertRaises

    @vstinner
    Copy link
    Member

    vstinner commented May 2, 2011

    Fixed: updated patch (version 3).

    @vstinner
    Copy link
    Member

    vstinner commented May 3, 2011

    test_signal.PthreadSigmaskTests fails on Mac OS X.

    The problem is that sometimes SIG_UNBLOCK does not immediatly call the pending signal, but it calls it "later". The problem is that I don't know exactly when. I tried to wait the pending signal using signal.pause(), but I got a bus error!?

    Example of the problem:

    pthread_sigmask(SIG_BLOCK, [SIGUSR1])
    os.kill(os.getpid(), SIGUSR1)
    pthread_sigmask(SIG_UNBLOCK, [SIGUSR1])

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 3, 2011

    New changeset d003ce770ba1 by Victor Stinner in branch 'default':
    Issue bpo-8407: Fix pthread_sigmask() tests on Mac OS X
    http://hg.python.org/cpython/rev/d003ce770ba1

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 3, 2011

    New changeset c9207c6ce24a by Victor Stinner in branch 'default':
    Issue bpo-8407: pthread_sigmask() checks immediatly if signal handlers have been
    http://hg.python.org/cpython/rev/c9207c6ce24a

    @vstinner
    Copy link
    Member

    vstinner commented May 3, 2011

    Since the commit c9207c6ce24a, test_signal fails on OpenIndiana:

    http://www.python.org/dev/buildbot/all/builders/x86%20OpenIndiana%203.x/builds/1179
    ======================================================================
    ERROR: test_block_unlock (test.test_signal.PthreadSigmaskTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 539, in test_block_unlock
        self.assertEqual(set(old_mask) ^ set(blocked), {signum})
      File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_signal.py", line 501, in handler
        1/0
    ZeroDivisionError: division by zero

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 8, 2011

    New changeset f8c49a930015 by Victor Stinner in branch 'default':
    Issue bpo-8407: The signal handler writes the signal number as a single byte
    http://hg.python.org/cpython/rev/f8c49a930015

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 8, 2011

    New changeset e3cb2c99a5a9 by Victor Stinner in branch 'default':
    Issue bpo-8407: Remove debug code from test_signal
    http://hg.python.org/cpython/rev/e3cb2c99a5a9

    @vstinner
    Copy link
    Member

    vstinner commented May 8, 2011

    Update the signalfd patch (version 4) against the default branch. Specify the minimum Linux version in signalfd() doc. The patch still lacks a structure to parse the bytes written into the file (see msg135438 for a ctypes example): a struct sequence should be fine.

    @vstinner
    Copy link
    Member

    vstinner commented May 8, 2011

    New changeset f8c49a930015 by Victor Stinner in branch 'default':
    Issue bpo-8407: The signal handler writes the signal number as a single byte

    Wakeup test using two pending signals fails on FreeBSD 6.4 buildbot:

    ======================================================================
    FAIL: test_signum (test.test_signal.WakeupSignalTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 266, in test_signum
        self.check_signum(signal.SIGUSR1, signal.SIGALRM)
      File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py", line 232, in check_signum
        self.assertSequenceEqual(raised, signals)
    AssertionError: Sequences differ: (14,) != (30, 14)

    Wakeup file only contains SIGALRM, not (SIGUSR1, SIGALRM), whereas SIGUSR1 is raised before SIGALRM.

    Code of the test:

    def check_signum(self, \*signals):                                           
        data = os.read(self.read, len(signals)+1)                               
        raised = struct.unpack('%uB' % len(data), data)                         
        self.assertSequenceEqual(raised, signals)
    
        def test_signum(self):                                                      
            old_handler = signal.signal(signal.SIGUSR1, lambda x,y:None)            
            self.addCleanup(signal.signal, signal.SIGUSR1, old_handler)             
            os.kill(os.getpid(), signal.SIGUSR1)                                    
            os.kill(os.getpid(), signal.SIGALRM)                                    
            self.check_signum(signal.SIGUSR1, signal.SIGALRM)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 9, 2011

    New changeset 2e0d3092249b by Victor Stinner in branch 'default':
    Issue bpo-8407: Use an explicit cast for FreeBSD
    http://hg.python.org/cpython/rev/2e0d3092249b

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented May 24, 2011

    New changeset f8c49a930015 by Victor Stinner in branch 'default':
    Issue bpo-8407: The signal handler writes the signal number as a single byte
    http://hg.python.org/cpython/rev/f8c49a930015

    There's a race.
    If a signal is received while is_tripped is set, the signal number won't be written to the wakeup FD.
    Patch attached.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 25, 2011

    New changeset 234021dcad93 by Victor Stinner in branch 'default':
    Issue bpo-8407: Fix the signal handler of the signal module: if it is called
    http://hg.python.org/cpython/rev/234021dcad93

    @vstinner
    Copy link
    Member

    There's a race. If a signal is received while is_tripped is set,
    the signal number won't be written to the wakeup FD.

    Oh, nice catch. The "bug" is not new, Python behaves like that since Python 3.1. But in Python < 3.3, it doesn't mater because I don't think that wakeup was used to watch more than one signal. One trigger "something happened" was enough.

    The wakeup fd now contains the number of each signal, and so the behaviour has to change. I applied your patch and I added a test.

    @vstinner
    Copy link
    Member

    Oooooh, sigwait() doesn't accept a timeout! I would be nice to have also sigwaitinfo().. and maybe also its friend, sigwaitinfo() (if we implement the former, it's trivial to implement the latter). Python 3.3 adds optional timeout to subprocess.wait(), subprocess.communicate(), threading.Lock.acquire(), etc. And I love timeout! It was really useful to implement the thread of faulthandler.dump_tracebacks_later().

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented May 31, 2011

    The wakeup fd now contains the number of each signal, and so the behaviour has
    to change. I applied your patch and I added a test.

    Interesting. I suspected this would have an impact on the test_signal
    failure on the FreeBSD 6.4 buidbot:

    """
    ======================================================================
    FAIL: test_signum (test.test_signal.WakeupSignalTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
    line 272, in test_signum
        self.check_signum(signal.SIGUSR1, signal.SIGALRM)
      File "/usr/home/db3l/buildarea/3.x.bolen-freebsd/build/Lib/test/test_signal.py",
    line 238, in check_signum
        self.assertEqual(raised, signals)
    AssertionError: Tuples differ: (14, 30) != (30, 14)

    First differing element 0:
    14
    30

    • (14, 30)
      + (30, 14)
      """

    This means that the signals are not delivered in order.
    Normally, pending signals are checked upon return to user-space, so
    trip_signal should be called when the kill syscall returns, so signal
    numbers should be written in order to the wakeup FD (and here it looks
    like the lowest-numbered signal is delivered first).
    You could try adding a short sleep before the second kill (or just
    pass unordered=True to check_signum, but in that case we don't check
    the correct ordering).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 31, 2011

    New changeset 29e08a98281d by Victor Stinner in branch 'default':
    Issue bpo-8407: test_signal doesn't check signal delivery order
    http://hg.python.org/cpython/rev/29e08a98281d

    @vstinner
    Copy link
    Member

    vstinner commented Jun 1, 2011

    Cool, "x86 FreeBSD 6.4 3.x" is green for the first time since a long time, thanks to my commit 29e08a98281d (test_signal doesn't check signal delivery order).

    @vstinner
    Copy link
    Member

    vstinner commented Jun 9, 2011

    Oooooh, sigwait() doesn't accept a timeout! I would be nice to have
    also sigwaitinfo().. and maybe also its friend, sigwaitinfo()

    Oops, I mean sigtimedwait() and sigwaitinfo().

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Jun 9, 2011

    I just noticed something "funny": signal_sigwait doesn't release the
    GIL, which means that it's pretty much useless :-)
    Patch attached.
    Also, test_sigwait doesn't block the signal before calling sigwait: it
    happens to work because there's only one thread, but it's undefined
    behaviour.

    @giampaolo
    Copy link
    Contributor

    This whole thread is becoming quite confusing.
    It would be better to open a separate issue for any bug or feature request which is not related to "exposing signalfd(2) and pthread_sigmask".

    @vstinner
    Copy link
    Member

    vstinner commented Jun 9, 2011

    This whole thread is becoming quite confusing.

    You are complelty right, sorry to pollute this issue. Changes related to this issue:

    • expose pthread_sigmask(), pthread_kill(), sigpending(), sigwait()
    • wakeup fd now contains the file descriptor

    I created issue bpo-12303 for sigwaitinfo() and sigtimedwait(), and issue bpo-12304 for signalfd.

    @vstinner vstinner changed the title expose signalfd(2) and pthread_sigmask in the signal module expose pthread_sigmask(), pthread_kill(), sigpending() and sigwait() in the signal module Jun 9, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 9, 2011

    New changeset a5c8b6ebe895 by Victor Stinner in branch 'default':
    Issue bpo-8407: signal.sigwait() releases the GIL
    http://hg.python.org/cpython/rev/a5c8b6ebe895

    @vstinner
    Copy link
    Member

    vstinner commented Jun 9, 2011

    I just noticed something "funny": signal_sigwait doesn't release
    the GIL, which means that it's pretty much useless :-)

    sigwait() is not useless: the signal is not necessary send by another thread. The signal can be send by alarm(), from a child process, by an user, by the kernel.

    Releasing the GIL is a new feature. Because it is cheap and pause() does also release the GIL, I commited your patch.

    Also, test_sigwait doesn't block the signal before calling sigwait:
    it happens to work because there's only one thread, but it's
    undefined behaviour.

    test_sigwait() test pass on all 3.x buildbots (some don't have sigwait(), the test is skipped). Sometimes, test_sigwait() is executed with 2 threads, the main thread and Tkinter event loop thread, and the signal is not blocked in any thread.

    On Linux, it works well with more than one thread. I added a test using a thread, we will see if it works on buildbots. The signal is raised by a thread (whereas the signal is not blocked in any thread).

    I wrote a can_test_blocked_signals() function in test_signal which has hardcoded tests to check for some known C threads: the faulthandler timeout thread and for the Tkinter event loop thread. can_test_blocked_signals() returns True if pthread_kill() is available.

    I don't know how it works if a thread uses pthread_kill() to raise a signal into the main thread (waiting in sigwait()), whereas the signal is not blocked in any thread.

    If it is not possible to get the list of all C/Python threads and/or block a signal in all threads, we can use a subprocess without threads (with only the main thread).

    Would you like to work on a patch to avoid the undefined behaviour?

    @vstinner
    Copy link
    Member

    On Linux, it works well with more than one thread.
    I added a test using a thread, we will see if it works
    on buildbots.

    The test hangs on FreeBSD 8.2:

    [235/356] test_signal
    Timeout (1:00:00)!
    Thread 0x0000000800e041c0:
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_signal.py", line 625 in test_sigwait_thread
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 407 in _executeTestPart
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 462 in run
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/case.py", line 514 in __call__
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 105 in run
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/unittest/suite.py", line 67 in __call__
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1166 in run
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1254 in _run_suite
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/support.py", line 1280 in run_unittest
    File "/usr/home/buildbot/buildarea/3.x.krah-freebsd/build/Lib/test/test_signal.py", line 687 in test_main
    File "./Lib/test/regrtest.py", line 1043 in runtest_inner
    File "./Lib/test/regrtest.py", line 841 in runtest
    File "./Lib/test/regrtest.py", line 668 in main
    File "./Lib/test/regrtest.py", line 1618 in <module>
    *** Error code 1

    http://www.python.org/dev/buildbot/all/builders/AMD64%20FreeBSD%208.2%203.x/builds/519/steps/test/logs/stdio

    (The test has not run yet on FreeBSD 6.4 buildbot)

    @vstinner
    Copy link
    Member

    The test hangs on FreeBSD 8.2: (...)

    See also bpo-12310 which may be related.

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Jun 10, 2011

    > I just noticed something "funny": signal_sigwait doesn't release
    > the GIL, which means that it's pretty much useless :-)

    sigwait() is not useless: the signal is not necessary send by another thread. The signal can be send by alarm(), from a child process, by an user, by the kernel.

    If it doesn't release the GIL, it is useless.
    The common usage pattern for sigwait() is to create a dedicated thread
    for signals management. If this thread calls sigwait without releasing
    the GIL, the whole process is suspended, which is definitely not what
    you want.

    > Also, test_sigwait doesn't block the signal before calling sigwait:
    > it happens to work because there's only one thread, but it's
    > undefined behaviour.

    test_sigwait() test pass on all 3.x buildbots (some don't have sigwait(), the test is skipped). Sometimes, test_sigwait() is executed with 2 threads, the main thread and Tkinter event loop thread, and the signal is not blocked in any thread.

    It's mere luck:

        def test_sigwait(self):
            old_handler = signal.signal(signal.SIGALRM, self.handler)
            self.addCleanup(signal.signal, signal.SIGALRM, old_handler)
    
            signal.alarm(1)
            self.assertEqual(signal.sigwait([signal.SIGALRM]), signal.SIGALRM)

    Comment out the first two lines that change the signal handler, and
    the test will be killed by SIGALRM's default handler ("Alarm clock").
    I tested on Linux, and if a the signal isn't blocked before calling sigwait:

    • if a custom signal handler is installed, then it is not called
    • if the default signal handler is in place, then it's called (and
      with SIGALRM the process gets killed)
      This is a typical example of what POSIX calls "undefined behavior".
      If pthread_sigmask is called before sigwait, then it works as expected.

    If we really wanted to test this properly, the way to go would be to
    fork() a child process (that way we're sure there's only one thread),
    have it block and sigwait() SIGUSR1 without touching the handler, send
    it SIGUSR1, and check its return code.
    Patch attached.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 10, 2011

    New changeset a17710e27ea2 by Victor Stinner in branch 'default':
    Issue bpo-8407: Make signal.sigwait() tests more reliable
    http://hg.python.org/cpython/rev/a17710e27ea2

    @vstinner
    Copy link
    Member

    neologix> Patch attached.

    I wrote a different patch based on your patch. The main change is that I chose to keep signal.alarm(1) instead of sending the signal from the parent process to the child process, because I don't think that time.sleep(0.5) is a reliable synchronization code to wait until the child process is waiting in sigwait().

    test_sigwait_thread() uses time.sleep(1) to wait until the main thread is waiting in sigwait(), but it is not mandatory because the signal is already blocked in the thread. I wrote test_sigwait_thread() to ensure that sigwait() releases the GIL, not to check that if a thread raises a signal while sigwait() is waiting, sigwait() does correctly receive it.

    We may use time.sleep() in test_sigwait() if the signal is blocked in the parent process before the fork() (and unblocked in the parent process after the fork, but before sending the signal to the child process), but I think that signal.alarm() is more reliable and simple.

    --

    test_sigwait(_thread) was the last known bug of this issue, so let's close it. Reopen it if you see other bugs in pthread_sigmask(), pthread_kill(), sigpending() and sigwait() functions, or maybe better: open new issues because this issue has a too long history! See issue bpo-12303 for sigwaitinfo() and sigtimedwait(), and issue bpo-12304 for signalfd.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 10, 2011

    New changeset 37a87b709403 by Victor Stinner in branch 'default':
    Issue bpo-8407: write error message on sigwait test failure
    http://hg.python.org/cpython/rev/37a87b709403

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 10, 2011

    New changeset 60b1ab4d0cd4 by Victor Stinner in branch 'default':
    Issue bpo-8407: skip sigwait() tests if pthread_sigmask() is missing
    http://hg.python.org/cpython/rev/60b1ab4d0cd4

    @smarnach
    Copy link
    Mannequin

    smarnach mannequin commented Mar 30, 2012

    The documentation has been left in a confusing state by this patch -- at least confusing to me. I've created bpo-14456 with further details.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants