
classification
Title: multiprocessing: handling of errno after signals in sem_acquire()
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: jnoller
Nosy List: asksol, jnoller, loewis, ryles, trent, tumert
Priority: normal
Keywords:

Created on 2009-06-29 00:55 by ryles, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg89804 - Author: Ryan Leslie (ryles) Date: 2009-06-29 00:55
While developing an application, I noticed an inconsistency:
depending on the particular signal handler in use,
multiprocessing.Queue.put() may or may not raise OSError after the
handler calls sys.exit(). The following example, which was tested
with Python 2.6.1 on Linux, demonstrates this.

#!/usr/bin/env python

import multiprocessing
import signal
import sys

def handleKill(signum, frame):
   #sys.stdout.write("Exit requested by signal.\n")
   print "Exit requested by signal."
   sys.exit(1)
signal.signal(signal.SIGTERM, handleKill)

queue = multiprocessing.Queue(maxsize=1)
queue.put(None)
queue.put(None)

When the script is run, the process will block (as expected) on the
second queue.put(). If (from another terminal) I send the process
SIGTERM, I consistently see:

$ ./q.py
Exit requested by signal.
$

Now, if I modify the above program by commenting out the 'print', and
uncommenting the 'sys.stdout' (a very subtle change), I would expect
the result to be the same when killing the process. Instead, I
consistently see:

$ ./q.py
Exit requested by signal.
Traceback (most recent call last):
 File "./q.py", line 15, in <module>
   queue.put(None)
 File "python2.6/multiprocessing/queues.py", line 75, in put
   if not self._sem.acquire(block, timeout):
OSError: [Errno 0] Error
$ 

After debugging this further, the issue appears to be in
semlock_acquire() in semaphore.c, under Modules/_multiprocessing:
http://svn.python.org/view/python/trunk/Modules/_multiprocessing/semaphore.c?revision=71009&view=markup

The relevant code from (the Unix version of) semlock_acquire() is:

       do {
               Py_BEGIN_ALLOW_THREADS
               if (blocking && timeout_obj == Py_None)
                       res = sem_wait(self->handle);
               else if (!blocking)
                       res = sem_trywait(self->handle);
               else
                       res = sem_timedwait(self->handle, &deadline);
               Py_END_ALLOW_THREADS
               if (res == MP_EXCEPTION_HAS_BEEN_SET)
                       break;
       } while (res < 0 && errno == EINTR && !PyErr_CheckSignals());

       if (res < 0) {
               if (errno == EAGAIN || errno == ETIMEDOUT)
                       Py_RETURN_FALSE;
               else if (errno == EINTR)
                       return NULL;
               else
                       return PyErr_SetFromErrno(PyExc_OSError);
       }

In both versions of the program (print vs. sys.stdout), sem_wait() is
being interrupted and is returning -1 with errno set to EINTR. This is
what I expected. Also, in both cases it seems that the loop is
(correctly) terminating with PyErr_CheckSignals() returning non-zero.
This makes sense too; the call is executing our signal handler, and then
returning -1 since our particular handler raises SystemExit.
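
As a rough illustration of that loop condition, here is a small Python
model. It is only a sketch, not the C code: run_wait_loop() and its
arguments are hypothetical names, the wait results are supplied by the
caller instead of coming from sem_wait(), and the
MP_EXCEPTION_HAS_BEEN_SET early exit is left out.

```python
import errno


def run_wait_loop(wait_results, handler_raises):
    """Model of: do { wait } while (res < 0 && errno == EINTR
                                     && !PyErr_CheckSignals());

    wait_results   -- sequence of (res, errno) pairs, one per simulated wait
    handler_raises -- True if the simulated Python-level signal handler
                      raises (PyErr_CheckSignals() would return non-zero)
    Returns (res, err, attempts) as seen when the loop exits.
    """
    attempts = 0
    results = iter(wait_results)
    while True:
        res, err = next(results)      # stand-in for sem_wait() and friends
        attempts += 1
        if not (res < 0 and err == errno.EINTR):
            break                     # success, or an error other than EINTR
        if handler_raises:
            break                     # handler raised, stop retrying
        # handler returned normally: retry the wait
    return res, err, attempts


# Interrupted once, then the wait succeeds.
interrupted_then_ok = [(-1, errno.EINTR), (0, 0)]
print(run_wait_loop(interrupted_then_ok, handler_raises=False))
print(run_wait_loop(interrupted_then_ok, handler_raises=True))
```

With a handler that returns normally the wait is retried; with a handler
that raises (as ours does via sys.exit()), the loop exits after the first
interrupted wait with res still at -1.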

However, I suspect that, depending on the exact code the signal
handler executes, errno may or may not be reset somewhere inside the
call to PyErr_CheckSignals(). I believe the error-checking code that
follows the do-while loop needs errno to still hold the value set by
sem_wait(), and the author did not expect anything else to change it
in between. In the "print" version, errno apparently remains EINTR,
so the `return NULL' statement is reached. In the "sys.stdout" version
(and probably many others), errno ends up reset to 0, so the error
handling falls through to the
`return PyErr_SetFromErrno(PyExc_OSError)' statement, which produces
the "[Errno 0]" OSError in the traceback above.

To patch this up, we can probably just save errno as, say, `wait_errno'
at the end of the loop body, and then use it within the error handling
block that follows. However, the rest of the code should probably be
checked for this type of issue.
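
To make the idea concrete, here is a minimal Python simulation of the
interrupted path, with and without the saved errno. It is not the C code
from semaphore.c: FakeErrno, fake_sem_wait(), fake_check_signals() and
acquire() are hypothetical stand-ins for errno, sem_wait(),
PyErr_CheckSignals() and the error handling in semlock_acquire(). It also
assumes, as suspected above, that the handler can leave errno at 0.

```python
import errno


class FakeErrno:
    """Hypothetical stand-in for the C global errno."""
    value = 0


def fake_sem_wait():
    """Model sem_wait() being interrupted: set EINTR and return -1."""
    FakeErrno.value = errno.EINTR
    return -1


def fake_check_signals(handler_clobbers_errno):
    """Model PyErr_CheckSignals() running a handler that raises SystemExit.

    Always returns -1 (exception set). If handler_clobbers_errno is True,
    errno is left at 0 afterwards (an assumption taken from the report).
    """
    if handler_clobbers_errno:
        FakeErrno.value = 0
    return -1


def acquire(handler_clobbers_errno, save_errno):
    """Return a description of the branch the error handling would take."""
    res = fake_sem_wait()
    wait_errno = FakeErrno.value      # the proposed saved copy
    fake_check_signals(handler_clobbers_errno)

    # Without the fix, the error handling reads the live (possibly reset) errno.
    err = wait_errno if save_errno else FakeErrno.value
    if res < 0:
        if err in (errno.EAGAIN, errno.ETIMEDOUT):
            return "return False"
        elif err == errno.EINTR:
            return "return NULL (SystemExit propagates)"
        else:
            return "OSError: [Errno %d]" % err
    return "return True"


for clobbers in (False, True):
    for save in (False, True):
        print("clobbers=%s save=%s -> %s" % (clobbers, save, acquire(clobbers, save)))
```

In this model, only the combination of a clobbering handler and no saved
copy reaches the OSError branch; saving errno right after the wait makes
both handler variants take the EINTR branch.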
msg89805 - Author: Jesse Noller (jnoller) (Python committer) Date: 2009-06-29 01:35
Thank you Ryan
msg221265 - Author: Tumer Topcu (tumert) Date: 2014-06-22 15:56
Looks like the suggested fix is there in v2.7.6:

    do {
        Py_BEGIN_ALLOW_THREADS
        if (blocking && timeout_obj == Py_None)
            res = sem_wait(self->handle);
        else if (!blocking)
            res = sem_trywait(self->handle);
        else
            res = sem_timedwait(self->handle, &deadline);
        Py_END_ALLOW_THREADS
        err = errno;
        if (res == MP_EXCEPTION_HAS_BEEN_SET)
            break;
    } while (res < 0 && errno == EINTR && !PyErr_CheckSignals());

    if (res < 0) {
        errno = err;
        if (errno == EAGAIN || errno == ETIMEDOUT)
            Py_RETURN_FALSE;
        else if (errno == EINTR)
            return NULL;
        else
            return PyErr_SetFromErrno(PyExc_OSError);
    }


But I am still able to reproduce the issue by following the exact steps described above.
msg221272 - Author: Tumer Topcu (tumert) Date: 2014-06-22 16:37
Never mind my last comment (the curse of using a loaner laptop). I tried again after compiling against the latest repo, and everything works as expected. I believe this issue can be closed.
msg221276 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2014-06-22 16:42
Thanks, closing as fixed.
History
Date                 User         Action  Args
2022-04-11 14:56:50  admin        set     github: 50611
2014-06-22 16:42:38  loewis       set     status: open -> closed
                                          nosy: + loewis
                                          messages: + msg221276
                                          resolution: fixed
2014-06-22 16:37:22  tumert       set     messages: + msg221272
2014-06-22 15:56:54  tumert       set     nosy: + trent, tumert
                                          messages: + msg221265
2010-08-27 15:29:26  BreamoreBoy  set     stage: needs patch
                                          versions: - Python 2.6, Python 3.0
2010-08-27 13:52:44  asksol       set     nosy: + asksol
2009-06-29 01:35:11  jnoller      set     priority: normal
                                          assignee: jnoller
                                          messages: + msg89805
2009-06-29 00:56:45  ryles        set     nosy: + jnoller
2009-06-29 00:55:52  ryles        create