classification
Title: Py_signal_pipe
Type: Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: rejected
Dependencies: Superseder: Patch for signal.set_wakeup_fd
View: 1583
Assigned To: gvanrossum Nosy List: Rhamphoryncus, anthonybaxter, exarkun, gustavo, gvanrossum, jafo, loewis, mwh
Priority: normal Keywords: patch

Created on 2006-09-24 14:13 by gustavo, last changed 2007-12-10 23:50 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description Edit
python-signals.diff gustavo, 2007-01-25 18:11 updated patch (puts back "if (!is_tripped) return 0;" optimisation accidentally deleted).
python-signals-minimal.diff gustavo, 2007-12-09 12:32
Messages (30)
msg51147 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2006-09-24 14:13
Problem: how to wakeup extension modules running poll()
so that they can let python check for signals.

Solution: use a pipe to communicate between signal
handlers and main thread.  The read end of the pipe can
then be monitored by poll/select for input events and
wake up poll().  As a side benefit, it avoids the usage
of Py_AddPendingCall / Py_MakePendingCalls, which are
patently not "async safe".

All explained in this thread:

http://mail.python.org/pipermail/python-dev/2006-September/068569.html
msg51148 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2006-09-26 23:53
Logged In: YES 
user_id=12364

I've looked over the patch, although I haven't tested it.  I
have the following suggestions:

* Needs documentation explaining the signal weirdness (may
drop signals, may delay indefinitely, new handlers may get
signals intended for old, etc)
* Needs to be explicit that users must only poll/select to
check for readability of the pipe, NOT read from it
* The comment for is_tripped refers to sigcheck(), which
doesn't exist
* I think we should be more paranoid about the range of
possible signals.  NSIG does not appear to be defined by
SUSv2 (no clue about Posix).  We should size the Handlers
array to UCHAR_MAX and set any signals outside the range of
0..UCHAR_MAX to either 0 (null signal) or UCHAR_MAX.  I'm
not sure we should ever use NSIG.
* In signal_hander() sizeof(signum_c) is inherently 1. ;)
* The set_nonblock macro doesn't check for errors from
fcntl().  I'm not sure it's worth having a macro for that
anyway.
* Needs some documentation of the assumptions about
read()/write() being memory barriers.
* In check_signals() sizeof(signum) is inherently 1.
* There's a blank line with tabs near the end of
check_signals() ;)
* PyErr_SetInterrupt() should use a compile-time check for
SIGINT being within 0..UCHAR_MAX, assuming NSIG is ripped
out entierly.
* PyErr_SetInterrupt() needs to set is_tripped after the
call to write(), not before.
* PyOS_InterruptOccurred() should probably still check that
it's called from the main thread.
msg51149 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2006-09-27 13:42
Logged In: YES 
user_id=908

> * Needs documentation ...

  True, I'll try to add more documentation...

> * I think we should be more paranoid about the range of
possible signals.  NSIG does not appear to be defined by
SUSv2 (no clue about Posix).  We should size the Handlers
array to UCHAR_MAX and set any signals outside the range of
0..UCHAR_MAX to either 0 (null signal) or UCHAR_MAX.  I'm
not sure we should ever use NSIG.

I disagree.  Creating an array of size UCHAR_MAX is just
wasting memory.  If you check the original python code,
there's already fallback code to define NSIG if it's not
already defined (if not defined, it could end up being
defines as 64).

> * In signal_hander() sizeof(signum_c) is inherently 1. ;)

  And? I occasionally hear horror stories of platforms where
 sizeof(char) != 1, I'm not taking any chances :)

> * PyOS_InterruptOccurred() should probably still check
that it's called from the main thread.

check_signals already bails out if that is the case.  But in
fact it bails out without setting the interrupt_occurred
output parameter, so I fixed that.

fcntl error checking... will work on it.
msg51150 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2006-09-27 14:34
Logged In: YES 
user_id=908

and of course this

 > * PyErr_SetInterrupt() needs to set is_tripped after the
call to write(), not before.

is correct, good catch.

New patch uploaded.
msg51151 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2006-09-28 02:50
Logged In: YES 
user_id=12364

Any compiler where sizeof(char) != 1 is *deeply* broken.  In
C, a byte isn't always 8 bits (if it uses bits at all!). 
It's possible for a char to take (for instance) 32 bits, but
sizeof(char) will STILL return 1 in such a case.  A mention
of this in the wild is here:
http://lkml.org/lkml/1998/1/22/4
If you find a compiler that's broken, I'd love to hear about
it. :)

# error Too many signals to fit on an unsigned char!
Should be "in", not "on" :)

A comment in signal_handler() about ignoring the return
value of write() may be good.

initsignal() should avoid not replace
Py_signal_pipe/Py_signal_pipe_w if called a second time
(which is possible, right?).  If so, it should probably not
set them until after setting non-blocking mode.

check_signals() should not call
PyEval_CallObject(Handlers[signum].func, ...) if func is
NULL, which may happen after finisignal() clears it.
msg51152 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2006-09-28 15:31
Logged In: YES 
user_id=908

> ...sizeof(char) will STILL return 1 in such a case...

Even if sizeof(char) == 1, 'sizeof(signum_c)' is much more
readable than just a plain '1'.
msg51153 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2006-09-29 21:09
Logged In: YES 
user_id=12364

I'm concerned about the interface to
PyOS_InterruptOccurred().  The original version peeked ahead
for only that signal, and handled it manually.  No need to
report errors.  The new version will first call arbitrary
python functions to handle any earlier signals, then an
arbitrary python function for the interrupt itself, and then
will not report any errors they produce.  It may not even
get to the interrupt, even if one is waiting.

I'm not sure PyOS_InterruptOccurred() is called when
arbitrary python code is acceptable.  I suspect it should be
dropped entierly, in favour of a more robust API.

Otoh, some of it appears quite crufty.  One version in
intrcheck.c lacks a return statement, invoking undefined
behavior in C.

One other concern I have is that signalmodule.c should never
been unloaded, if loaded via dlopen.  A delayed signal
handler may reference it indefinitely.  However, I see no
sane way to enforce this.
msg51154 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-01-25 17:57
Damn this SF bug tracker! ;(

The patch I uploaded (yes, it was me, not anonymous) fixes some bugs and also fixes http://www.python.org/sf/1643738
msg51155 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-01-25 18:11
File Added: python-signals.diff
msg51156 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-01-25 18:38
gustavo, there's two patches attached and it's not entirely clear which one is current.  Please delete the older one.
msg51157 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2007-01-25 19:22
The attached patch also fixes a bug in the order in which signal handlers are run.  Previously, they would be run in numerically ascending signal number order.  With the patch attached, they will be run in the order they are processed by Python.

msg51158 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-01-29 08:41
I'm -1 on this patch. The introduction of a pipe makes it essentially gtk-specific: It will only work with gtk (for a while, until other frameworks catch up - which may take years), and it will only wake up a gtk thread that is in the gtk poll call.

It fails to support cases where the main thread blocks in a different blocking call (i.e. neither select nor poll). I think a better mechanism is needed to support that case, e.g. by waking up the main thread with pthread_kill.
msg51159 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-01-29 11:07
But if you think about it, support for other cases have to be extensions of this patch.  In an async handler it's not safe to do about anything.  The current framework is not async safe, it just happens to work most of the time.

If we use pthread_kill we will start to enter platform-specific code; what will happen in systems without POSIX threads?  What signal do we use to wake up the main thread?  Do system calls that receive signals return EINTR for this platform or not (can we guarantee it always happens)?  Which one is the main thread anyway?

In any case, anything we want to do can be layered on top of the Py_signal_pipe API in a very safe way, because reading from a pipe is decoupled from the async handler, therefore this handler is allowed to safely do anything it wants, like pthread_kill.  But IMHO that part should be left out of Python; let the frameworks do it themselves.
msg51160 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-01-29 18:36
Can you please explain in what sense the current framework isn't "async safe"? You might be referring to "async-signal-safe functions", which is a term specified by POSIX, referring to functions that may be called in a signal handler. The Python signal handler, signal_handler, calls these functions:

* getpid
* Py_AddPendingCall
* PyOS_setsig
** sigemptyset
** sigaction

AFAICT, this is the complete list of functions called in a signal handler. Of these, only getpid, sigemptyset, and sigaction are library functions, and they are all specified as async-signal safe. So the current implementation is async-signal safe.

Usage of pthread_kill wouldn't make it more platform-specific than your patch. pthread_kill is part of the POSIX standard, and so is pipe(2). So both changes work on a POSIX system, and neither change would be portable if all you have is standard C.
msg51161 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-01-29 18:53
Py_AddPendingCall is not async safe.  It's obvious looking at the code, and it even says so in a comment:

	/* XXX Begin critical section */
	/* XXX If you want this to be safe against nested
	   XXX asynchronous calls, you'll have to work harder! */
msg51162 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-01-29 20:01
I see. I think this can be fixed fairly easily: install the signal handlers with sigaction, and prevent any nested delivery of signals through sa_mask. Then, no two signal handlers will get invoked simultaneously.
msg51163 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-01-29 22:11
As far as I can tell, the sig_mask argument of sigaction only applies to the thread in which the signal handler gets called.  If you have multiple threads you could still have one signal handler running per thread.

http://www.opengroup.org/onlinepubs/009695399/functions/sigaction.html
msg51164 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-01-29 22:25
Right. To prevent the simultaneous invocation of Py_AddPendingCall from multiple threads, two alternatives are possible:
a) protect the routine with a thread mutex, if threading is available
b) use pthread_kill in threads other than the main thread (as I proposed earlier); those other threads then wouldn't call Py_AddPendingCall anymore
msg51165 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-01-29 23:59
Unfortunately, neither the mutex functions nor pthread_kill() are listed as async-signal-safe:
http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html

Personally, I'd be just as happy to raise an exception if an attempt is made to import both signal and threading: doing it safely and reliably is just too difficult, so we shouldn't promote a false sense of security.
msg51166 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-01-30 10:48
Adam, forbidding signals + threads is not possible.  signalmodule is _always_ imported by python; it is builtin.  That's because it is signal module that provides support for raising KeyboardInterrupt when Ctrl-C is pressed.  Therefore. "signals + threads == forbidden" simplifies to "threads == forbidden", and if threads are forbidden then let's just say Python loses a lot of support :)

Regarding async safety, I think that generally any system call that may potentially suspend a process, such as blocking system calls and, most notably, mutexes, are not safe to use in async handlers.  Writing to a pipe is safe only because the file descriptor is in non-blocking mode.  OTOH, in non-blocking mode the file descriptor can only accept a few bytes at a time, hence the (documented) limitation of not accepting large bursts of signals.
msg51167 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-01-30 21:41
Ouch, I forgot all about Ctrl-C.  It would have been an amusing if short-lived "solution". ;)

The safety issue is not about suspending the process, which is safe albeit not very useful.  The issue is libc (or even the kernel?) having internal data structures in an inconsistent state and being unable to handle reentrant or threaded access; random corruption, crashes, hangs, or any sort of "undefined" behaviour can result if this is not heeded.  write() is one of the precious few libc functions that is labeled as not vulnerable to such things.
msg51168 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-01-30 22:53
Whether pthread_kill is async-signal-safe depends on the operating system. For example, on Solaris, it is documented as async-signal-safe.
msg51169 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-01-30 23:31
I've noticed that, but it does little to help us write portable code that covers all of CPython's platforms.  Unless you have a collection of such platform-specific async-signal-safe functions to cover them all?
msg58264 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2007-12-07 04:18
The thread at:

http://bugzilla.gnome.org/show_bug.cgi?id=481569

requests "attention" so I'm formally rejecting it.  :-)

My reading of the python-dev thread on this is that the current patch
does not reasonably fix the problem.  However, I'd like mwh and
anthonybaxters input on this.  What would we need to be able to accept
this patch?  My understanding is that there are some things that the
current patch does not address.  Should I bring it up on python-dev?

The current opinion in the python-dev thread seems to be that the
current solution would go into trunk rather than into a maint branch for
an existing release, so it's not going to be a quick fix in any case.
msg58275 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-12-07 13:36
"My understanding is that there are some things that the current patch
does not address."

Well, I don't know what those things are, so it's hard for me to address
them. :-)
msg58307 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-08 23:15
I've been told to look at this one more time.
msg58317 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-12-09 12:32
Minimal patch that just adds the pipe but does not attempt to fix
anything else.
msg58321 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2007-12-09 17:43
The minimal patch doesn't initialize dummy_char or dummy_c.  It's
harmless here, but please fix it. ;)

sizeof(dummy_char) will always be 1 (C defines sizeof as multiples of
char.)  The convention seems to be hardcoding 1 instead.
msg58322 - (view) Author: Gustavo J. A. M. Carneiro (gustavo) * Date: 2007-12-09 17:49
> The minimal patch doesn't initialize dummy_char or dummy_c.  It's
harmless here, but please fix it. ;)

The variable is called 'dummy' for a reason.  The value written or read
is irrevelant...
msg58391 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-10 23:49
Rejected, in favor of #1583 which lets the app choose whether to use a
fd or not.  We had an extensive #python-dev discussion on this and this
time the rejection is irrevocable.
History
Date User Action Args
2007-12-10 23:50:16gvanrossumsetresolution: accepted -> rejected
superseder: Patch for signal.set_wakeup_fd
2007-12-10 23:49:41gvanrossumsetstatus: open -> closed
resolution: accepted
messages: + msg58391
2007-12-09 17:49:54gustavosetmessages: + msg58322
2007-12-09 17:43:00Rhamphoryncussetmessages: + msg58321
2007-12-09 12:32:39gustavosetfiles: + python-signals-minimal.diff
messages: + msg58317
2007-12-08 23:15:02gvanrossumsetstatus: closed -> open
assignee: gvanrossum
resolution: rejected -> (no value)
messages: + msg58307
nosy: + gvanrossum
2007-12-07 13:36:47gustavosetmessages: + msg58275
2007-12-07 04:18:03jafosetstatus: open -> closed
nosy: + mwh, jafo, anthonybaxter
resolution: rejected
messages: + msg58264
2006-09-24 14:13:07gustavocreate