
classification
Title: signal handler never gets called
Type: behavior
Stage: patch review
Components: Documentation, Interpreter Core, Library (Lib)
Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Devin Jeanpierre, Netto, Patrick Fink, amcnabb, georg.brandl, neologix, pitrou, pts, s7v7nislands, terry.reedy, tim.peters, vstinner
Priority: normal
Keywords: patch

Created on 2009-02-19 13:47 by pts, last changed 2022-04-11 14:56 by admin.

Files
File name | Uploaded | Description
tsig.py | pts, 2009-02-19 13:47 | Python script which triggers the strange behavior
select_select.diff | Devin Jeanpierre, 2015-05-25 05:25 |
Messages (9)
msg82472 - (view) Author: Péter Szabó (pts) Date: 2009-02-19 13:47
According to http://docs.python.org/dev/library/signal.html , if I set
up a signal handler in the main thread, and then have the signal
delivered to the process, then the signal handler will be called in the
main thread. The attached Python script I've written, however, doesn't
work that way: sometimes the signal is completely lost, and the signal
handler is not called.

Here is how it should work. The code has two threads: the main thread
and the subthread. There is also a signal handler installed. The main
thread is running select.select(), waiting for a filehandle to become
readable. Then the subthread sends a signal to the process. The signal
handler writes a byte to the pipe. The select wakes up raising
'Interrupted system call' because of the signal.
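
For readers without the attachment, one iteration of the scenario described above can be sketched roughly as follows. This is not the attached tsig.py; the function name run_iteration, the choice of SIGUSR1, the 0.1 second delay and the select timeout are all assumptions made for illustration. It also assumes Python 3.5 or later on a POSIX system, where select.select() is retried after the handler returns instead of raising 'Interrupted system call'.

```python
import os
import select
import signal
import threading


def run_iteration(timeout=5.0):
    """Run one iteration: select() in the main thread, signal sent by a subthread.

    Must be called from the main thread, because signal.signal() only works there.
    Returns (calls, data): the signal numbers seen by the handler and the byte
    read from the pipe (empty if select() timed out).
    """
    read_fd, write_fd = os.pipe()
    calls = []

    def handler(signum, frame):
        # The handler makes the pipe readable, which is what select() waits for.
        calls.append(signum)
        os.write(write_fd, b"W")

    old_handler = signal.signal(signal.SIGUSR1, handler)
    # The subthread sends the signal to the whole process after a short delay.
    sender = threading.Timer(0.1, os.kill, args=(os.getpid(), signal.SIGUSR1))
    try:
        sender.start()
        # The timeout is a safety net (an addition, not part of the report):
        # without it a lost signal would block forever, as described.
        readable, _, _ = select.select([read_fd], [], [], timeout)
        sender.join()
        data = os.read(read_fd, 1) if readable else b""
        return calls, data
    finally:
        signal.signal(signal.SIGUSR1, old_handler)
        os.close(read_fd)
        os.close(write_fd)
```

Calling run_iteration() in a loop and checking for an empty data value is one way to look for lost signals without hanging the process.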

I'm running Ubuntu Hardy on x86_64. With Python 2.4.5 and Python 2.5.2,
sometimes the signal handler is not called, and the select continues
waiting indefinitely. This is what I get on stdout in Python 2.4.5:

main pid=8555
--- 0
A
B
S
T
U
handler arg1=10 arg2=<frame object at 0x79ab40>
select got="(4, 'Interrupted system call')"
read str='W'
--- 1
A
B
S
T
U

This means that iteration 0 completed successfully: the signal handler
got called, and the select raised 'Interrupted system call'. However,
iteration 1 was stuck: the signal handler was never called, and the
select waits indefinitely.

The script seems to work in Python 2.4.3, but it hangs at around
iteration 60000.
msg100306 - (view) Author: Andrew McNabb (amcnabb) Date: 2010-03-02 19:53
I'm seeing something very similar to this.  In my case, I have a single-threaded program, and select fails to be interrupted by SIGCHLD.  I'm still tracking down more details, so I'll report back if I find more information.
msg100309 - (view) Author: Andrew McNabb (amcnabb) Date: 2010-03-02 21:27
Sorry for the noise.  It turns out that my problem was unrelated.
msg102829 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-04-11 12:32
I think two things can trigger this problem; both have to do with how signals are handled by the interpreter.
Contrary to what you might expect, when a signal is received, its Python handler is _not_ called right away. Instead, it's signal_handler() in Modules/signalmodule.c that's called. This C handler records the reception of the signal in a table, and schedules the execution of the associated Python handler for later:

signal_handler(int sig_num)
{
[...]
                Handlers[sig_num].tripped = 1;
                /* Set is_tripped after setting .tripped, as it gets
                   cleared in PyErr_CheckSignals() before .tripped. */
                is_tripped = 1;
                Py_AddPendingCall(checksignals_witharg, NULL);
[...]
}

checksignals_witharg() calls PyErr_CheckSignals(), which in turn calls the handler.
The pending calls are checked periodically from the interpreter main loop, in Python/ceval.c: when _Py_Ticker reaches 0, we check for pending calls, and if there are any, we run them, hence checksignals_witharg() and then the handler.
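
The deferred execution described above can be illustrated with a toy model. This is only a sketch in Python, not CPython's code: ToyInterpreter, low_level_handler, check_signals and check_interval are hypothetical names standing in for signal_handler(), PyErr_CheckSignals() and _Py_Ticker.

```python
class ToyInterpreter:
    """Toy model of deferred signal handling; not CPython's actual code."""

    def __init__(self):
        self.handlers = {}    # signal number -> Python-level callable
        self.tripped = set()  # signals recorded by the low-level handler
        self.log = []

    def low_level_handler(self, signum):
        # Plays the role of the C signal_handler(): only record the signal.
        self.tripped.add(signum)

    def check_signals(self):
        # Plays the role of PyErr_CheckSignals(): run the Python handlers.
        pending, self.tripped = sorted(self.tripped), set()
        for signum in pending:
            self.handlers[signum](signum)

    def run(self, instructions, check_interval):
        # The ticker counts down once per instruction; handlers only run
        # when it reaches zero, never in the middle of an instruction.
        ticker = check_interval
        for instruction in instructions:
            ticker -= 1
            if ticker <= 0:
                ticker = check_interval
                self.check_signals()
            instruction(self)


def demo():
    vm = ToyInterpreter()
    vm.handlers[10] = lambda signum: vm.log.append("handler %d" % signum)
    program = [
        lambda vm: vm.log.append("op 1"),
        lambda vm: vm.low_level_handler(10),  # the signal arrives here
        lambda vm: vm.log.append("op 2"),
        lambda vm: vm.log.append("op 3"),
    ]
    vm.run(program, check_interval=4)
    return vm.log
```

In this model the handler runs only at the next ticker check, after "op 2". If one of the instructions never returned, the handler would never run at all.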
This is actually a documented behaviour, quoting signal documentation:
"Although Python signal handlers are called asynchronously as far as the Python user is concerned, they can only occur between the “atomic” instructions of the Python interpreter. This means that signals arriving during long calculations implemented purely in C (such as regular expression matches on large bodies of text) may be delayed for an arbitrary amount of time."

But there's a race. Imagine this happens:
- a thread (or a process for that matter) receives a signal
- signal_handler schedules the associated handler
- before _Py_Ticker reaches 0 and is checked from the interpreter main loop, a blocking call is made
- since the process is blocked in the call, the main eval loop doesn't run, and the handler doesn't get called until the process leaves the call and enters the main eval loop again. If the call doesn't return (e.g. select without timeout), then the process remains stuck forever.
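
Since the stuck case above depends on the blocking call never returning, one common mitigation (not proposed in this report, so treat it as an assumption) is to never block indefinitely: call select() with a short timeout in a loop, so control returns to the eval loop regularly and pending handlers get a chance to run. The helper name select_with_polling and the 0.1 second poll interval are made up for this sketch.

```python
import select
import time


def select_with_polling(rlist, total_timeout, poll_interval=0.1):
    """Wait until something in rlist is readable, in short slices.

    Returns the list of readable objects, or [] once total_timeout expires.
    Between slices the interpreter runs bytecode again, so a Python signal
    handler that was scheduled just before the blocking call is delayed by
    at most roughly poll_interval instead of indefinitely.
    """
    deadline = time.monotonic() + total_timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return []
        readable, _, _ = select.select(rlist, [], [], min(poll_interval, remaining))
        if readable:
            return readable
```

The cost is periodic wakeups even when nothing happens, which is why the wakeup file descriptor approach is usually preferred where it is available.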

This problem can also happen even if the signal is sent after select is called:
- the main thread calls select
- the second thread runs, and sends a signal to the process
- the signal is not received by the main thread, but by the second thread
- the second thread schedules execution of the handler
- since the main thread is blocked in select, the handler never gets called

But this case is quite flaky, because the documentation warns you:
"Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead."

Sending signals to a process with multiple threads is risky; as the documentation says, you should use locks instead.

Finally, I think that the documentation should be rephrased:
"and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads)."
It's false. What's guaranteed is that the signal handler will only be executed on behalf of the main thread, but any thread can _receive_ a signal.
And comments in Modules/signalmodule.c are misleading:
   We still have the problem that in some implementations signals
   generated by the keyboard (e.g. SIGINT) are delivered to all
   threads (e.g. SGI), while in others (e.g. Solaris) such signals are
   delivered to one random thread (an intermediate possibility would
   be to deliver it to the main thread -- POSIX?).  For now, we have
   a working implementation that works in all three cases -- the
   handler ignores signals if getpid() isn't the same as in the main
   thread.  XXX This is a hack.

Sounds strange. If only a thread other than the main thread receives the signal and you ignore it, then it's lost, isn't it?
Furthermore, under Linux 2.6 and NPTL, getpid() returns the main thread PID even from another thread.
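
Both observations (the Python handler runs in the main thread no matter which thread the signal came from, and getpid() is the same in every thread) can be checked with a small script. This is a sketch assuming Python 3 on a POSIX system; the function name, the use of SIGUSR1 and the polling loop are illustrative choices, not anything from signalmodule.c.

```python
import os
import signal
import threading
import time


def observe_signal_from_subthread(signum=signal.SIGUSR1, timeout=5.0):
    """Send signum from a subthread and record where the handler runs.

    Must be called from the main thread, because signal.signal() only works there.
    """
    observed = {}

    def handler(sig, frame):
        # Record whether the Python-level handler executes in the main thread.
        observed["handler_in_main_thread"] = (
            threading.current_thread() is threading.main_thread()
        )

    def subthread():
        observed["subthread_pid"] = os.getpid()
        os.kill(os.getpid(), signum)

    old_handler = signal.signal(signum, handler)
    try:
        worker = threading.Thread(target=subthread)
        worker.start()
        worker.join()
        # Keep running bytecode in the main thread until the handler has fired.
        deadline = time.monotonic() + timeout
        while "handler_in_main_thread" not in observed and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signum, old_handler)
    observed["main_pid"] = os.getpid()
    return observed
```

Note that this only shows where the Python handler executes; it says nothing about which thread the kernel picked to receive the signal at the C level.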

Thoughts, anyone?
msg102850 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-11 16:14
Thanks for the detailed analysis, Charles-François.

> Finally, I think that the documentation should be rephrased:

Yes, I think so.

> Furthermore, under Linux 2.6 and NPTL, getpid() returns the main thread
> PID even from another thread.

Yes, those threads belong to the same process.

But as mentioned, signals are a rather fragile inter-process communication device; just use a specific file descriptor.
And if you still wanna use signals, there's set_wakeup_fd():
http://docs.python.org/library/signal.html#signal.set_wakeup_fd
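
A minimal sketch of the set_wakeup_fd() approach, assuming Python 3.5 or later on a POSIX system. The function name wait_for_signal, the socket pair, SIGUSR1 and the timer thread that sends the signal are illustrative assumptions. The point is that the C-level handler writes the signal number to the wakeup fd whichever thread receives the signal, so a select() that includes the read end wakes up.

```python
import os
import select
import signal
import socket
import threading


def wait_for_signal(signum=signal.SIGUSR1, timeout=5.0):
    """Block in select() until a signal byte shows up on the wakeup fd.

    Must be called from the main thread. Returns the list of signal numbers
    read from the wakeup socket (empty if select() timed out).
    """
    # A Python-level handler must be installed; otherwise the default action
    # for SIGUSR1 would terminate the process.
    old_handler = signal.signal(signum, lambda sig, frame: None)
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    wsock.setblocking(False)  # set_wakeup_fd() requires a non-blocking fd
    old_fd = signal.set_wakeup_fd(wsock.fileno())
    try:
        # For the demonstration, a subthread sends the signal after a short delay.
        sender = threading.Timer(0.1, os.kill, args=(os.getpid(), signum))
        sender.start()
        readable, _, _ = select.select([rsock], [], [], timeout)
        sender.join()
        data = rsock.recv(64) if readable else b""
        return list(data)  # each byte is a signal number
    finally:
        signal.set_wakeup_fd(old_fd)
        signal.signal(signum, old_handler)
        rsock.close()
        wsock.close()
```

In a real event loop the read end would simply be added to the set of descriptors already passed to select(), and drained whenever it becomes readable.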
msg244015 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2015-05-25 05:25
Agree with Charles-François's second explanation. This makes it very hard to reliably handle signals -- basically everyone has to remember to use set_wakeup_fd, and most people don't. For example, gunicorn is likely vulnerable to this because it doesn't use set_wakeup_fd. I suspect most code using select + signals is wrong.

I've attached a patch which fixes the issue for select(), but not any other functions. If it's considered a good patch, I can work on the rest of the functions in the select module. (Also, tests for the details of the behavior.)

Also the patch is pretty hokey, so I'd appreciate feedback if it's going to go in. :)
msg244019 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2015-05-25 08:31
Adding haypo since apparently he's been touching signals stuff a lot lately, maybe has some useful thoughts / review? :)
msg246963 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-07-20 05:32
This was turned into a doc issue, with no patch forthcoming, but Devin has submitted a bugfix.  Should this be turned back into a bug issue?
msg314669 - (view) Author: Patrick Fink (Patrick Fink) Date: 2018-03-29 20:04
A workaround to handle signals reliably, which I have now tested successfully, is to execute everything within a subthread and let the main thread just join this subthread. Like:

import signal
import threading

signal.signal(MY_SIGNAL, signal_handler)  # install the handler in the main thread
thread = threading.Thread(target=my_main_function)  # run the real work in a subthread
thread.start()
thread.join()  # the main thread only waits, so it stays free to run the handler

Done this way, the main thread should always respond to signals, regardless of whether the subthread is stuck.
History
Date | User | Action | Args
2022-04-11 14:56:45 | admin | set | github: 49565
2018-03-29 20:05:00 | Patrick Fink | set | nosy: + Patrick Fink; messages: + msg314669
2015-07-20 05:32:53 | terry.reedy | set | versions: + Python 3.4, Python 3.5, Python 3.6, - Python 2.6, Python 3.1, Python 3.2; nosy: + terry.reedy; messages: + msg246963; stage: patch review
2015-05-25 08:31:44 | Devin Jeanpierre | set | nosy: + vstinner; messages: + msg244019
2015-05-25 05:25:12 | Devin Jeanpierre | set | files: + select_select.diff; nosy: + Devin Jeanpierre; messages: + msg244015; keywords: + patch
2010-12-31 03:36:57 | s7v7nislands | set | nosy: + s7v7nislands
2010-10-29 10:07:21 | admin | set | assignee: georg.brandl -> docs@python
2010-07-01 09:25:50 | Netto | set | nosy: + Netto
2010-04-11 16:14:01 | pitrou | set | priority: normal; assignee: georg.brandl; components: + Documentation; versions: + Python 2.6, Python 3.1, Python 2.7, Python 3.2, - Python 2.5, Python 2.4; nosy: + pitrou, tim.peters, georg.brandl; messages: + msg102850
2010-04-11 12:32:12 | neologix | set | nosy: + neologix; messages: + msg102829
2010-03-02 21:27:32 | amcnabb | set | messages: + msg100309
2010-03-02 19:53:21 | amcnabb | set | nosy: + amcnabb; messages: + msg100306
2009-02-19 13:47:06 | pts | create |