Issue 9504: signal.signal/signal.alarm not working as expected

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/53713

classification

Title:	signal.signal/signal.alarm not working as expected
Type:	behavior	Stage:	resolved
Components:		Versions:	Python 3.2, Python 3.3, Python 2.7

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:		Nosy List:	alanwilter, georg.brandl, iritkatriel, jdemeyer, martin.panter, neologix, nirai, pitrou, ronaldoussoren, vstinner
Priority:	normal	Keywords:	patch

Created on 2010-08-04 08:33 by alanwilter, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
io_signal.diff	neologix, 2011-01-09 18:36

Messages (16)
msg112771 - (view)	Author: Alan Wilter (alanwilter)	Date: 2010-08-04 08:33
I have this example code to illustrate a problem I am having with python3. It works fine with python 2.6 and 2.7 but does not with python 3.1. from __future__ import print_function import os, subprocess, signal def signal_handler( signum, frame ): print( "PID: %s" % pid ) print( "Timed out! Process %s killed, max exec time (%ss) exceeded" % (pid, timeTol ) ) os.kill( int( pid ), 15 ) raise Exception( "Taking too long to finish... aborting!" ) if __name__ == '__main__': timeTol = 5 cmd = 'find /' signal.signal(signal.SIGALRM, signal_handler) signal.alarm(timeTol) p = subprocess.Popen(cmd, shell=True, stderr = subprocess.STDOUT, stdout = subprocess.PIPE) pid = p.pid out = str( p.communicate()[0].decode() ) print(out) Executing it: time python3.1 timout.py PID: 27687 Timed out! Process 27687 killed, max exec time (5s) exceeded Traceback (most recent call last): File "timout.py", line 23, in <module> out = str( p.communicate()[0].decode() ) File "/sw/lib/python3.1/subprocess.py", line 719, in communicate stdout = self.stdout.read() File "timout.py", line 9, in signal_handler raise Exception( "Taking too long to finish... aborting!" ) Exception: Taking too long to finish... aborting! python3.1 timout.py 0.52s user 3.88s system 19% cpu 22.841 total #### It prints essentially the same thing with a very big difference it takes 22 seconds and actually the alarm only works when the whole task ('find /') is finished.
msg125815 - (view)	Author: Charles-François Natali (neologix) *	Date: 2011-01-09 00:26
It's due to the way the python interpreter handles signals: when the signal is received, python runs a stub signal handler that just sets a flag indicating that the signal has been received: the actual handler is executed later, synchronously, mainly from the main evaluation loop. The problem here is that since the main process is performing a read until EOF is reached (i.e. until the subprocess exits), the C code loops around the read(2) call without giving a chance for the handler to run. A patch for is similar issue has been applied, see http://bugs.python.org/issue9617# But it only fixed writes, not reads. I think _bufferedreader_read_all and _bufferedreader_read_generic should be fixed too to call PyErr_CheckSignals().
msg125830 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2011-01-09 07:43
Antoine?
msg125845 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-01-09 15:13
Charles-François' analysis seems to be right. Note that the actual issue here is that read() always succeeds, returning a partial result (because you're executing a command, 'find /', which outputs a lot of data). If read() were interrupted before anything could be read, it would return EINTR and the handler would get executed immediately. Anyone wants to propose a patch + tests?
msg125851 - (view)	Author: Charles-François Natali (neologix) *	Date: 2011-01-09 18:36
> Antoine Pitrou <pitrou@free.fr> added the comment: > > Charles-François' analysis seems to be right. Note that the actual issue here is that read() always succeeds, returning a partial result (because you're executing a command, 'find /', which outputs a lot of data). If read() were interrupted before anything could be read, it would return EINTR and the handler would get executed immediately. > > Anyone wants to propose a patch + tests? > Attached is a tentative patch: it checks for pending signals inside _bufferedreader_read_generic, fileio_readall, and rawiobase_readall. I tested quickly on regular files and on pipes, and handlers are now called on time. As a an obvious bonus, it's also easier to interrupt a script doing I/O with a KeyboardInterrupt (SIGINT). If nothing seems blatantly wrong about this patch, I will go ahead and try to write some tests for this. Charles > ---------- > versions: +Python 3.2 > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue9504> > _______________________________________ >
msg125863 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-01-09 20:59
> Attached is a tentative patch: it checks for pending signals inside > _bufferedreader_read_generic, fileio_readall, and rawiobase_readall. > I tested quickly on regular files and on pipes, and handlers are now > called on time. As a an obvious bonus, it's also easier to interrupt a > script doing I/O with a KeyboardInterrupt (SIGINT). > If nothing seems blatantly wrong about this patch, I will go ahead and > try to write some tests for this. A potential downside is that the bytes which have been read simply get lost. But since a signal can happen at any moment I guess nothing can be done to protect from it in the general case, anyway. As for the tests, the SignalsTest class in test_io.py is a natural recipient.
msg127960 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-02-05 01:05
It seems that my patch for issue10956 also fixes this issue (although it doesn't try to fix FileIO). It also has tests and fixes read() as well as write(). I'm sorry for not noticing that the two issues were so similar. Suggest closing this issue as duplicate.
msg127964 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-02-05 01:12
Sorry, bad analysis. The patch in issue10956 only seemed to work because my system is too fast and "find /" in the script returned in less than 5 seconds. Replacing it with "yes" shows that the two issued are actually independent.
msg254946 - (view)	Author: Martin Panter (martin.panter) *	Date: 2015-11-20 01:53
With Python’s current technique for signal handling, I think there is always going to be a window where a signal can arrive at the low level after checking the Python flag and before starting a blocking call. If you need to robustly handle a signal maybe it is better to use set_signal_fd() and select() or something. I haven’t looked at the patch closely, but it sounds like it could reduce the size of the window. This would make interactive signals like SIGINT more reliable, so it may be worthwhile. It probably needs properly reviewing in light of the recent EINTR (PEP 457) changes though.
msg254948 - (view)	Author: Martin Panter (martin.panter) *	Date: 2015-11-20 01:59
Sorry, EINTR changes were PEP 475.
msg255484 - (view)	Author: Jeroen Demeyer (jdemeyer) *	Date: 2015-11-27 20:35
Just a comment: if you need really robust signal handling, you just cannot do it with pure Python. I would recommend using Cython, where one has complete control over when signals are checked.
msg255499 - (view)	Author: STINNER Victor (vstinner) *	Date: 2015-11-27 22:44
I don't understand this issue. I would expect that a system call fails with EINTR if it is interrupted by a signal. Python explicitly configures the C library to get EINTR errors: automatic retry is disabled when Python sets its signal handlers. If a system call doesn't fail, what shoud we do with the result, especially when reading data (read/recv)? Yes, we _can_ checks to check regulary if we got a C signal by adding checks, but it's unclear to me if we must raise the exception immediatly, or later. Immedialty if I understand correctly the initial message. So should we check all system calls, or only system calls well known to block on I/O, like read/write, send/recv, select, etc.? Do we try to workaround design choices by the Linux kernel and the GNU libc? Right now, my favorite choice would be to do nothing. Yes, sometimes signals are delayed. Sometimes, a signal doesn't interrupt immediatly a blocked program. It's not perfect. But it's already very complex, I'm not sure that I want to make the code even more complex to check for signals in more places.
msg255520 - (view)	Author: Martin Panter (martin.panter) *	Date: 2015-11-28 03:04
Okay, I tend to agree, and don’t have a problem with rejecting this.
msg255536 - (view)	Author: Ronald Oussoren (ronaldoussoren) *	Date: 2015-11-28 10:48
Victor: read(2) and recv(2) are defined as returning all available data when interrupted by a signal and might therefore not return an error. The same is true for write(2): it returns the number of bytes written when there are any bytes written (instead of setting errno to SIGINT and returning an error). I'd expect the C implementation of readall to behave the same as the Python implementation: a signal handler that raises an exception can cause data loss due to discarding the buffer that's used to collect the data. The data loss in the script isn't worse than it currently is, AFAIK it would currently lose data due to raising the exception just after the call to readall in the next loop through the eval loop. If I read the issue correctly the signal handling issue mentioned here is basically similar to the one in this scriptlet: # BEGIN import signal, sys import subprocess def handler(sig, frames): raise RuntimeError("Alarm!") signal.signal(signal.SIGALRM, handler) signal.alarm(5) b = sys.stdin.buffer.raw.readall() print(len(b)) signal.alarm(0) #END When you run this with a large file on stdin (I've used /dev/zero) the script will continue to run indefinitely and cannot be cleanly interrupted at all (on my system its eventually killed by the system due to running out of its rlimit limitations, otherwise it can only be killed by sending it an uncatchable signal like SIGKILL or SIGABRT).
msg408259 - (view)	Author: Irit Katriel (iritkatriel) *	Date: 2021-12-10 23:20
I'm unable to reproduce this on 3.11 on a Mac. Has it been fixed?
msg408280 - (view)	Author: STINNER Victor (vstinner) *	Date: 2021-12-11 00:30
https://www.python.org/dev/peps/pep-0475/ changed deeply how Python handles signals. Python now tries to restart syscalls interrupted by signals (EINTR). A Python function only raises an exception if the Python signal handler raises an exception.

History
Date	User	Action	Args
2022-04-11 14:57:04	admin	set	github: 53713
2021-12-11 19:08:53	iritkatriel	set	status: open -> closed resolution: out of date stage: needs patch -> resolved
2021-12-11 00:30:23	vstinner	set	status: pending -> open messages: + msg408280
2021-12-10 23:20:56	iritkatriel	set	status: open -> pending nosy: + iritkatriel messages: + msg408259
2015-11-28 10:48:14	ronaldoussoren	set	nosy: + ronaldoussoren messages: + msg255536
2015-11-28 03:04:18	martin.panter	set	messages: + msg255520
2015-11-27 22:44:51	vstinner	set	messages: + msg255499
2015-11-27 20:35:53	jdemeyer	set	nosy: + jdemeyer messages: + msg255484
2015-11-20 14:45:14	vstinner	set	nosy: + vstinner
2015-11-20 01:59:53	martin.panter	set	messages: + msg254948
2015-11-20 01:53:04	martin.panter	set	nosy: + martin.panter messages: + msg254946
2011-06-30 17:58:24	pitrou	set	assignee: pitrou -> stage: patch review -> needs patch versions: + Python 3.3, - Python 3.1
2011-05-10 02:30:36	nirai	set	nosy: + nirai
2011-02-05 01:12:58	pitrou	set	versions: + Python 2.7 nosy: georg.brandl, pitrou, neologix, alanwilter messages: + msg127964 resolution: duplicate -> (no value) superseder: file.write and file.read don't handle EINTR -> stage: patch review
2011-02-05 01:05:51	pitrou	set	nosy: georg.brandl, pitrou, neologix, alanwilter resolution: duplicate superseder: file.write and file.read don't handle EINTR messages: + msg127960
2011-01-09 20:59:24	pitrou	set	nosy: georg.brandl, pitrou, neologix, alanwilter messages: + msg125863
2011-01-09 18:36:40	neologix	set	files: + io_signal.diff messages: + msg125851 keywords: + patch nosy: georg.brandl, pitrou, neologix, alanwilter
2011-01-09 15:13:57	pitrou	set	nosy: georg.brandl, pitrou, neologix, alanwilter messages: + msg125845 versions: + Python 3.2
2011-01-09 07:43:29	georg.brandl	set	assignee: pitrou messages: + msg125830 nosy: + georg.brandl
2011-01-09 01:41:53	eric.araujo	set	nosy: + pitrou
2011-01-09 00:26:56	neologix	set	nosy: + neologix messages: + msg125815
2010-08-04 08:33:55	alanwilter	create