classification
Title: Interrupted system calls are not retried
Type: behavior Stage:
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: file readline, readlines & readall methods can lose data on EINTR
View: 12268
Assigned To: Nosy List: DasIch, Trundle, aronacher, benjamin.peterson, exarkun, loewis, ned.deily, pitrou, ronaldoussoren, stutzbach
Priority: normal Keywords:

Created on 2010-09-16 02:02 by aronacher, last changed 2012-07-12 21:36 by pitrou. This issue is now closed.

Messages (20)
msg116504 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 02:02
Currently Python does not check fread and other IO calls for EINTR.  This usually is not an issue, but on OS X a continued program will be sent a SIGCONT signal, which causes fread to be interrupted.

Testcase:

mitsuhiko@nausicaa:~$ python2.7
Python 2.7 (r27:82508, Jul  3 2010, 21:12:11) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from signal import SIGCONT, signal
>>> def show_signal(*args):
...  print 'Got SIGCONT'
...  
>>> signal(SIGCONT, show_signal)
0
>>> import sys
>>> sys.stdin.read()
^Z
[1]+  Stopped                 python2.7
mitsuhiko@nausicaa:~$ fg
python2.7
Got SIGCONT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 4] Interrupted system call
>>> 

Expected behavior: on fg it should continue to read.  The solution would be to loop all calls to fread and friends until errno is no longer EINTR.  Now the question is how best to do that.  I can't think of a portable way to define a macro that keeps re-running an expression for as long as errno is EINTR; maybe someone else has an idea.

Otherwise it would be possible to just put the loops by hand around each fread/fgetc etc. call, but that would make the code quite a bit more ugly.

Technically I suppose the problem applies to all platforms, on OS X it's just easier to trigger.
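
For illustration, here is a minimal sketch of the retry loop proposed above, written in Python rather than C for brevity. The helper name `retry_on_eintr` and the `FlakyReader` stand-in are hypothetical and invented for this example; they are not part of CPython. The stand-in only simulates a read that fails with EINTR twice before succeeding.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func repeatedly until it stops failing with EINTR (hypothetical helper)."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise  # any other error is a real failure and is reported


class FlakyReader:
    """Stand-in for a read call that gets interrupted a few times (illustration only)."""

    def __init__(self, interruptions, payload):
        self.interruptions = interruptions
        self.payload = payload
        self.calls = 0

    def read(self):
        self.calls += 1
        if self.interruptions > 0:
            self.interruptions -= 1
            # Simulate what an interrupted fread/read reports.
            raise OSError(errno.EINTR, "Interrupted system call")
        return self.payload


reader = FlakyReader(interruptions=2, payload=b"test\n")
data = retry_on_eintr(reader.read)
```

Note that this version retries unconditionally, which is exactly the behavior questioned later in this thread.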
msg116505 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2010-09-16 03:09
The test fails exactly the same way using Python 2.6.6 on a current Debian (testing) Linux 2.6.32, so I think it is better to remove OS X from the title.  Also, the versions field refers to where a potential fix might be applied; that rules out 2.5 and 2.6 since it is not a security problem.

I was also curious whether calling signal.siginterrupt for SIGCONT had any effect on this.  Neither False nor True seemed to change the behavior on either OS X or Linux.
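
A rough sketch of what such an experiment might look like (Unix only). The handler name `on_sigcont` is made up for the example. According to the signal module documentation, installing a handler with signal.signal() resets the restart behaviour to "interruptible", so siginterrupt() has to be called after the handler is installed, not before.

```python
import signal


def on_sigcont(signum, frame):
    # Hypothetical no-op handler; the report above printed a message here.
    pass


# Install the handler first: signal.signal() implicitly resets the
# restart behaviour for this signal.
signal.signal(signal.SIGCONT, on_sigcont)

# flag=False asks the OS to restart interrupted system calls for SIGCONT;
# flag=True asks it to interrupt them instead.
signal.siginterrupt(signal.SIGCONT, False)

installed = signal.getsignal(signal.SIGCONT)
```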
msg116521 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-09-16 11:28
I fail to see why this is a bug. If the system call is interrupted, why should Python not report that?
msg116524 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 11:47
One could argue of course that every user of Python should handle EINTR, but that's something I think should be solved in the IO library because very few people know that one is supposed to restart syscalls on EINTR on POSIX systems.

Ruby for instance handles EINTR properly:

mitsuhiko@nausicaa:~$ ruby -e 'puts $stdin.read.inspect'
^Z
[1]+  Stopped
mitsuhiko@nausicaa:~$ fg
ruby -e 'puts $stdin.read.inspect'
test
"test\n"



So does perl:

mitsuhiko@nausicaa:~$ perl -e 'chomp($x = <STDIN>); print $x'
^Z
[1]+  Stopped
mitsuhiko@nausicaa:~$ fg
perl -e 'chomp($x = <STDIN>); print $x'
test
test
msg116525 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 11:49
Interestingly even PHP handles that properly.
msg116529 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-09-16 12:03
> One could argue of course that every user of Python should handle
> EINTR, but that's something I think should be solved in the IO
> library because very few people know that one is supposed to restart
> syscalls on EINTR on POSIX systems.
>
> Ruby for instance handles EINTR properly:

Hmm. So under what conditions should it continue, and under what 
conditions should it raise an exception (when errno is EINTR)?
msg116530 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 12:05
> Hmm. So under what conditions should it continue, and under what 
> conditions should it raise an exception (when errno is EINTR)?

EINTR indicates a temporary failure.  In that case it should always retry.

A common macro for handling that might look like this:

#define RETRY_ON_EINTR(x) ({ \
  typeof(x) rv; \
  do { rv = x; } while (rv < 0 && errno == EINTR); \
  rv;\
})

But from what I understand, braces in parentheses are a GCC extension.
msg116532 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-09-16 12:13
On 16.09.10 14:06, Armin Ronacher wrote:
>
> Armin Ronacher <armin.ronacher@active-4.com> added the comment:
>
>> Hmm. So under what conditions should it continue, and under what
>> conditions should it raise an exception (when errno is EINTR)?
>
> EINTR indicates a temporary failure.  In that case it should always retry.

But Ruby doesn't. If you send SIGINT, it will print

-e:1:in `read': Interrupt
	from -e:1

If you send SIGHUP, it will print

Hangup

So it is surely more complex than "always retry".
msg116534 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-09-16 12:18
Wouldn't retrying on EINTR cause havoc when you try to interrupt a process?

That is: what would happen with the proposed patch when a python script does a read that takes a very long time and the user tries to interrupt the script (by using Ctrl+C to send a SIGINT)?

If my understanding is correct, the patch will ensure that the process does not get interrupted, because the default SIGINT handler just sets a flag that's periodically checked in the Python interpreter loop.  With the proposed patch, Python would not get around to checking that flag until the I/O operation is finished.
msg116538 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 12:34
The following minimal C code shows how EINTR can be handled:

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>   /* getpid() */


int
main(void)
{
    printf("PID = %d\n", (int)getpid());
    while (1) {
        int rv = fgetc(stdin);
        if (rv == EOF) {
            /* real end of input: stop */
            if (feof(stdin))
                break;
            /* interrupted by a signal: clear the stream error and retry */
            if (errno == EINTR) {
                clearerr(stdin);
                continue;
            }
            printf("Call failed with %d\n", errno);
            return 1;
        }
        else
            fputc(rv, stdout);
    }
    return 0;
}



Test application:

mitsuhiko@nausicaa:/tmp$ ./a.out 
PID = 22806
Terminated
mitsuhiko@nausicaa:/tmp$ ./a.out 
PID = 22809

mitsuhiko@nausicaa:/tmp$ ./a.out 
PID = 22812
^Z
[2]+  Stopped                 ./a.out
mitsuhiko@nausicaa:/tmp$ fg
./a.out
test
test
foo
foo

The first signal sent was TERM, the second was INT.  In the last case the process was stopped with ^Z and resumed with fg: it received the (ignored) SIGCONT signal, fgetc returned -1, and fgetc was called again because errno was EINTR.
msg116539 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 12:36
> Wouldn't retrying on EINTR cause havoc when you try to interrupt a process?

All your C applications are doing it, why should Python cause havoc there?  Check the POSIX specification on that if you don't trust me.

> That is: what would happen with the proposed patch when a python script
> does a read that takes a very long time and the user tries to interrupt
> the script (by using Ctrl+C to send a SIGINT)?
In the case of fread, EINTR is only returned if the call was interrupted before anything was read.

Here is a quick explanation from the GNU libc manual:
http://www.gnu.org/s/libc/manual/html_node/Interrupted-Primitives.html
msg116540 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 12:38
There is a funny story related to that though :)

"BSD avoids EINTR entirely and provides a more convenient approach:
 to restart the interrupted primitive, instead of making it fail."

BSD does, but the Mach/XNU kernel combo on OS X does not.  Which is why all the shipped BSD tools have that bug, but if you run their GNU equivalents on OS X everything works as expected.
msg116541 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-16 12:55
Some parts of the stdlib already retry manually (such as SocketIO, subprocess, multiprocessing, socket.sendall), so it doesn't sound unreasonable for the IO lib to retry too.

There are/were other people complaining in similar cases: #7978, #1628205.
msg116544 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-09-16 13:27
On 16 Sep, 2010, at 14:36, Armin Ronacher wrote:

> 
> Armin Ronacher <armin.ronacher@active-4.com> added the comment:
> 
>> Wouldn't retrying on EINTR cause havoc when you try to interrupt a process?
> 
> All your C applications are doing it, why should Python cause havoc there?  Check the POSIX specification on that if you don't trust me.
> 
>> That is: what would happen with the proposed patch when a python script
>> does a read that takes a very long time and the user tries to interrupt
>> the script (by using Ctrl+C to send a SIGINT)?
> In the case of fread, EINTR is only returned if the call was interrupted before anything was read.
> 
> Here is a quick explanation from the GNU libc manual:
> http://www.gnu.org/s/libc/manual/html_node/Interrupted-Primitives.html

You conveniently didn't quote the part of my message where I explained why I think there may be a problem.

CPython's signal handlers just set a global flag to indicate that a signal occurred and run the actual signal handler later on from the main interpreter loop, see signal_handler in Modules/signal.c and intcatcher in Parser/intrcheck.c.  

The latter contains the default handler for SIGINT and that already contains code that deals with SIGINT not having any effect (when you send SIGINT twice in a row without CPython running pending calls, the interpreter gets aborted).

Because Python's signal handlers only set a flag and do the actual action later on, blindly rerunning system calls when errno == EINTR may result in programs that don't seem to react to signals at all.
msg116545 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-09-16 13:29
On 16 Sep, 2010, at 14:38, Armin Ronacher wrote:

> 
> Armin Ronacher <armin.ronacher@active-4.com> added the comment:
> 
> There is a funny story related to that though :)
> 
> "BSD avoids EINTR entirely and provides a more convenient approach:
> to restart the interrupted primitive, instead of making it fail."
> 
> BSD does, but the Mach/XNU kernel combo on OS X does not.  Which is why all the shipped BSD tools have that bug, but if you run their GNU equivalents on OS X everything works as expected.

Setting the SA_RESTART flag in the call to sigaction should work (on OSX HAVE_SIGACTION is defined), unless the manpage is lying.

Ronald
msg116546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-16 13:33
> Because Python's signal handlers only set a flag and do the actual
> action later on, blindly rerunning system calls when errno == EINTR may
> result in programs that don't seem to react to signals at all.

You just need to call PyErr_CheckSignals() and check its result.
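
To make the idea concrete, here is a toy Python model of "retry on EINTR, but check for pending signals first". It is only a sketch under stated assumptions: `PendingSignals`, `read_with_retry` and `make_reader` are hypothetical names, and `PendingSignals.check()` merely plays the role that PyErr_CheckSignals() plays in C; it is not CPython's real machinery.

```python
import errno


class PendingSignals:
    """Toy model of 'the handler sets a flag, the interpreter acts on it later'."""

    def __init__(self):
        self.interrupt_requested = False

    def check(self):
        # Stand-in for PyErr_CheckSignals(): run deferred handlers, which may raise.
        if self.interrupt_requested:
            self.interrupt_requested = False
            raise KeyboardInterrupt


def read_with_retry(read_once, pending):
    """Retry read_once on EINTR unless a deferred handler raises."""
    while True:
        try:
            return read_once()
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            pending.check()  # may raise KeyboardInterrupt and abort the retry


def make_reader(pending, events, payload):
    """Build a fake read that is interrupted once per entry in events."""
    events = list(events)

    def read_once():
        if events:
            if events.pop(0) == "SIGINT":
                pending.interrupt_requested = True  # what the C-level handler would do
            raise OSError(errno.EINTR, "Interrupted system call")
        return payload

    return read_once


pending = PendingSignals()

# A SIGCONT-like interruption: nothing is pending, so the read is retried.
resumed = read_with_retry(make_reader(pending, ["SIGCONT"], b"test\n"), pending)

# A SIGINT-like interruption: the deferred handler raises, so the read is abandoned.
try:
    read_with_retry(make_reader(pending, ["SIGINT"], b"never returned"), pending)
    interrupted = False
except KeyboardInterrupt:
    interrupted = True
```

In this model the read resumes after a harmless signal but still reacts to Ctrl+C, which is the distinction the discussion above is about.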
msg116547 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 13:36
> Setting the SA_RESTART flag in the call to sigaction should work (on OSX HAVE_SIGACTION is defined), unless the manpage is lying.

It should work, haven't tried.  From what I understand on a BSD system, retrying is the default.
msg116548 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2010-09-16 13:40
> You conveniently didn't quote the part of my message where I explained 
> why I think there may be a problem.
I understand that, but there are already cases in Python where EINTR is handled properly.  In fact, quoting socketmodule.c:

    if (res == EINTR && PyErr_CheckSignals())
msg116549 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-09-16 13:42
On 16 Sep, 2010, at 15:40, Armin Ronacher wrote:

> 
> Armin Ronacher <armin.ronacher@active-4.com> added the comment:
> 
>> You conveniently didn't quote the part of my message where I explained 
>> why I think there may be a problem.
> I understand that, but there are already cases in Python where EINTR is handled properly.  In fact, quoting socketmodule.c:
> 
>    if (res == EINTR && PyErr_CheckSignals())

This looks fine.

Ronald
msg165335 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-07-12 21:35
This has been fixed in issue #10956 (Python 3 and the io module) and issue #12268 (Python 2's file objects).
History
Date  User  Action  Args
2012-07-12 21:36:23  pitrou  set  superseder: file readline, readlines & readall methods can lose data on EINTR
    resolution: out of date -> duplicate
    versions: - Python 3.2, Python 3.3
2012-07-12 21:35:33  pitrou  set  status: open -> closed
    versions: - Python 3.1
    messages: + msg165335
    components: + Interpreter Core
    resolution: out of date
2012-07-12 21:02:50  DasIch  set  nosy: + DasIch
2010-09-16 13:42:04  ronaldoussoren  set  messages: + msg116549
2010-09-16 13:40:22  aronacher  set  messages: + msg116548
2010-09-16 13:36:02  aronacher  set  messages: + msg116547
    versions: + Python 3.3
2010-09-16 13:33:23  pitrou  set  messages: + msg116546
2010-09-16 13:29:15  ronaldoussoren  set  messages: + msg116545
2010-09-16 13:27:57  ronaldoussoren  set  messages: + msg116544
2010-09-16 13:21:01  Trundle  set  nosy: + Trundle
2010-09-16 12:55:10  pitrou  set  nosy: + pitrou, stutzbach, benjamin.peterson, exarkun
    messages: + msg116541
    versions: - Python 3.3
2010-09-16 12:38:39  aronacher  set  messages: + msg116540
2010-09-16 12:36:48  aronacher  set  messages: + msg116539
2010-09-16 12:34:04  aronacher  set  messages: + msg116538
2010-09-16 12:18:58  ronaldoussoren  set  nosy: + ronaldoussoren
    messages: + msg116534
2010-09-16 12:13:09  loewis  set  messages: + msg116532
2010-09-16 12:05:59  aronacher  set  messages: + msg116530
2010-09-16 12:03:37  loewis  set  messages: + msg116529
2010-09-16 11:49:29  aronacher  set  messages: + msg116525
2010-09-16 11:47:39  aronacher  set  messages: + msg116524
2010-09-16 11:28:48  loewis  set  nosy: + loewis
    messages: + msg116521
2010-09-16 10:25:11  aronacher  set  versions: + Python 3.3
2010-09-16 03:09:55  ned.deily  set  nosy: + ned.deily
    title: Interrupted system calls are not retried on OS X -> Interrupted system calls are not retried
    messages: + msg116505
    versions: - Python 2.6, Python 2.5, Python 3.3
2010-09-16 02:02:52  aronacher  create