classification
Title: subprocess is not EINTR-safe
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.2, Python 3.1, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: amaury.forgeotdarc, astrand, cmiller, dmalcolm, gregory.p.smith, jnoller, jyasskin, mathmodave, mattjohnston, mpitt, naufraghi, nirs, r.david.murray, rnk, schmir, timjr
Priority: normal Keywords: patch

Created on 2004-11-17 21:07 by astrand, last changed 2010-03-01 00:44 by gregory.p.smith. This issue is now closed.

Files
File name Uploaded Description Edit
nointr.patch astrand, 2004-11-17 21:09 Patch for subprocess.py and test_subprocess.py
no-EINTR-subprocess.py-25-maint-r65475.patch naufraghi, 2008-12-24 10:36 no EINTR patch upgraded to 25-maint r65475
trunk-diff-unified.txt cmiller, 2009-08-12 18:31 patch against trunk
bugtest.py mathmodave, 2010-01-14 10:58 Reproduces fault
Messages (29)
msg23177 - (view) Author: Peter ├ůstrand (astrand) * (Python committer) Date: 2004-11-17 21:07
The subprocess module is not safe for use with signals,
because it doesn't retry the system calls upon EINTR.
However, as far as I understand it, this is true for
most other Python modules as well, so it isn't obvious
that the subprocess needs to be fixed. 

The problem was first noticed by John P Speno. 
msg23178 - (view) Author: Peter ├ůstrand (astrand) * (Python committer) Date: 2004-11-17 21:15
Logged In: YES 
user_id=344921

One way of testing subprocess for signal-safeness is to
insert these lines just after _cleanup():

import signal
signal.signal(signal.SIGALRM, lambda x,y: 1)
signal.alarm(1)
import time
time.sleep(0.99)

Then run test_subprocess.py. 
msg23179 - (view) Author: Matt Johnston (mattjohnston) Date: 2004-12-22 07:07
Logged In: YES 
user_id=785805

I've hit this on a Solaris 9 box without explicitly using
signals. Using the DCOracle module, a seperate Oracle
process is executed. When this terminates, a SIGCHLD is sent
to the calling python process, which may be in the middle of
a select() in the communicate() call, causing EINTR. From
the output of truss (like strace), a sigchld handler doesn't
appear to be getting explicitly installed by the Oracle module.

SunOS 5.9 Generic_112233-01 sun4u sparc SUNW,Sun-Fire-280R
msg23180 - (view) Author: Martin Pitt (mpitt) Date: 2007-02-26 12:15
I just got two different Ubuntu bug reports about this problem as well, and I'm unsure how to circumvent this at the application level.

  http://librarian.launchpad.net/6514580/Traceback.txt
  http://librarian.launchpad.net/6527195/Traceback.txt

(from https://launchpad.net/bugs/87292 and its duplicate)
msg23181 - (view) Author: Martin Pitt (mpitt) Date: 2007-03-14 22:36
I updated Peter's original patch to 2.5+svn fixes and added proper tests to test_subprocess.py. It works great now.

What do you think about this approach? Fixing it only in submodule feels a bit strange, but then again, this is meant to be an easy to use abstraction, and most of the people that were hit by this (according to Google) encountered the problem in subprocess.

I don't see how to attach something here, so I attached the updated patch to the Ubuntu bug (https://launchpad.net/bugs/87292):

  http://librarian.launchpad.net/6807594/subprocess-eintr-safety.patch

Thanks,

Martin
msg23182 - (view) Author: Martin Pitt (mpitt) Date: 2007-03-15 08:57
Updated patch: http://librarian.launchpad.net/6820347/subprocess-eintr-safety.patch

I removed the _read_all() function, since that broke the 1 MB exception limit and does not seem to be necessary in the first place.
msg23183 - (view) Author: Tim Daly, Jr. (timjr) Date: 2007-07-27 04:14
I hit this in Python 2.5.1 on an Intel Mac in a PyQt application.  mpitt's patch at http://librarian.launchpad.net/6820347/subprocess-eintr-safety.patch fixed it for me.
msg57055 - (view) Author: Ralf Schmitt (schmir) Date: 2007-11-02 16:30
In normal application code that signal.alarm is called for a reason. And
the caller most probably expects it to interrupt the read calls. So I
think the proposed patch is wrong. 

What was the signal interrupting the read calls? maybe SIGPIPE?
msg64821 - (view) Author: Ralf Schmitt (schmir) Date: 2008-04-01 20:36
Of course the signal handler may raise an exception, so my last argument
isn't that good.
msg64832 - (view) Author: Jeffrey Yasskin (jyasskin) * (Python committer) Date: 2008-04-02 03:02
I think the proper behavior on EINTR may depend on which subprocess call
we're in. For example, the user can easily loop on .wait() herself if
she wants to ignore EINTR. But it's a lot harder to loop on Popen() if
the read() in _execute_child is interrupted. So my inclination would be
to let EINTR be raised in the first case, and use a loop to handle it in
the second.

Regarding the patch, a wrapper function called as
"retry_on_eintr(obj.write, data)" might be a cleaner way to do it.
msg69397 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-07-07 20:50
fyi - To fix issue #2113 I added handling of a select.error errno.EINTR
being raised during the select.select call in r64756.
msg70433 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-07-30 17:27
I think this should be resolved in 2.6/3.0 if possible. Especially if 
distributions like Ubuntu are self-patching the fix into place. For 
reference, see: http://mg.pov.lt/blog/subprocess-in-2.4
msg72593 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-09-05 13:40
I'd like to suggest to rise the priority of this bug.
Till this bus is around, no way using any module using subprocess.Popen 
form a PyQt app (and I suppose PyGtk and wxPython too).
msg72596 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-05 14:43
Two remarks:
1 - The part of the patch around the call to select.select() is already
in trunk since r64756, almost in the same form. good.

2 - the patch seems to replace all calls to os.write, os.read and
os.waipid. But it is based on a very old version of subprocess, and
r38169 added a new call to os.waitpid. I don't know if it should be
replaced as well.
msg72709 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-09-06 21:51
its too late in the release process for subprocess internals being EINTR
safe to make it into 2.6 but it is reasonable for 2.6.1.
msg73336 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-09-17 17:32
Upgrade subprocess.py patch to 25-maint r65475
(apply cleanly with http://bugs.python.org/issue2113 fixed)
msg74916 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-10-17 14:52
Factorized try-except code, merged r65475 from 25-maint.
Protetect std[in|out|err] read and write too.
msg74924 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-10-17 16:30
Ups, forgot a _no_intr around select.select

Index: subprocess.py
===================================================================
--- subprocess.py	(revisione 19645)
+++ subprocess.py	(copia locale)
@@ -1178,7 +1178,7 @@
 
             input_offset = 0
             while read_set or write_set:
-                rlist, wlist, xlist = select.select(read_set, 
write_set, [])
+                rlist, wlist, xlist = _no_intr(select.select)(read_set, 
write_set, [])
 
                 if self.stdin in wlist:
                     # When select has indicated that the file is 
writable,
msg78033 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-12-18 17:21
Python 2.5.3 is near but the I think the fix in
http://svn.python.org/view?rev=65475&view=rev
is not enough, there are a lot of other places where EINTR can cause and 
error.
msg78103 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-12-20 13:52
naufraghi> there are a lot of other places where EINTR 
naufraghi> can cause and error.

Can you write a list of these places?
msg78246 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-12-23 18:54
Please have a look at the proposed patch:

http://bugs.python.org/file11511/subprocess-eintr-safety-25maint-
r65475.patch

the list is more or less the patch itself.
msg78253 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-12-24 02:03
Instead of define a method for each "syscall", you can write a generic 
function that retry at OSError(EINTR):

def no_intr(self, func, *args, **kw):
  while True:
    try:
      return func(*args, **kw)
    except OSError, e:
      if e.errno == errno.EINTR:
        continue
      else:
        raise

x=_waitpid_no_intr(pid, options) becomes x=no_intr(os.waitpid, pid, 
options).
msg78254 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-12-24 02:08
Oh, the new patch (subprocess-retry-on-EINTR-std-in-out-err.diff) has 
already a generic function (_no_intr) which is a little bit different 
than mine (looks like a decorator). Can you delete your old patch?
msg78259 - (view) Author: Matteo Bertini (naufraghi) Date: 2008-12-24 10:36
no EINTR patch upgraded to 25-maint r65475 that protects:

*) all direct calls
*) all returned fd

I hope :P
msg79391 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-01-08 01:29
Since Python 2.5 only accept security fixes, you should update your 
patch to Python trunk.
msg91502 - (view) Author: Chad Miller (cmiller) Date: 2009-08-12 18:31
File
"/home/cmiller/work/cabzr/desktopcouch/getport-at-call-time/desktopcouch/start_local_couchdb.py",
line 93, in run_couchdb
    retcode = subprocess.call(local_exec)
  File "/usr/lib/python2.6/subprocess.py", line 444, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.6/subprocess.py", line 1123, in wait
    pid, sts = os.waitpid(self.pid, 0)
exceptions.OSError: [Errno 4] Interrupted system call

Now what?  The process started, but I have no way of knowing when it
finishes or the exit value when it does, because I don't have access to
the Popen object.  Nor can I even kill it and try again, because I can't
get he process id.

try/except in my code can never help.  It must be put in the stdlib.

Or, if this is too egregious, then the docs should scream that
subprocess.call can never safely be used, and users should avoid it.



  File
"/home/cmiller/work/cabzr/desktopcouch/getport-at-call-time/desktopcouch/start_local_couchdb.py",
line 93, in run_couchdb
    retcode = subprocess.call(local_exec)
  File "/usr/lib/python2.6/subprocess.py", line 444, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.6/subprocess.py", line 595, in __init__
    errread, errwrite)
  File "/usr/lib/python2.6/subprocess.py", line 1084, in _execute_child
    data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB
exceptions.OSError: [Errno 4] Interrupted system call


This os.read is a byproduct of something the Popen.__init__
implementation must do, and it is safe for it to continue to get the
information it needs, without the user's knowledge.

The process is started, then this is aborted before the Popen.stdout and
.stderr are set up, leaving the object in a weird state.
msg97756 - (view) Author: David Oxley (mathmodave) Date: 2010-01-14 10:58
Another instance of a blocking function within subprocess not being protected against EINTR 

Python 2.6.4, subprocess.py, Popen function, line 1115:
data = os.read(errpipe_read, 1048576) # Exceptions limited to 1 MB

If a signal arrives while blocked in this read, the EINTR/OSError is passed up to whatever called subprocess.Popen. Retrying the Popen doesn't help because the child process may already have started but the caller has no way to know this nor does the caller have any control over the child process.

===

In the example code, the first subprocess.Popen starts without issue but while in the second Popen call, p1's SIGCHLD is received by the parent. 
p2 is never set, but the second copy of /bin/date starts running anyway.

The "preexec_fn = lambda : time.sleep(2)" in the second Popen is a little contrived but serves to guarantee that the SIGCHLD will break the Popen for the purposes of the demonstration. I have seen this failure mode when using vanilla Popen calls although you have to be lucky/unlucky to see it.

====

This is in python 2.6.4:
> md5sum subprocess.py
2ac8cefe8301eadce87630b230d6fff2  subprocess.py

====

I expect the fix is equivalent to cmiller's trunk-diff-unified.txt
msg100228 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-03-01 00:05
fixed in trunk r78523.  backporting to 2.6 and 3.1.
msg100230 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-03-01 00:44
merged into 2.6 and 3.1 release branches.
History
Date User Action Args
2010-03-01 00:44:37gregory.p.smithsetversions: + Python 3.1, Python 2.7, Python 3.2
2010-03-01 00:44:25gregory.p.smithsetstatus: open -> closed
resolution: fixed
messages: + msg100230
2010-03-01 00:05:54gregory.p.smithsetmessages: + msg100228
2010-02-28 22:15:19gregory.p.smithsetassignee: astrand -> gregory.p.smith
2010-01-23 21:28:51nirssetnosy: + nirs
2010-01-14 10:59:01mathmodavesetfiles: + bugtest.py
nosy: + mathmodave
messages: + msg97756

2009-10-16 21:11:25dmalcolmsetnosy: + dmalcolm
2009-10-13 01:33:41rnksetnosy: + rnk
2009-08-25 20:01:47r.david.murraysetnosy: + r.david.murray
2009-08-12 18:31:51cmillersetfiles: + trunk-diff-unified.txt
nosy: + cmiller
messages: + msg91502

2009-03-24 23:01:39hayposetnosy: - haypo
2009-01-08 01:29:25hayposetmessages: + msg79391
2008-12-24 10:36:22naufraghisetfiles: + no-EINTR-subprocess.py-25-maint-r65475.patch
messages: + msg78259
2008-12-24 10:21:21naufraghisetfiles: - subprocess-eintr-safety-25maint-r65475.patch
2008-12-24 09:42:06naufraghisetfiles: - subprocess-retry-on-EINTR-std-in-out-err.diff
2008-12-24 02:08:18hayposetmessages: + msg78254
2008-12-24 02:03:03hayposetmessages: + msg78253
2008-12-23 18:54:57naufraghisetmessages: + msg78246
2008-12-20 14:38:28loewissetversions: - Python 2.5, Python 2.4, Python 2.5.3
2008-12-20 13:52:48hayposetmessages: + msg78103
2008-12-18 17:29:20hayposetnosy: + haypo
2008-12-18 17:21:23naufraghisetmessages: + msg78033
versions: + Python 2.5.3
2008-10-17 16:30:23naufraghisetmessages: + msg74924
2008-10-17 14:52:57naufraghisetfiles: + subprocess-retry-on-EINTR-std-in-out-err.diff
messages: + msg74916
2008-09-17 17:32:20naufraghisetfiles: + subprocess-eintr-safety-25maint-r65475.patch
keywords: + patch
messages: + msg73336
2008-09-06 21:51:32gregory.p.smithsetpriority: low -> normal
type: behavior
messages: + msg72709
versions: + Python 2.6, Python 2.5
2008-09-05 14:43:23amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg72596
2008-09-05 13:40:43naufraghisetnosy: + naufraghi
messages: + msg72593
2008-07-30 17:27:25jnollersetnosy: + jnoller
messages: + msg70433
2008-07-07 20:50:59gregory.p.smithsetmessages: + msg69397
2008-04-02 03:02:19jyasskinsetnosy: + jyasskin
messages: + msg64832
2008-04-01 22:12:19gregory.p.smithsetnosy: + gregory.p.smith
2008-04-01 20:36:28schmirsetmessages: + msg64821
2007-11-02 16:30:18schmirsetnosy: + schmir
messages: + msg57055
2004-11-17 21:07:57astrandcreate