classification
Title: file.write + closed pipe = no error
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 2.6, Python 2.5

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: akuchling
Nosy List: akuchling, edemaine, forest_atq, naufraghi, sascha_silbe, schmir (6)
Priority: normal
Keywords:

Created on 2006-05-15 16:10 by edemaine, last changed 2009-10-24 16:10 by naufraghi.

Files
File name   Uploaded                      Description
test.c      edemaine, 2006-05-15 16:10    C program illustrating fwrite behavior
blah.py     edemaine, 2006-07-02 12:35    Test case illustrating bug
Messages (7)
msg28534 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:10
I am writing a Python script on Linux that gets called
via ssh (ssh hostname script.py) and I would like it to
know when its stdout gets closed because the ssh
connection gets killed.  I assumed that it would
suffice to write to stdout, and that I would get an
error if stdout was no longer connected to anything. 
This is not the case, however.  I believe it is because
of incorrect error checking in Objects/fileobject.c's
file_write.

Consider this example:

import time
while True:
    print 'Hello'
    time.sleep(1)

If this program is run via ssh and then the ssh
connection dies, the program continues running forever
(or at least, over 10 hours).  No exceptions are thrown.

In contrast, this example does die as soon as the ssh
connection dies (within one second):

import os, time
while True:
    os.write(1, 'Hello')
    time.sleep(1)

I claim that this is because os.write does proper error
checking, but file.write seems not to.  I was surprised
to find this intricacy in fwrite().  Consider the
attached C program, test.c.  (Warning: If you run it,
it will create a file /tmp/hello, and it will keep
running until you kill it.)  While the ssh connection
remains open, fwrite() reports a length of 6 bytes
written, ferror() reports no error, and errno remains
0.  Once the ssh connection dies, fwrite() still
reports a length of 6 bytes written (surprise!), but
ferror(stdout) reports an error, and errno changes to 5
(EIO).  So apparently one cannot tell from the return
value of fwrite() alone whether the write actually
succeeded; it seems necessary to call ferror() to
determine whether the write caused an error.
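
For reference, a minimal Python 3 sketch of the same contrast, assuming a POSIX system where the interpreter ignores SIGPIPE by default. It uses a local pipe with the read end closed rather than a dead ssh connection, so the error it sees is EPIPE rather than the EIO reported above. The helper name compare_writes is hypothetical. It is only a rough analogue of the fwrite() observation: the buffered write reports success, and the failure surfaces only when the buffer is flushed.

```python
import os


def compare_writes():
    """Report how a buffered and an unbuffered write behave on a pipe with no reader."""
    outcome = {}

    # Buffered file object over a pipe whose read end is already closed.
    r, w = os.pipe()
    os.close(r)
    f = os.fdopen(w, "wb", buffering=4096)
    # write() only copies into the buffer, so it reports all 6 bytes as written.
    outcome["buffered_write"] = f.write(b"Hello\n")
    try:
        f.flush()  # the underlying write(2) happens here
        outcome["flush"] = "ok"
    except BrokenPipeError:
        outcome["flush"] = "BrokenPipeError"
    try:
        f.close()  # close() retries the flush and may fail the same way
    except BrokenPipeError:
        pass

    # Unbuffered os.write() on a second pipe in the same state.
    r, w = os.pipe()
    os.close(r)
    try:
        os.write(w, b"Hello\n")
        outcome["os_write"] = "ok"
    except BrokenPipeError:
        outcome["os_write"] = "BrokenPipeError"
    finally:
        os.close(w)
    return outcome
```

Calling compare_writes() returns a dict with the byte count reported by the buffered write and the outcome of the flush and of os.write().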

I think the only change necessary is on line 2443 of
file_write() in Objects/fileobject.c (in svn version
46003):

2441        n2 = fwrite(s, 1, n, f->f_fp);
2442        Py_END_ALLOW_THREADS
2443        if (n2 != n) {
2444                PyErr_SetFromErrno(PyExc_IOError);
2445                clearerr(f->f_fp);

I am not totally sure whether the "n2 != n" condition
should be changed to "n2 != n || ferror (f->f_fp)" or
simply "ferror (f->f_fp)", but I believe that the
condition should be changed to one of these
possibilities.  The current behavior is wrong.

Incidentally, you'll notice that the C code has to turn
off signal SIGPIPE (like Python does) in order to not
die right away.  However, I could not get Python to die
by re-enabling SIGPIPE.  I tried "signal.signal
(signal.SIGPIPE, signal.SIG_DFL)" and "signal.signal
(signal.SIGPIPE, lambda x, y: sys.exit ())" and neither
one caused death of the script when the ssh connection
died.  Perhaps I'm not using the signal module correctly?
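
As a point of comparison, here is a small Python 3 sketch (POSIX only) in which restoring the default SIGPIPE disposition does kill the process. It assumes an unbuffered os.write() to a local pipe with no reader, which fails with EPIPE. The names CHILD_SOURCE and child_exit_status are hypothetical. SIGPIPE accompanies EPIPE, whereas the C test above reported EIO, which might be why re-enabling the signal had no visible effect in the ssh case; that is a guess, not something this thread confirms.

```python
import signal
import subprocess
import sys

# Source for a child interpreter; run separately so that the parent survives.
CHILD_SOURCE = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # undo Python's default SIG_IGN
r, w = os.pipe()
os.close(r)                # no reader is left on the pipe
os.write(w, b"Hello")      # EPIPE, so the kernel delivers SIGPIPE here
"""


def child_exit_status():
    """Run the child and return its exit status (negative means killed by a signal)."""
    proc = subprocess.run([sys.executable, "-c", CHILD_SOURCE])
    return proc.returncode
```

On POSIX, subprocess reports death by signal as a negative return code, so the result can be compared with -signal.SIGPIPE.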

I am on Linux 2.6.11 on a two-CPU Intel Pentium 4, and
I am running the latest Subversion version of Python,
but my guess is that this error transcends most if not
all versions of Python.
msg28535 - (view) Author: Erik Demaine (edemaine) Date: 2006-05-15 16:26
One more thing: fwrite() is used in a couple of other
places, and I think the same comment applies to them.  They are:

- file_writelines() in Objects/fileobject.c
- w_string() in Python/marshal.c doesn't seem to have any
error checking?  (At least no ferror() call in marshal.c...)
- string_print() in Objects/stringobject.c doesn't seem to
have any error checking (but I'm not quite sure what this
means in Python land).
- flush_data() in Modules/_hotshot.c
- array_tofile() in Modules/arraymodule.c
- write_file() in Modules/cPickle.c
- putshort(), putlong(), writeheader(), writetab() [and the
functions that call them] in Modules/rgbimgmodule.c
- svc_writefile() in Modules/svmodule.c
msg28536 - (view) Author: A.M. Kuchling (akuchling) Date: 2006-06-03 20:16
I agree with your analysis, and think your suggested fixes are correct.

However, I'm unable to construct a small test case that exercises this bug.  I 
can't even replicate the problem with SSH; when I run a remote script with 
SSH and then kill SSH with Ctrl-C, the write() gets a -1.  Are you terminating 
SSH in some other way?  (I'd really like to have a test case for this problem 
before committing the fix.)
msg28537 - (view) Author: Erik Demaine (edemaine) Date: 2006-07-02 12:35
A simple test case is this Python script (fleshed out from
previous example), also attached:

import sys, time
while True:
    print 'Hello'
    sys.stdout.flush()
    time.sleep(1)

Save as blah.py on machine foo, run 'ssh foo python blah.py'
on machine bar--you will see 'Hello' every second--then, in
another shell on bar, kill the ssh process on bar.  blah.py
should still be running on foo.  ('foo' and 'bar' can
actually be the same machine.)

The example from the original bug report that uses
os.write() instead of print was an example that *does* work.
msg28538 - (view) Author: Erik Demaine (edemaine) Date: 2006-08-09 16:13
Just to clarify (as I reread your question): I'm killing the
ssh via UNIX (or Cygwin) 'kill' command, not via CTRL-C.  I
didn't try, but it may be that CTRL-C works fine.
msg59630 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:29
The C program is broken, as it does not check the return value of fflush.
The real problem is buffering.

import time
while True:
    print 'Hello'
    time.sleep(1)

will not notice an error until the buffers are flushed.
Running "python t.py | head -n2" and killing head does not give me an
error. With PYTHONUNBUFFERED=1, or when using sys.stdout.flush(), the
program breaks with:

$ PYTHONUNBUFFERED=1 python t.py | head -n2
Hello
Hello
Traceback (most recent call last):
  File "t.py", line 5, in <module>
    print "Hello"
IOError: [Errno 32] Broken pipe
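
The pipeline above can be approximated in a self-contained way. The following is a minimal Python 3 sketch, assuming a POSIX system; the parent process plays the role of head by reading two lines and then closing its end of the pipe. The names CHILD_SOURCE and run_until_reader_leaves are hypothetical, and the 0.05 second sleep only shortens the run. In Python 3 the error is reported as BrokenPipeError, a subclass of OSError, rather than IOError.

```python
import subprocess
import sys

# Child script: same shape as the test case, flushing after every print.
CHILD_SOURCE = """
import sys, time
while True:
    print("Hello")
    sys.stdout.flush()
    time.sleep(0.05)
"""


def run_until_reader_leaves(lines_to_read=2):
    """Read a few lines from the child, close the pipe, and collect how the child ends."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    lines = [proc.stdout.readline() for _ in range(lines_to_read)]
    proc.stdout.close()  # the reader goes away, like killing head
    try:
        returncode = proc.wait(timeout=10)
    except subprocess.TimeoutExpired:
        proc.kill()  # the child never noticed the closed pipe
        raise
    errors = proc.stderr.read().decode()
    proc.stderr.close()
    return lines, returncode, errors
```

With the flush in place the child should exit with a traceback shortly after the pipe is closed. Removing the sys.stdout.flush() line from CHILD_SOURCE is a way to observe the delayed failure described in this message, although the child then produces no output until its buffer fills.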
msg59631 - (view) Author: Ralf Schmitt (schmir) Date: 2008-01-09 22:34
Ah, no. The C program does the fflush on the log file. Sorry.
History
Date                 User          Action  Args
2009-10-24 16:10:58  naufraghi     set     nosy: + naufraghi; type: behavior
2009-10-07 18:23:12  forest_atq    set     nosy: + forest_atq; versions: + Python 2.6
2009-03-25 12:57:51  sascha_silbe  set     nosy: + sascha_silbe
2008-01-09 22:34:31  schmir        set     messages: + msg59631
2008-01-09 22:29:07  schmir        set     nosy: + schmir; messages: + msg59630
2006-05-15 16:10:06  edemaine      create