This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author mgedmin
Recipients
Date 2002-02-23.13:44:18
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to
Content
fread(3) manual page states
       fread  and  fwrite return the number of items
successfully
       read or written (i.e., not the number of
characters).   If
       an error occurs, or the end-of-file is reached,
the return
       value is a short item count (or zero).

Python only checks ferror status when the return value
is zero (Objects/fileobject.c line 550 from
Python-2.1.2 sources).  I agree that it is a good idea
to delay exception throwing until after the user has
processed the partial chunk of data returned by fread,
but there are two problems with the current
implementation: loss of errno and occasional loss of data.

Both problems are illustrated with this scenario taken
from real life:

  suppose the file descriptor refers to a pipe, and we
set O_NONBLOCK mode with fcntl (the application was
reading from multiple pipes in a select() loop and
couldn't afford to block)
  fread(4096) returns an incomplete block and sets
errno to EAGAIN
  chunksize != 0 so we do not check ferror() and return
successfully
  the next time file.read() is called we reset errno
and do fread(4096) again.  It returns a full block
(i.e. bytesread == buffersize on line 559), so we
repeat the loop and call fread(0).  It returns 0, of
course.  Now we check ferror() and find it was set
(by a previous fread(4096) called maybe a century ago).
The errno information is already lost, so we throw
an IOError with errno=0.  And also lose that 4K chunk
of valuable user data.

Regarding solutions, I can see two alternatives:
- call clearerr(f->f_fp) just before fread(), where
Python currently sets errno = 0;  This makes sure that
we do not have stale ferror() flag and errno is valid,
but we might not notice some errors.  That doesn't
matter for EAGAIN, and for errors that occur reliably
if we repeat fread() on the same stream.  We might
still lose data if an exception is thrown on the second
or later loop iteration.
- always check for ferror() immediatelly after fread().
- regarding data loss, maybe it is possible to store
the errno somewhere inside the file object and delay
exception throwing if we have successfully read some
data (i.e. bytesread > 0).  The exception could be
thrown on the next call to file.read() before
performing anything else.
History
Date User Action Args
2007-08-23 13:59:21adminlinkissue521782 messages
2007-08-23 13:59:21admincreate