In both Python 2 and Python 3, EINTR is not handled properly in the file.write and file.read methods.
------------------------- file.write -------------------------
file.write can write fewer bytes than requested, and when it is interrupted
there is no way to tell how many bytes it actually wrote: in Python 2 it
raises an IOError with errno EINTR, whereas in Python 3 it simply stops
writing and returns the number of bytes written.
Here is the output of fwrite.py with Python 2.7 (see attached files). Note also how inconsistent the IOError vs OSError distinction is:
$ python2.7 fwrite.py
Writing 100000 bytes, interrupt me with SIGQUIT (^\)
^\^\(3, <frame object at 0x9535ab4>)
Traceback (most recent call last):
File "fwrite.py", line 16, in <module>
print(write_file.write(b'a' * 100000))
IOError: [Errno 4] Interrupted system call
read 65536 bytes
^\(3, <frame object at 0x9535ab4>)
Traceback (most recent call last):
File "fwrite.py", line 21, in <module>
print('read %d bytes' % len(os.read(r, 100000)))
OSError: [Errno 4] Interrupted system call
Because os.read blocks on the second call to read, we know that only 65536 of
the 100000 bytes were written.
------------------------- file.read -------------------------
When file.read is interrupted, it may already have read some bytes.
In Python 2 it returns those bytes, whereas in Python 3 it raises an
IOError with errno EINTR and the bytes it read are inaccessible.
A demonstration:
$ python3.2 fread.py
Writing 7 bytes
Reading 20 bytes... interrupt me with SIGQUIT (^\)
^\(3, <frame object at 0x8e1d2d4>)
Traceback (most recent call last):
File "fread.py", line 18, in <module>
print('Read %d bytes using file.read' % len(read_file.read(20)))
IOError: [Errno 4] Interrupted system call
Reading any remaining bytes...
^\(3, <frame object at 0x8e1d2d4>)
Traceback (most recent call last):
File "fread.py", line 23, in <module>
print('reading: %r' % os.read(r, 4096))
OSError: [Errno 4] Interrupted system call
Note how Python 2 stops reading when interrupted and returns our bytes,
but Python 3 raises IOError and there is no way to access the bytes that
it read.
So basically, this behaviour is just plain wrong: EINTR is not a real error,
and raising it this way makes it impossible for the caller to handle the
situation correctly.
Here is how I think Python should behave. It should be possible to interrupt
both read and write calls, but it should also be possible for the user to handle these cases.
On EINTR, file.write could decide to continue writing if no Python signal handler raised an exception.
Analogously, file.read could decide to keep reading on EINTR if no Python signal handler raised an exception.
This way, the programmer can write interruptible code while at the same time
getting proper file.write and file.read behaviour in case the code should not
be interrupted.
KeyboardInterrupt would still interrupt read and write calls, because it
raises an exception. If the programmer decided that writes should finish
before allowing such an exception, the programmer could replace the default
signal handler for SIGINT.
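For instance, a program that wants writes to finish before acting on ^C could install its own SIGINT handler that merely records the interrupt. A minimal sketch (the `interrupted` flag and `deferred_sigint` names are purely illustrative, not part of any proposed API):

```python
import os
import signal

interrupted = False  # illustrative flag, not a real API

def deferred_sigint(signum, frame):
    # Record the interrupt instead of raising KeyboardInterrupt,
    # so an in-progress write could run to completion first.
    global interrupted
    interrupted = True

old_handler = signal.signal(signal.SIGINT, deferred_sigint)
try:
    os.kill(os.getpid(), signal.SIGINT)  # simulate the user pressing ^C
    while not interrupted:               # wait for the handler to run
        pass
    # ... a write() in this region is no longer aborted by KeyboardInterrupt ...
finally:
    signal.signal(signal.SIGINT, old_handler)  # restore previous behaviour

print(interrupted)
```

Since the handler returns normally instead of raising, the write loop in the proposal above would keep going after EINTR.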
So, in pseudo-code:
bytes_written = 0
while bytes_written < len(buf):
    result = write(buf)
    if result < 0:
        if errno == EINTR:
            if PyErr_CheckSignals() < 0:
                /* Propagate the exception from the signal handler */
                return NULL
            continue
        else:
            PyErr_SetFromErrno(PyExc_IOError)
            return NULL
    buf += result
    bytes_written += result
return bytes_written
Similar code could be used for file.read with the obvious adjustments.
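At the Python level, the same retry logic can be sketched with os.write and os.read. The helper names (retrying_write, retrying_read) are made up for illustration; in CPython the loop would live in C, as above. (Note that on Python 3.5+, os.write/os.read already retry EINTR themselves when the handler does not raise, so the except branches mainly illustrate the pre-3.5 behaviour.)

```python
import errno
import os

def retrying_write(fd, buf):
    """Write all of buf, retrying on EINTR; sketch of the pseudo-code above."""
    bytes_written = 0
    while bytes_written < len(buf):
        try:
            bytes_written += os.write(fd, buf[bytes_written:])
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
            # EINTR with a quiet signal handler: just retry.  If the
            # handler raised, that exception propagates out of os.write.
    return bytes_written

def retrying_read(fd, n):
    """Read up to n bytes, retrying on EINTR, keeping bytes read so far."""
    chunks = []
    remaining = n
    while remaining > 0:
        try:
            chunk = os.read(fd, remaining)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
            continue
        if not chunk:  # EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)
```

Crucially, retrying_read never loses the bytes already read, which is exactly what Python 3's file.read fails to guarantee today.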
However, in case of an error (either from the write call itself or from a Python signal handler),
it would still be unclear how many bytes were actually written. Maybe (I think
this part would be bonus points) the number of bytes written could be stored on the exception object in this case, or made retrievable in some other thread-safe way.
For files with file descriptors in nonblocking mode (and maybe other cases), write would still return a short byte count.
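The nonblocking short-write case is easy to demonstrate with a pipe; once the pipe buffer is full, os.write returns however many bytes fit (the exact count is platform-dependent, typically 65536 on Linux):

```python
import os

r, w = os.pipe()
os.set_blocking(w, False)  # put the write end in nonblocking mode

# Ask to write more than the pipe buffer can hold: os.write returns
# a short count instead of blocking or raising EINTR.
n = os.write(w, b'a' * 200000)
print(n)  # short write: only as much as fit in the pipe buffer

os.close(r)
os.close(w)
```

In this mode a short return value is expected and correct; the proposal above only changes the blocking, interrupted-by-signal case.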