classification
Title: file readline, readlines & readall methods can lose data on EINTR
Type: Stage: patch review
Components: Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: Gary.Miguel, asvetlov, gregory.p.smith, haypo, jcon, mark.dickinson, pitrou, python-dev
Priority: high Keywords: needs review, patch

Created on 2011-06-05 20:39 by gregory.p.smith, last changed 2013-02-01 21:20 by gregory.p.smith. This issue is now closed.

Files
File name Uploaded Description Edit
fix-signal-eintr-read-27-gps02.diff gregory.p.smith, 2012-06-25 08:01 fixes for 2.7's read, readline, readlines and readinto with unittests review
Messages (21)
msg137714 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-06-05 20:39
The file object readline() and readlines() methods can lose data when an underlying read system call is interrupted.  They will abort with an IOError in this case but any incomplete line data they have read will be discarded.

readline() and readlines() should never raise an IOError for the EINTR interrupted system call case.  They should handle that gracefully, retrying their reads after letting any Python signal handlers run.
msg137716 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-06-05 21:47
3.x has the same issue.  unittest & patch forthcoming that addresses that as well.

2.6 also has the issue but it is in security fix only mode so I won't backport to that.
msg137722 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-06-06 00:36
.readall() and the equivalent unbounded .read() also have this problem.
msg137731 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-06-06 07:34
I haven't looked beyond the reading methods it is possible that some of the write implementations have a similar issue.  Patch gps02 for 3.2 attached.

I'll use that as the basis for a stand alone test_file_eintr.py targeted at 2.7.
msg137799 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-06-07 09:57
I'm not sure why you're creating a separate test file. There are already signals-related tests in test_io. Also, perhaps you can reuse the idioms used there, rather than spawn subprocesses.
msg163742 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-24 07:24
New changeset 781b95159954 by Gregory P. Smith in branch '3.2':
Fixes issue #12268: File readline, readlines and read() or readall() methods
http://hg.python.org/cpython/rev/781b95159954

New changeset 19a6bef57490 by Gregory P. Smith in branch 'default':
Fixes issue #12268: File readline, readlines and read() or readall() methods
http://hg.python.org/cpython/rev/19a6bef57490
msg163744 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-06-24 07:27
I'm leaving this open as I still need to audit the write methods and commit the fix(es) for 2.7.

I tried to merge the test into test_io's signals tests but I could not get that to actually work to reproduce the original problem so I kept my process based test_file_eintr one which easily reproduces it prior to these patches being applied instead.
msg163794 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-24 15:20
For the record, there was a crash on the ARM buildbot:


[196/368/1] test_io
Timeout (1:00:00)!
Thread 0x2aac5300:
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3051 in check_interrupted_write_retry
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3073 in test_interrupted_write_retry_text
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 385 in _executeTestPart
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 440 in run
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 492 in __call__
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 105 in run
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 105 in run
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/runner.py", line 168 in run
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/support.py", line 1383 in _run_suite
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/support.py", line 1417 in run_unittest
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3121 in test_main
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 1194 in runtest_inner
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 905 in runtest
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 708 in main
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/__main__.py", line 13 in <module>
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/runpy.py", line 75 in _run_code
  File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/runpy.py", line 162 in _run_module_as_main
make: *** [buildbottest] Error 1
msg163910 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-06-25 08:01
I'm attaching an updated patch for 2.7.  It fixes read, readline, readlines and readinto and includes tests.

More code auditing for other methods to fix is still needed.
msg163911 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-06-25 08:05
The 3.* ubuntu arm buildbot hanging in test_io is very odd.

I'm going to undo my supposedly straight forward signal.alarm(...) to signal.setitimer(...) change first to see if that is related.
msg163913 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-25 08:16
New changeset 95b071194ddd by Gregory P. Smith in branch '3.2':
Backout change e8f44ebacda7052267318cecf5b6f128d35add17.  Reverting the test
http://hg.python.org/cpython/rev/95b071194ddd

New changeset b4ae7aa21b46 by Gregory P. Smith in branch 'default':
Backout change e8f44ebacda7052267318cecf5b6f128d35add17.  Reverting the test
http://hg.python.org/cpython/rev/b4ae7aa21b46
msg163992 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-25 17:04
I don't think setitimer() is the culprit, rather the fact that the timeout is too short. You could try setting it to e.g. 0.4.
msg164028 - (view) Author: Roundup Robot (python-dev) Date: 2012-06-26 03:57
New changeset 67dc99a989cd by Gregory P. Smith in branch '2.7':
Fixes issue #12268 for file readline, readlines and read() and readinto methods.
http://hg.python.org/cpython/rev/67dc99a989cd
msg164029 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-06-26 04:04
The uses of fwrite() and fflush() also need this EINTR treatment in 2.7.  I haven't checked the write paths in 3.2 yet.

Also, the fix change to 3.2's _io module needs backporting to 2.7's _io module for people using that.
msg172777 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-12 20:02
New changeset 751a91e332d9 by Gregory P. Smith in branch '2.7':
Fixes Issue #12268 for the io module - File readline, readlines and
http://hg.python.org/cpython/rev/751a91e332d9
msg176886 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-04 09:34
Is there anything left to do here?
msg176926 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-12-04 17:32
Yes. See my comment from June. The write paths need to be taken care of.
msg176936 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-04 19:21
Well, this issue is about "readline, readlines & readall". It would be easier to follow if you opened a separate issue.
msg181107 - (view) Author: Roundup Robot (python-dev) Date: 2013-02-01 21:12
New changeset a5e7b38caee2 by Gregory P. Smith in branch '2.7':
Additional fix for Issue #12268: The io module file object writelines() methods
http://hg.python.org/cpython/rev/a5e7b38caee2

New changeset 2fd669aa4abc by Gregory P. Smith in branch '3.2':
Additional fix for Issue #12268: The io module file object writelines() methods no longer abort early when one of its write system calls is interrupted (EINTR).
http://hg.python.org/cpython/rev/2fd669aa4abc

New changeset 30fc620e240e by Gregory P. Smith in branch '3.3':
Additional fix for issue #12268: The io module file object write methods no
http://hg.python.org/cpython/rev/30fc620e240e

New changeset 8f72519fd0e9 by Gregory P. Smith in branch 'default':
Additional fix for issue #12268: The io module file object write methods no
http://hg.python.org/cpython/rev/8f72519fd0e9
msg181108 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-02-01 21:17
Oh, so we can now implement a version of writelines() using writev()!

2013/2/1 Roundup Robot <report@bugs.python.org>:
>
> Roundup Robot added the comment:
>
> New changeset a5e7b38caee2 by Gregory P. Smith in branch '2.7':
> Additional fix for Issue #12268: The io module file object writelines() methods
> http://hg.python.org/cpython/rev/a5e7b38caee2
>
> New changeset 2fd669aa4abc by Gregory P. Smith in branch '3.2':
> Additional fix for Issue #12268: The io module file object writelines() methods no longer abort early when one of its write system calls is interrupted (EINTR).
> http://hg.python.org/cpython/rev/2fd669aa4abc
>
> New changeset 30fc620e240e by Gregory P. Smith in branch '3.3':
> Additional fix for issue #12268: The io module file object write methods no
> http://hg.python.org/cpython/rev/30fc620e240e
>
> New changeset 8f72519fd0e9 by Gregory P. Smith in branch 'default':
> Additional fix for issue #12268: The io module file object write methods no
> http://hg.python.org/cpython/rev/8f72519fd0e9
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12268>
> _______________________________________
msg181109 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-02-01 21:19
it was easier to just take care of auditing the write calls as part of this given the code change was directly related to it.

On Python 2.7 most of the write calls in the builtin file object (Objects/fileobject.c) rather than the new io module use the libc fwrite() call which, in linux man pages at least, is non-specific about what happens on EINTR (does it retry internally or does it return the number of bytes written so far?).  Those could well abort leading to an error.

Setting up a testcase fo to confirm that with is painful (time consuming) so I can't claim the non io module based write's do not still have an EINTR issue on 2.7.

Workaround: Use the io module instead of the builtin open() or file() calls in Python 2.7.

If someone can confirm that with a test case, it'd make another good issue to open.

As for the writev comment... go ahead. :)
History
Date User Action Args
2013-02-01 21:20:00gregory.p.smithsetstatus: open -> closed
resolution: fixed
2013-02-01 21:19:52gregory.p.smithsetmessages: + msg181109
2013-02-01 21:17:13hayposetmessages: + msg181108
2013-02-01 21:12:25python-devsetmessages: + msg181107
2012-12-05 10:58:42asvetlovsetnosy: + asvetlov
2012-12-04 19:21:47pitrousetmessages: + msg176936
2012-12-04 17:32:20gregory.p.smithsetmessages: + msg176926
2012-12-04 09:34:17pitrousetmessages: + msg176886
2012-12-04 01:20:08Gary.Miguelsetnosy: + Gary.Miguel
2012-10-12 21:34:55hayposetnosy: + haypo
2012-10-12 20:02:20python-devsetmessages: + msg172777
2012-10-12 08:23:42mark.dickinsonsetnosy: + mark.dickinson
2012-09-29 19:12:33gregory.p.smithsetpriority: normal -> high
2012-07-12 21:36:23pitroulinkissue9867 superseder
2012-06-26 04:04:59gregory.p.smithsetmessages: + msg164029
2012-06-26 03:57:58python-devsetmessages: + msg164028
2012-06-25 17:04:36pitrousetmessages: + msg163992
2012-06-25 08:40:22gregory.p.smithsetfiles: - test_and_fix_readers_3.2-gps02.diff
2012-06-25 08:16:04python-devsetmessages: + msg163913
2012-06-25 08:07:36gregory.p.smithsetfiles: - file-signal-eintr-27.diff
2012-06-25 08:05:19gregory.p.smithsetmessages: + msg163911
2012-06-25 08:01:39gregory.p.smithsetfiles: + fix-signal-eintr-read-27-gps02.diff

messages: + msg163910
versions: - Python 3.1
2012-06-24 15:20:35pitrousetmessages: + msg163794
2012-06-24 07:27:14gregory.p.smithsetmessages: + msg163744
2012-06-24 07:24:47python-devsetnosy: + python-dev
messages: + msg163742
2011-06-07 09:57:38pitrousetmessages: + msg137799
stage: patch review
2011-06-07 04:44:54gregory.p.smithsetnosy: + pitrou
2011-06-06 07:35:56gregory.p.smithsetfiles: - test_fileio_readers_3.2-gps01.diff
2011-06-06 07:34:35gregory.p.smithsetfiles: + test_and_fix_readers_3.2-gps02.diff

messages: + msg137731
2011-06-06 04:07:18gregory.p.smithsetfiles: + test_fileio_readers_3.2-gps01.diff
2011-06-06 01:38:38jconsetnosy: + jcon
2011-06-06 00:36:30gregory.p.smithsetmessages: + msg137722
title: file readline & readlines methods can lose data on EINTR -> file readline, readlines & readall methods can lose data on EINTR
2011-06-05 21:47:15gregory.p.smithsetmessages: + msg137716
versions: + Python 3.1, Python 3.2, Python 3.3
2011-06-05 20:39:02gregory.p.smithcreate