file readline, readlines & readall methods can lose data on EINTR #56477
Comments
The file object readline() and readlines() methods can lose data when an underlying read system call is interrupted. They will abort with an IOError in this case but any incomplete line data they have read will be discarded. readline() and readlines() should never raise an IOError for the EINTR interrupted system call case. They should handle that gracefully, retrying their reads after letting any Python signal handlers run. |
3.x has the same issue. unittest & patch forthcoming that addresses that as well. 2.6 also has the issue but it is in security fix only mode so I won't backport to that. |
.readall() and the equivalent unbounded .read() also have this problem. |
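The retry behaviour requested in the report can be sketched at the Python level as follows. This is only an illustration, not the C patch attached to this issue: the helper name `read_all_retrying` and the `chunk_size` parameter are hypothetical. The point is that data already read stays in `chunks` when a read is interrupted, and the read is simply retried.

```python
import errno
import os


def read_all_retrying(fd, chunk_size=4096):
    """Read from fd until EOF, retrying any read interrupted by a signal.

    Hypothetical helper for illustration only; not the code from the patch.
    """
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except OSError as exc:
            if exc.errno == errno.EINTR:
                # The system call was interrupted. Python-level signal
                # handlers get a chance to run before we loop; nothing
                # accumulated in `chunks` is discarded.
                continue
            raise
        if not data:
            # An empty result means EOF.
            break
        chunks.append(data)
    return b"".join(chunks)
```

On Python 3.5 and later, `os.read` already retries on EINTR itself (PEP 475), so the `except` branch mainly matters on older interpreters.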
I haven't looked beyond the reading methods; it is possible that some of the write implementations have a similar issue. Patch gps02 for 3.2 attached. I'll use that as the basis for a standalone test_file_eintr.py targeted at 2.7. |
I'm not sure why you're creating a separate test file. There are already signals-related tests in test_io. Also, perhaps you can reuse the idioms used there, rather than spawn subprocesses. |
New changeset 781b95159954 by Gregory P. Smith in branch '3.2': New changeset 19a6bef57490 by Gregory P. Smith in branch 'default': |
I'm leaving this open as I still need to audit the write methods and commit the fix(es) for 2.7. I tried to merge the test into test_io's signals tests, but I could not get it to reproduce the original problem there. Instead I kept my process-based test_file_eintr, which easily reproduces the problem prior to these patches being applied. |
For the record, there was a crash on the ARM buildbot: [196/368/1] test_io |
I'm attaching an updated patch for 2.7. It fixes read, readline, readlines and readinto and includes tests. More code auditing for other methods to fix is still needed. |
The 3.* Ubuntu ARM buildbot hanging in test_io is very odd. I'm going to undo my supposedly straightforward signal.alarm(...) to signal.setitimer(...) change first to see if that is related. |
New changeset 95b071194ddd by Gregory P. Smith in branch '3.2': New changeset b4ae7aa21b46 by Gregory P. Smith in branch 'default': |
I don't think setitimer() is the culprit, rather the fact that the timeout is too short. You could try setting it to e.g. 0.4. |
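For readers unfamiliar with the two calls being compared: `signal.alarm()` takes whole seconds, while `signal.setitimer()` accepts a fractional delay such as the 0.4 suggested above. A minimal sketch, assuming a Unix platform and execution in the main thread; the helper `wait_for_timer` and its `deadline` parameter are hypothetical and are not taken from test_io.

```python
import signal
import time


def wait_for_timer(delay=0.4, deadline=5.0):
    """Arm ITIMER_REAL for `delay` seconds; return True if SIGALRM arrived.

    Hypothetical helper for illustration; Unix only, main thread only.
    """
    fired = []

    def on_alarm(signum, frame):
        fired.append(signum)

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    try:
        # Unlike signal.alarm(), setitimer() accepts fractional seconds.
        signal.setitimer(signal.ITIMER_REAL, delay)
        end = time.monotonic() + deadline
        while not fired and time.monotonic() < end:
            time.sleep(0.01)
    finally:
        # Disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
    return bool(fired)
```

A very short delay leaves little margin on a slow machine between arming the timer and reaching the code the signal is meant to interrupt, which is the concern raised in the comment above.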
New changeset 67dc99a989cd by Gregory P. Smith in branch '2.7': |
The uses of fwrite() and fflush() also need this EINTR treatment in 2.7. I haven't checked the write paths in 3.2 yet. Also, the fix made to 3.2's _io module needs backporting to 2.7's _io module for people using that. |
New changeset 751a91e332d9 by Gregory P. Smith in branch '2.7': |
Is there anything left to do here? |
Yes. See my comment from June. The write paths need to be taken care of. |
Well, this issue is about "readline, readlines & readall". It would be easier to follow if you opened a separate issue. |
New changeset a5e7b38caee2 by Gregory P. Smith in branch '2.7': New changeset 2fd669aa4abc by Gregory P. Smith in branch '3.2': New changeset 30fc620e240e by Gregory P. Smith in branch '3.3': New changeset 8f72519fd0e9 by Gregory P. Smith in branch 'default': |
Oh, so we can now implement a version of writelines() using writev()! |
It was easier to audit the write calls as part of this issue, since the code change was directly related. On Python 2.7, most of the write calls in the builtin file object (Objects/fileobject.c), as opposed to the new io module, use the libc fwrite() call, which, in the Linux man pages at least, does not specify what happens on EINTR (does it retry internally, or does it return the number of bytes written so far?). Those could well abort, leading to an error. Setting up a test case to confirm that is painful (time consuming), so I can't claim that the non-io-module writes are free of EINTR issues on 2.7. Workaround: use the io module instead of the builtin open() or file() calls in Python 2.7. If someone can confirm the problem with a test case, it would make another good issue to open. As for the writev comment... go ahead. :) |
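The workaround can be sketched like this. It is only an illustration of swapping the builtin `open()` for `io.open()`; the function name `write_lines_with_io` is hypothetical, and the sketch does not by itself demonstrate that the builtin 2.7 file object mishandles EINTR on write.

```python
import io


def write_lines_with_io(path, lines):
    """Write text lines through the io module rather than the builtin file object.

    Hypothetical helper for illustration only.
    """
    # io.open() returns an io-module file object on both 2.7 and 3.x.
    # In text mode on 2.7 the lines must be unicode strings.
    with io.open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
```

On Python 3 the builtin `open()` is already `io.open()`, so the distinction only matters on 2.7.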
Follow-up bug, readahead was missed: http://bugs.python.org/issue1633941 |