Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file readline, readlines & readall methods can lose data on EINTR #56477

Closed
gpshead opened this issue Jun 5, 2011 · 22 comments
Closed

file readline, readlines & readall methods can lose data on EINTR #56477

gpshead opened this issue Jun 5, 2011 · 22 comments
Assignees

Comments

@gpshead
Copy link
Member

gpshead commented Jun 5, 2011

BPO 12268
Nosy @gpshead, @mjpieters, @mdickinson, @pitrou, @vstinner, @asvetlov
Files
  • fix-signal-eintr-read-27-gps02.diff: fixes for 2.7's read, readline, readlines and readinto with unittests
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2013-02-01.21:20:00.128>
    created_at = <Date 2011-06-05.20:39:02.990>
    labels = []
    title = 'file readline, readlines & readall methods can lose data on EINTR'
    updated_at = <Date 2016-11-15.14:36:08.044>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2016-11-15.14:36:08.044>
    actor = 'mjpieters'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2013-02-01.21:20:00.128>
    closer = 'gregory.p.smith'
    components = []
    creation = <Date 2011-06-05.20:39:02.990>
    creator = 'gregory.p.smith'
    dependencies = []
    files = ['26142']
    hgrepos = []
    issue_num = 12268
    keywords = ['patch', 'needs review']
    message_count = 22.0
    messages = ['137714', '137716', '137722', '137731', '137799', '163742', '163744', '163794', '163910', '163911', '163913', '163992', '164028', '164029', '172777', '176886', '176926', '176936', '181107', '181108', '181109', '280852']
    nosy_count = 9.0
    nosy_names = ['gregory.p.smith', 'mjpieters', 'mark.dickinson', 'pitrou', 'vstinner', 'asvetlov', 'python-dev', 'jcon', 'Gary.Miguel']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue12268'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 5, 2011

    The file object readline() and readlines() methods can lose data when an underlying read system call is interrupted. They will abort with an IOError in this case but any incomplete line data they have read will be discarded.

    readline() and readlines() should never raise an IOError for the EINTR interrupted system call case. They should handle that gracefully, retrying their reads after letting any Python signal handlers run.

    @gpshead gpshead self-assigned this Jun 5, 2011
    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 5, 2011

    3.x has the same issue. unittest & patch forthcoming that addresses that as well.

    2.6 also has the issue but it is in security fix only mode so I won't backport to that.

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 6, 2011

    .readall() and the equivalent unbounded .read() also have this problem.

    @gpshead gpshead changed the title file readline & readlines methods can lose data on EINTR file readline, readlines & readall methods can lose data on EINTR Jun 6, 2011
    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 6, 2011

    I haven't looked beyond the reading methods it is possible that some of the write implementations have a similar issue. Patch gps02 for 3.2 attached.

    I'll use that as the basis for a stand alone test_file_eintr.py targeted at 2.7.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 7, 2011

    I'm not sure why you're creating a separate test file. There are already signals-related tests in test_io. Also, perhaps you can reuse the idioms used there, rather than spawn subprocesses.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 24, 2012

    New changeset 781b95159954 by Gregory P. Smith in branch '3.2':
    Fixes issue bpo-12268: File readline, readlines and read() or readall() methods
    http://hg.python.org/cpython/rev/781b95159954

    New changeset 19a6bef57490 by Gregory P. Smith in branch 'default':
    Fixes issue bpo-12268: File readline, readlines and read() or readall() methods
    http://hg.python.org/cpython/rev/19a6bef57490

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 24, 2012

    I'm leaving this open as I still need to audit the write methods and commit the fix(es) for 2.7.

    I tried to merge the test into test_io's signals tests but I could not get that to actually work to reproduce the original problem so I kept my process based test_file_eintr one which easily reproduces it prior to these patches being applied instead.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 24, 2012

    For the record, there was a crash on the ARM buildbot:

    [196/368/1] test_io
    Timeout (1:00:00)!
    Thread 0x2aac5300:
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3051 in check_interrupted_write_retry
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3073 in test_interrupted_write_retry_text
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 385 in _executeTestPart
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 440 in run
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/case.py", line 492 in __call__
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 105 in run
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 67 in __call__
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 105 in run
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/suite.py", line 67 in __call__
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/unittest/runner.py", line 168 in run
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/support.py", line 1383 in _run_suite
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/support.py", line 1417 in run_unittest
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/test_io.py", line 3121 in test_main
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 1194 in runtest_inner
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 905 in runtest
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/regrtest.py", line 708 in main
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/test/main.py", line 13 in <module>
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/runpy.py", line 75 in _run_code
    File "/var/lib/buildbot/buildarea/3.x.warsaw-ubuntu-arm/build/Lib/runpy.py", line 162 in _run_module_as_main
    make: *** [buildbottest] Error 1

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 25, 2012

    I'm attaching an updated patch for 2.7. It fixes read, readline, readlines and readinto and includes tests.

    More code auditing for other methods to fix is still needed.

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 25, 2012

    The 3.* ubuntu arm buildbot hanging in test_io is very odd.

    I'm going to undo my supposedly straight forward signal.alarm(...) to signal.setitimer(...) change first to see if that is related.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 25, 2012

    New changeset 95b071194ddd by Gregory P. Smith in branch '3.2':
    Backout change e8f44ebacda7052267318cecf5b6f128d35add17. Reverting the test
    http://hg.python.org/cpython/rev/95b071194ddd

    New changeset b4ae7aa21b46 by Gregory P. Smith in branch 'default':
    Backout change e8f44ebacda7052267318cecf5b6f128d35add17. Reverting the test
    http://hg.python.org/cpython/rev/b4ae7aa21b46

    @pitrou
    Copy link
    Member

    pitrou commented Jun 25, 2012

    I don't think setitimer() is the culprit, rather the fact that the timeout is too short. You could try setting it to e.g. 0.4.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 26, 2012

    New changeset 67dc99a989cd by Gregory P. Smith in branch '2.7':
    Fixes issue bpo-12268 for file readline, readlines and read() and readinto methods.
    http://hg.python.org/cpython/rev/67dc99a989cd

    @gpshead
    Copy link
    Member Author

    gpshead commented Jun 26, 2012

    The uses of fwrite() and fflush() also need this EINTR treatment in 2.7. I haven't checked the write paths in 3.2 yet.

    Also, the fix change to 3.2's _io module needs backporting to 2.7's _io module for people using that.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 12, 2012

    New changeset 751a91e332d9 by Gregory P. Smith in branch '2.7':
    Fixes Issue bpo-12268 for the io module - File readline, readlines and
    http://hg.python.org/cpython/rev/751a91e332d9

    @pitrou
    Copy link
    Member

    pitrou commented Dec 4, 2012

    Is there anything left to do here?

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 4, 2012

    Yes. See my comment from June. The write paths need to be taken care of.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 4, 2012

    Well, this issue is about "readline, readlines & readall". It would be easier to follow if you opened a separate issue.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 1, 2013

    New changeset a5e7b38caee2 by Gregory P. Smith in branch '2.7':
    Additional fix for Issue bpo-12268: The io module file object writelines() methods
    http://hg.python.org/cpython/rev/a5e7b38caee2

    New changeset 2fd669aa4abc by Gregory P. Smith in branch '3.2':
    Additional fix for Issue bpo-12268: The io module file object writelines() methods no longer abort early when one of its write system calls is interrupted (EINTR).
    http://hg.python.org/cpython/rev/2fd669aa4abc

    New changeset 30fc620e240e by Gregory P. Smith in branch '3.3':
    Additional fix for issue bpo-12268: The io module file object write methods no
    http://hg.python.org/cpython/rev/30fc620e240e

    New changeset 8f72519fd0e9 by Gregory P. Smith in branch 'default':
    Additional fix for issue bpo-12268: The io module file object write methods no
    http://hg.python.org/cpython/rev/8f72519fd0e9

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2013

    Oh, so we can now implement a version of writelines() using writev()!

    2013/2/1 Roundup Robot <report@bugs.python.org>:

    Roundup Robot added the comment:

    New changeset a5e7b38caee2 by Gregory P. Smith in branch '2.7':
    Additional fix for Issue bpo-12268: The io module file object writelines() methods
    http://hg.python.org/cpython/rev/a5e7b38caee2

    New changeset 2fd669aa4abc by Gregory P. Smith in branch '3.2':
    Additional fix for Issue bpo-12268: The io module file object writelines() methods no longer abort early when one of its write system calls is interrupted (EINTR).
    http://hg.python.org/cpython/rev/2fd669aa4abc

    New changeset 30fc620e240e by Gregory P. Smith in branch '3.3':
    Additional fix for issue bpo-12268: The io module file object write methods no
    http://hg.python.org/cpython/rev/30fc620e240e

    New changeset 8f72519fd0e9 by Gregory P. Smith in branch 'default':
    Additional fix for issue bpo-12268: The io module file object write methods no
    http://hg.python.org/cpython/rev/8f72519fd0e9

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue12268\>


    @gpshead
    Copy link
    Member Author

    gpshead commented Feb 1, 2013

    it was easier to just take care of auditing the write calls as part of this given the code change was directly related to it.

    On Python 2.7 most of the write calls in the builtin file object (Objects/fileobject.c) rather than the new io module use the libc fwrite() call which, in linux man pages at least, is non-specific about what happens on EINTR (does it retry internally or does it return the number of bytes written so far?). Those could well abort leading to an error.

    Setting up a testcase fo to confirm that with is painful (time consuming) so I can't claim the non io module based write's do not still have an EINTR issue on 2.7.

    Workaround: Use the io module instead of the builtin open() or file() calls in Python 2.7.

    If someone can confirm that with a test case, it'd make another good issue to open.

    As for the writev comment... go ahead. :)

    @gpshead gpshead closed this as completed Feb 1, 2013
    @mjpieters
    Copy link
    Mannequin

    mjpieters mannequin commented Nov 15, 2016

    Follow-up bug, readahead was missed: http://bugs.python.org/issue1633941

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants