Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File read silently stops after EIO I/O error #65289

Open
IvanK mannequin opened this issue Mar 28, 2014 · 17 comments
Open

File read silently stops after EIO I/O error #65289

IvanK mannequin opened this issue Mar 28, 2014 · 17 comments
Labels

Comments

@IvanK
Copy link
Mannequin

IvanK mannequin commented Mar 28, 2014

BPO 21090
Nosy @mjpieters, @pitrou, @vstinner
Files
  • fileio_readall_eio.patch
  • fileobject.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-03-28.22:29:46.314>
    labels = ['expert-IO']
    title = 'File read silently stops after EIO I/O error'
    updated_at = <Date 2016-11-15.14:32:26.357>
    user = 'https://bugs.python.org/ivank'

    bugs.python.org fields:

    activity = <Date 2016-11-15.14:32:26.357>
    actor = 'mjpieters'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['IO']
    creation = <Date 2014-03-28.22:29:46.314>
    creator = 'ivank'
    dependencies = []
    files = ['35820', '35833']
    hgrepos = []
    issue_num = 21090
    keywords = ['patch']
    message_count = 17.0
    messages = ['215085', '215213', '215533', '217234', '217237', '217244', '217265', '217266', '221864', '221865', '222043', '222114', '222124', '222126', '222140', '231093', '280850']
    nosy_count = 6.0
    nosy_names = ['mjpieters', 'pitrou', 'vstinner', 'ivank', 'neologix', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue21090'
    versions = ['Python 2.7', 'Python 3.4']

    @IvanK
    Copy link
    Mannequin Author

    IvanK mannequin commented Mar 28, 2014

    I intentionally corrupted a zpool to induce an I/O error in a file, in this case, /usr/lib/x86_64-linux-gnu/gconv/IBM1390.so

    # ls -l /usr/lib/x86_64-linux-gnu/gconv/IBM1390.so
    -rw-r--r-- 1 root root 231,496 2014-03-24 06:26 /usr/lib/x86_64-linux-gnu/gconv/IBM1390.so

    # cat /usr/lib/x86_64-linux-gnu/gconv/IBM1390.so > /dev/null
    cat: /usr/lib/x86_64-linux-gnu/gconv/IBM1390.so: Input/output error

    When I read the file, Python 3.3.5 and 3.4.0 check for EIO and raise an exception:

    >>> open("/usr/lib/x86_64-linux-gnu/gconv/IBM1390.so", "rb").read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 5] Input/output error

    but Python 2.7.6 does not:

    # python2
    Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = open("/usr/lib/x86_64-linux-gnu/gconv/IBM1390.so", "rb").read()
    >>> len(x)
    131072

    @IvanK IvanK mannequin added the topic-IO label Mar 28, 2014
    @pitrou
    Copy link
    Member

    pitrou commented Mar 30, 2014

    Python 2.7 uses C fopen() and fread(), so what happens probably is that fread() silences the error.

    @vstinner
    Copy link
    Member

    vstinner commented Apr 4, 2014

    Python 2.7 uses C fopen() and fread(), so what happens probably is that fread() silences the error.

    I see that file_read() checks ferror() if fread() returned 0. I would nice to run the test in strace and attach the output of strace to see if the EIO is returneded by the kernel at least.

    @vstinner
    Copy link
    Member

    @ivank: Can you please answer to questions? It's hard to understand the issue. Without more information, I would suggest to close the issue.

    @IvanK
    Copy link
    Mannequin Author

    IvanK mannequin commented Apr 27, 2014

    I'm finding it hard to reproduce the bug again with more zpool corruption. (I see the IOError: [Errno 5] Input/output error exception now.) I do remember that in the reported case, Python 3.4, node.js, and OpenJDK 7 threw an EIO exception, but Python 2.7 did not. I tested this multiple times.

    Right now I can only speculate that Python 2.7 silently stops reading only in certain cases, e.g. depending on how Python's reads are aligned with the first byte that causes EIO.

    I'm still working on getting it reproduced, please hold off on closing.

    @vstinner
    Copy link
    Member

    2014-04-27 5:26 GMT+02:00 ivank report@bugs.python.org:

    (I see the IOError: [Errno 5] Input/output error exception now.)

    Can you please run your test in strace to see system calls?

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Apr 27, 2014

    I'm with Antoine, it's likely a glibc bug.

    We already had a similar issue with fwrite():
    http://bugs.python.org/issue17976

    @pitrou
    Copy link
    Member

    pitrou commented Apr 27, 2014

    ivank, if you know some C, perhaps you could write a trivial program that does an fopen() followed by an fread() of 131072 bytes, and see if the fread() errors out.

    @IvanK
    Copy link
    Mannequin Author

    IvanK mannequin commented Jun 29, 2014

    I managed to reproduce this again, this time by corrupting data on a btrfs filesystem.

    $ cat read_error_file.py    
    import os

    fname = "/usr/bin/Xorg"
    size = os.stat(fname).st_size
    print fname, "stat size:", size

    f = open(fname, "rb")
    print "len(f.read()): ", len(f.read())
    f.close()
    
    f = open(fname, "rb")
    for i in xrange(size):
            try:
                    f.read(1)
            except IOError:
                    print "IOError at byte %d" % i
                    break
    f.close()
    $ python read_error_file.py 
    /usr/bin/Xorg stat size: 2331776
    len(f.read()):  716800
    IOError at byte 716800

    Note how the first test does not throw an IOError, but the second one does.

    The strace for the first test is:

    open("/usr/bin/Xorg", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    lseek(3, 0, SEEK_CUR) = 0
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1334bd6000
    lseek(3, 0, SEEK_CUR) = 0
    mmap(NULL, 2334720, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1332ea6000
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\265M\4\0\0\0\0\0"..., 2330624) = 716800
    read(3, 0x7f1332f55034, 1613824) = -1 EIO (Input/output error)
    mremap(0x7f1332ea6000, 2334720, 720896, MREMAP_MAYMOVE) = 0x7f1332ea6000
    munmap(0x7f1332ea6000, 720896) = 0
    write(1, "len(f.read()): 716800\n", 23len(f.read()): 716800
    ) = 23

    Note the "-1 EIO (Input/output error)" that gets ignored somewhere.

    @IvanK
    Copy link
    Mannequin Author

    IvanK mannequin commented Jun 29, 2014

    This problem happens with Python 3.4 as well.

    $ cat read_error_file.py 
    from __future__ import print_function
    import os

    fname = "/usr/bin/Xorg"
    size = os.stat(fname).st_size
    print(fname, "stat size:", size)

    f = open(fname, "rb")
    print("len(f.read()): ", len(f.read()))
    f.close()
    
    f = open(fname, "rb")
    for i in range(size):
            try:
                    f.read(1)
            except IOError:
                    print("IOError at byte %d" % i)
                    break
    f.close()
    $ python3 --version
    Python 3.4.1
    
    $ python3 read_error_file.py
    /usr/bin/Xorg stat size: 2331776
    len(f.read()):  716800
    IOError at byte 716800

    strace for the first test is:

    open("/usr/bin/Xorg", O_RDONLY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fff323ac8b0) = -1 ENOTTY (Inappropriate ioctl for device)
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    lseek(3, 0, SEEK_CUR) = 0
    lseek(3, 0, SEEK_CUR) = 0
    fstat(3, {st_mode=S_IFREG|0755, st_size=2331776, ...}) = 0
    mmap(NULL, 2334720, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f57884cc000
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\265M\4\0\0\0\0\0"..., 2331777) = 716800
    read(3, 0x7f578857b030, 1614977) = -1 EIO (Input/output error)
    mremap(0x7f57884cc000, 2334720, 720896, MREMAP_MAYMOVE) = 0x7f57884cc000
    munmap(0x7f57884cc000, 720896) = 0
    write(1, "len(f.read()): 716800\n", 23len(f.read()): 716800
    ) = 23
    close(3)

    @vstinner
    Copy link
    Member

    vstinner commented Jul 1, 2014

    Here is a patch for FileIO.readall() which should fix the issue. Currently, readall() returns read bytes at the first read() error if a least one call to read() succeed.

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Jul 2, 2014

    LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 2, 2014

    New changeset 652b62213072 by Victor Stinner in branch '3.4':
    Issue bpo-21090: io.FileIO.readall() does not ignore I/O errors anymore. Before,
    http://hg.python.org/cpython/rev/652b62213072

    New changeset 440279cec378 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21090: io.FileIO.readall() does not ignore I/O errors
    http://hg.python.org/cpython/rev/440279cec378

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 2, 2014

    New changeset 1492a42b8308 by Victor Stinner in branch '2.7':
    Issue bpo-21090: io.FileIO.readall() does not ignore I/O errors anymore. Before,
    http://hg.python.org/cpython/rev/1492a42b8308

    @vstinner
    Copy link
    Member

    vstinner commented Jul 2, 2014

    For Python 2, file.read() looks wrong: if only checks ferror() if fread() returns 0, whereas Py_UniversalNewlineFread() can call fread() more than once, and according to fread() manual page, fread() result can be different than 0 on error.

    "If an error occurs, or the end of the file is reached, the return value is a short item count (or zero)."
    http://linux.die.net/man/3/fread

    Attached fileobject.c rewrites error handling in fileobject.c. The main change if that ferror() is called even if fread() does not return 0.

    @vstinner
    Copy link
    Member

    On IRC, buck1 asked why the following code behaves differently on Python < 3.4 and Python >= 3.4. It is related to this issue in fact.

    Code:
    ---

    from __future__ import print_function
    
    from os import openpty
    read, write = openpty()
    
    from subprocess import Popen
    proc = Popen(
        ('echo', 'ok'),
        stdout=write,
        close_fds=True,
    )
    
    from os import fdopen
    fdopen(write, 'w').close()
    with fdopen(read) as stdout:
        print('STDOUT', stdout.read())
    
    print('exit code:', proc.wait())

    Simplified example:
    ---

    import io, os
    read, write = os.openpty()
    os.write(write, b'ok\n')
    os.close(write)
    with io.FileIO(read, closefd=False) as fp:
        print(fp.readall())

    On Python < 3.4, it displays "ok", whereas Python 3.4 and later fail with
    OSError(5, 'Input/output error' on readall().

    Another example:
    ---

    import os
    read, write = os.openpty()
    os.write(write, b'ok\n')
    os.close(write)
    print("read: %r" % os.read(read, 4096))
    print("read: %r" % os.read(read, 4096))

    The first read syscall succeed, even if the write end is already called. But the second read syscall fails with EIO.

    @mjpieters
    Copy link
    Mannequin

    mjpieters mannequin commented Nov 15, 2016

    The Python 2.7 issue (using fread without checking for interrupts) looks like a duplicate of http://bugs.python.org/issue1633941

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants