Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serious interpreter crash and/or arbitrary memory leak using .read() on writable file #49927

Closed
nneonneo mannequin opened this issue Apr 3, 2009 · 18 comments
Closed
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-IO type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@nneonneo
Copy link
Mannequin

nneonneo mannequin commented Apr 3, 2009

BPO 5677
Nosy @loewis, @pitrou, @benjaminp, @skrah, @pmp-p
Files
  • fileread-errno.patch: Handle EBADF on Windows.
  • errno-ebadf.patch
  • errno-ebadf-tests.patch
  • add-readable-writable.patch: Add readable/writable to file object.
  • add-readable-writable2.patch: Add readable/writable to file object Rename README to README.rst and enhance formatting #2.
  • add-readable-writable3.patch: Add readable/writable to file object bpo-29403: Fix mock's broken autospec behavior on method-bound builtin functions #3.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-02-05.17:13:40.973>
    created_at = <Date 2009-04-03.01:24:42.814>
    labels = ['interpreter-core', 'expert-IO', 'type-crash']
    title = 'Serious interpreter crash and/or arbitrary memory leak using .read() on writable file'
    updated_at = <Date 2010-02-05.17:13:40.972>
    user = 'https://bugs.python.org/nneonneo'

    bugs.python.org fields:

    activity = <Date 2010-02-05.17:13:40.972>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-02-05.17:13:40.973>
    closer = 'pitrou'
    components = ['Interpreter Core', 'IO']
    creation = <Date 2009-04-03.01:24:42.814>
    creator = 'nneonneo'
    dependencies = []
    files = ['16071', '16103', '16104', '16115', '16140', '16141']
    hgrepos = []
    issue_num = 5677
    keywords = ['patch']
    message_count = 18.0
    messages = ['85286', '85287', '98577', '98582', '98613', '98617', '98619', '98663', '98664', '98740', '98741', '98782', '98831', '98870', '98871', '98874', '98875', '98885']
    nosy_count = 6.0
    nosy_names = ['loewis', 'pitrou', 'nneonneo', 'benjamin.peterson', 'skrah', 'pmpp']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue5677'
    versions = ['Python 2.6', 'Python 2.7']

    @nneonneo
    Copy link
    Mannequin Author

    nneonneo mannequin commented Apr 3, 2009

    (tested and verified on Windows and Solaris SPARC)

    Running this code in Python 2.4, 2.5 or 2.6 (all minor versions)
    produces garbage.

    f=open("anyfile","w")
    f.write("garbage")
    f.readline()

    Mac OS X and Linux appear to simply throw an "IOError: [Errno 9] Bad
    file descriptor" exception, while using a read method without writing
    first produces the same exception on Windows and certain versions under
    Solaris.

    Under Solaris, it is further possible to segfault the interpreter with
    this code:
    f=open("anyfile","w")
    f.read()

    In the former case, it appears as if the data is simply read from the
    disk block containing the file. In the latter, I have no clue what is
    going on.

    In Python 3.0, file objects opened with "w" don't even support any .read
    methods, so this does not affect Py3k.

    @nneonneo nneonneo mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Apr 3, 2009
    @nneonneo
    Copy link
    Mannequin Author

    nneonneo mannequin commented Apr 3, 2009

    OK, it's not a memory leak, rather, it's something leaking from the
    files. However, this still winds up filling the affected file full of
    garbage, which is quite undesirable (and, I think, quite likely to be
    unwanted behaviour, considering that *some* read methods throw IOError 9
    and some others don't).

    @devdanzin devdanzin mannequin added the topic-IO label May 16, 2009
    @pmp-p
    Copy link
    Mannequin

    pmp-p mannequin commented Jan 30, 2010

    related:
    f=open('test','w+b')
    f.write('123456789ABCDEF')
    #f.seek(0)
    print "position",f.tell()
    print '[',len(f.read()),']'
    f.close()

    windows: 2.6.2 mingw/ 2.6.4 msvc builds
    len(f.read()) > 0 ( 4081 when i did the test ) but should be 0
    printing the buffer show binary garbage.

    linux is ok f.read() return '' as expected

    @nneonneo
    Copy link
    Mannequin Author

    nneonneo mannequin commented Jan 30, 2010

    It seems like this is actually a problem in Windows libc or something (tested using MinGW on Windows XP):

    #include <stdio.h>
    
    main() {
        FILE *f = fopen("test", "wb");
        fwrite("test", 1, 4, f);
        char buf[2048];
        size_t k = fread(buf, 1, 2048, f);
        printf("%d\n", k);
        int i=0;
        for(; i<k; i++) printf("%02x", buf[i]);
    }

    This causes a lot of garbage to be printed out. Removing the fwrite causes "0" to be printed with no further output.

    The garbage is not from the uninitialized buffer, since I've verified that the original contents of the buffer are not what is being printed out. Furthermore, adjusting 2048 produces a maximum output of 4092 bytes (even with 9999 in place of 2048).

    Short of simply denying read() on writable files, I don't really see an easy way around this libc bug.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jan 31, 2010

    I wonder if this is technically a bug. The stream is not opened for reading and yet you do an fread. I quickly glanced through the
    C-Standard and I did not find an _explicit_ paragraph that this is
    undefined behavior, but personally I'd expect it to be.

    Regarding the last C example. Also here, you do not open using "w+".
    Now, even if you used "w+", by the standard you'd have to do an fflush,
    fseek, fsetpos or rewind before reading. I don't see a libc bug.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jan 31, 2010

    Bug or not, this can be handled since fread sets EBADF on Windows.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 31, 2010

    The patch should come with a test.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Feb 1, 2010

    I see that a complete errno based solution would get messy. To avoid
    interfering with EAGAIN special cases, this would be needed:

    #if defined(EBADF)
    #define ERRNO_EBADF(x) ((x) == EBADF)
    #else
    #define ERRNO_EBADF(x) 0
    #endif

    Then, additional checks would need to be added to get_line,
    getline_via_fgets and a couple of other places.

    I think the right thing is to add a field f_modeflags to the file
    object, to be initialized on creation. Or use f->readable, f->writable
    like py3k. Thoughts?

    @pitrou
    Copy link
    Member

    pitrou commented Feb 1, 2010

    I think checking errno would be fine (provided it eliminates all variations ot this bug).

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Feb 2, 2010

    An errno+ferror approach works except for two cases:

    (1) MS fgets() does not consider reading from a write-only stream
    an error. While annoying, I think this is standard conforming.

    (2) OpenSolaris sets errno unreliably after getc.

    I do not know how to get around those without using OS specifics.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 2, 2010

    An errno+ferror approach works except for two cases:

    (1) MS fgets() does not consider reading from a write-only stream
    an error. While annoying, I think this is standard conforming.

    (2) OpenSolaris sets errno unreliably after getc.

    I do not know how to get around those without using OS specifics.

    Ok, then perhaps it's better to have some internal {readable, writable}
    flags based on the original mode string.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Feb 3, 2010

    The new patch takes over the logic from fileio.c. Tested on Linux, Windows x86/x64 and OpenSolaris.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 4, 2010

    Hmm, test_file.py is for the new IO library (backported from 3.x). You should use test_file2k.py for tests against the 2.x file object.

    Also, readinto() should take something such as a bytearray() as argument, not a string.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Feb 5, 2010

    Thanks Antoine, I fixed both issues.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 5, 2010

    Oh and the following line shouldn't be needed:

    data = b'xxx' if 'b' in mode else u'xxx'

    Since old-style file objects always take bytestrings (it's harmless, though, because the unicode string will be implicitly converted).

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Feb 5, 2010

    Ok, I'll fix that. Perhaps I should also remove the comment in err_mode(). I wonder if the ValueError in fileio.c is intentional:

    >>> import io
    >>> f = io.open("testfile", "w")
    >>> f.write("xxx")
    3
    >>> f.read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: not readable
    >>> 
    >>> f = io.FileIO("testfile", "w")
    >>> f.write("xxx")
    3
    >>> f.read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: File not open for reading

    @pitrou
    Copy link
    Member

    pitrou commented Feb 5, 2010

    I wonder if the ValueError in fileio.c is intentional:

    I don't know, but this would be the subject of a separate issue anyway.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 5, 2010

    The patch was committed in r77989 (trunk) and r77990 (2.6). Thank you Stefan!

    @pitrou pitrou closed this as completed Feb 5, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-IO type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant