This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author twouters
Recipients twouters
Date 2019-02-28.18:45:17
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
There is a bug in cPickle that makes it use uninitialised memory when reading a truncated pickle from a file (an actual C FILE*, not just any file-like object). Using MemorySanitizer:

% ./python
Python 2.7.15 (default, redacted, redacted)                                     [GCC 4.2.1 Compatible Clang (redacted)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile, cPickle
>>> f = tempfile.TemporaryFile()
>>> f.write('F')
>>> cPickle.load(f)
Uninitialized bytes in __interceptor_strlen at offset 1 inside [0x701000001b50, 3)
==23453==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x561a9110d51b in PyString_FromFormatV Objects/stringobject.c:241:22
    #1 0x561a912ba454 in PyErr_Format Python/errors.c:567:14
    #2 0x561a91303bc8 in PyOS_string_to_double Python/pystrtod.c
    #3 0x561a90b4a7d7 in load_float Modules/cPickle.c:3618:9
    #4 0x561a90b48ca7 in load Modules/cPickle.c:4779:17
    #5 0x561a90b56d9c in cpm_load Modules/cPickle.c:5758:11
    #6 0x561a91260b89 in call_function Python/ceval.c:4379:17

The problem is Modules/cPickle:readline_file end-of-file handling logic:

        for (; i < (self->buf_size - 1); i++) {
            if (feof(self->fp) ||
                (self->buf[i] = getc(self->fp)) == '\n') {
                self->buf[i + 1] = '\0';
                *s = self->buf;
                return i + 1;

When feof(self->pf) becomes true, the code skips over writing to self->buf[i] (which hasn't been written to yet), only writes to self->buf[i+1], and returns self->buf.

There is an additional problem that the code fails to check for the EOF return of getc(), which means it may write EOF-cast-to-a-char to self->buf[i] without realising it's meant to be an EOF. (EOF is usually -1, but I don't remember if that's a valid cast or undefined behaviour.) Theoretically this could cause invalid, truncated pickles to be read succesfully, but with the wrong value for the last item. (It could perhaps do even worse things with files that change while cPickle is reading them, but that's harder to reason about.)

(Use of uninitialised memory is a potential security issue, although so is using pickles, so I'm conflicted about the bug type...)
Date User Action Args
2019-02-28 18:45:18twouterssetrecipients: + twouters
2019-02-28 18:45:18twouterssetmessageid: <>
2019-02-28 18:45:18twouterslinkissue36149 messages
2019-02-28 18:45:17twouterscreate