classification
Title: scan_eol() Buffer Over-read
Type: security Stage: resolved
Components: Extension Modules, IO Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Arfrever, JohnLeitch, brycedarling, larry, martin.panter, python-dev, serhiy.storchaka, vstinner
Priority: release blocker Keywords: patch

Created on 2015-09-02 23:40 by JohnLeitch, last changed 2015-09-08 18:34 by Arfrever. This issue is now closed.

Files
File name Uploaded Description Edit
scan_eol_Buffer_Over-read.patch JohnLeitch, 2015-09-02 23:40 patch
scan_eol_Buffer_Over-read.py JohnLeitch, 2015-09-02 23:40 repro
scan_eol_Buffer_Over-read_2.patch serhiy.storchaka, 2015-09-03 06:10 review
Messages (11)
msg249584 - (view) Author: John Leitch (JohnLeitch) * Date: 2015-09-02 23:40
Python 3.5 suffers from a vulnerability caused by the behavior of the scan_eol() function. When called, the function gets a line from the buffer of a BytesIO object by searching for a newline character starting at the position in the buffer.

However, if the position is set to a value that is larger than the buffer, this logic will result in a call to memchr that reads off the end of the buffer:

    /* Move to the end of the line, up to the end of the string, s. */
    start = PyBytes_AS_STRING(self->buf) + self->pos;
    maxlen = self->string_size - self->pos;
    if (len < 0 || len > maxlen)
        len = maxlen;

    if (len) {
        n = memchr(start, '\n', len);


In some applications, it may be possible to exploit this behavior to disclose the contents of adjacent memory. The buffer over-read can be observed by running the following script:

import tempfile
a = tempfile.SpooledTemporaryFile()
a.seek(0b1)
a.readlines()

Which, depending on the arrangement of memory, may produce an exception such as this:

0:000> g
(698.188): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=fff8a14c ebx=0a0a0a0a ecx=00000000 edx=05bb1000 esi=061211b0 edi=89090909
eip=61c6caf2 esp=010af8dc ebp=010af914 iopl=0         nv up ei ng nz ac po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010292
python35!memchr+0x62:
61c6caf2 8b0a            mov     ecx,dword ptr [edx]  ds:002b:05bb1000=????????
0:000> k1
ChildEBP RetAddr  
010af8e0 61b640f1 python35!memchr+0x62 [f:\dd\vctools\crt_bld\SELF_X86\crt\src\INTEL\memchr.asm @ 125]


To fix this issue, it is recommended that scan_eol() be updated to check that the position is not greater than or equal to the size of the buffer. A proposed patch is attached.
msg249600 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-03 03:58
Simpler test case, which might find a place somewhere like /Lib/test/test_memoryio.py:

>>> from io import BytesIO
>>> b = BytesIO()
>>> b.seek(1)
1
>>> b.readlines()  # Should return an empty list
Segmentation fault (core dumped)
[Exit 139]

The patch looks like the right approach. Just watch out for the indentation (tabs vs spaces).

However I don’t see why we need the GET_BYTES_SIZE() macro or the (size_t) casting. Would it be okay comparing PyBytes_GET_SIZE() directly to self->pos? They are both meant to be Py_ssize_t.

The bug looks like it was introduced by Serhiy’s changes in Issue 15381.
msg249602 - (view) Author: John Leitch (JohnLeitch) * Date: 2015-09-03 04:31
We based our fix on the check in write_bytes:

    if (endpos > (size_t)PyBytes_GET_SIZE(self->buf)) {
        if (resize_buffer(self, endpos) < 0)
            return -1;
    }

I see now that our casting was extraneous. As for the macro, it was suspected that similar issues may be present and we wanted to write reusable code, but this also seems unnecessary now that it's known the cast is unneeded.

Early tomorrow I'll take some time to create a revised patch.
msg249606 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-09-03 06:10
It should be self->string_size, not PyBytes_GET_SIZE(self->buf).

Thank you for your report and your patch John. Here is revised patch with tests based on Martin's test.

Larry, perhaps this bug is grave enough to be fixed in RC3.
msg249610 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-03 06:54
scan_eol_Buffer_Over-read_2.patch looks good to me.

> Larry, perhaps this bug is grave enough to be fixed in RC3.

Since it looks like a regression in Python 3.5, yes, it's a severe bug. Please send a pull request to Larry for it.
msg249617 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-09-03 08:25
Yes, please create a pull request for this patch.  Thanks!

And just to confirm: I just applied patch 2 to CPython, then undid the change to bytesio.c.  The new test fails, and sometimes Python will segmentation fault.  If I then apply the patch to bytesio.c it passes.  Nice work!
msg249699 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-03 22:09
New changeset a5858c30db7c by Serhiy Storchaka in branch '3.5':
Issue #24989: Fixed buffer overread in BytesIO.readline() if a position is
https://hg.python.org/cpython/rev/a5858c30db7c

New changeset 215800fb955d by Serhiy Storchaka in branch 'default':
Issue #24989: Fixed buffer overread in BytesIO.readline() if a position is
https://hg.python.org/cpython/rev/215800fb955d
msg249711 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-09-04 04:54
https://bitbucket.org/larry/cpython350/pull-requests/13/issue-24989/diff
msg249717 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-09-04 05:13
Pull request accepted.  Please forward-merge.  Thanks!
msg249718 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-04 05:37
New changeset 2b6ce7e9595c by Serhiy Storchaka in branch '3.5':
Issue #24989: Fixed buffer overread in BytesIO.readline() if a position is
https://hg.python.org/cpython/rev/2b6ce7e9595c
msg249738 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-04 08:19
New changeset 07e04c34bab5 by Larry Hastings in branch '3.5':
Merged in storchaka/cpython350 (pull request #13)
https://hg.python.org/cpython/rev/07e04c34bab5
History
Date User Action Args
2015-09-08 18:34:24Arfreversetnosy: + Arfrever
2015-09-04 08:19:12python-devsetmessages: + msg249738
2015-09-04 05:37:08python-devsetmessages: + msg249718
2015-09-04 05:13:49larrysetstatus: open -> closed
resolution: fixed
messages: + msg249717

stage: patch review -> resolved
2015-09-04 04:54:22serhiy.storchakasetmessages: + msg249711
2015-09-03 22:09:58python-devsetnosy: + python-dev
messages: + msg249699
2015-09-03 08:25:36larrysetpriority: high -> release blocker
2015-09-03 08:25:31larrysetmessages: + msg249617
2015-09-03 06:54:41vstinnersetmessages: + msg249610
2015-09-03 06:11:06serhiy.storchakasetnosy: + larry
2015-09-03 06:10:40serhiy.storchakasetfiles: + scan_eol_Buffer_Over-read_2.patch

messages: + msg249606
components: + Extension Modules, IO
stage: patch review
2015-09-03 04:47:23serhiy.storchakasetpriority: normal -> high
assignee: serhiy.storchaka
versions: + Python 3.6
2015-09-03 04:31:34JohnLeitchsetmessages: + msg249602
2015-09-03 03:58:34martin.pantersetnosy: + martin.panter
messages: + msg249600
2015-09-02 23:46:17vstinnersetnosy: + vstinner, serhiy.storchaka
2015-09-02 23:40:38JohnLeitchsetfiles: + scan_eol_Buffer_Over-read.py
2015-09-02 23:40:26JohnLeitchcreate