classification
Title: PyNumber_Long Buffer Over-read.patch
Type: crash
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: JohnLeitch, eric.smith, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2015-08-06 03:18 by JohnLeitch, last changed 2015-11-04 13:55 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
PyNumber_Long_Buffer_Over-read.patch JohnLeitch, 2015-08-06 03:18 patch
PyNumber_Long_Buffer_Over-read.py JohnLeitch, 2015-08-06 03:19 repro
Messages (3)
msg248101 - Author: John Leitch (JohnLeitch) Date: 2015-08-06 03:18
Python suffers from a buffer over-read in PyNumber_Long() that is caused by the incorrect assumption that buffers returned by PyObject_GetBuffer() are null-terminated. This could potentially result in the disclosure of adjacent memory.
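
The missing terminator can be observed from Python itself. The following is a minimal sketch that is not part of the original report, and its variable names are illustrative. It exports the same 16-byte array used in the repro through the buffer protocol and checks that the view carries an explicit length but contains no NUL byte.

```python
import array

# Same object the repro passes to int(): 16 bytes of 0x41, no terminator.
buf = array.array("B", b"A" * 0x10)

# memoryview obtains the data through the buffer protocol, the Python-level
# counterpart of calling PyObject_GetBuffer() in C.
view = memoryview(buf)

nbytes = view.nbytes            # explicit length of the exported buffer
has_nul = 0 in view.tobytes()   # is there any NUL byte inside the buffer?

print(nbytes, has_nul)
```

The length is the only reliable bound on such a buffer, so any C code that scans it for a terminator depends on whatever bytes happen to follow it in memory.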

PyObject *
PyNumber_Long(PyObject *o)
{
    [...]

    if (PyObject_GetBuffer(o, &view, PyBUF_SIMPLE) == 0) { <<<< The unterminated buffer
                                                                is retrieved here.
        /* need to do extra error checking that PyLong_FromString()
         * doesn't do.  In particular int('9\x005') must raise an
         * exception, not truncate at the null.
         */
        PyObject *result = _PyLong_FromBytes(view.buf, view.len, 10); <<<< The buffer
                                                is then passed to _PyLong_FromBytes(),
                                                which ultimately passes it to
                                                PyLong_FromString().
        PyBuffer_Release(&view);
        return result;
    }

    return type_error("int() argument must be a string, a bytes-like object "
                      "or a number, not '%.200s'", o);
}

PyObject *
PyLong_FromString(const char *str, char **pend, int base)
{
    int sign = 1, error_if_nonzero = 0;
    const char *start, *orig_str = str;
    PyLongObject *z = NULL;
    PyObject *strobj;
    Py_ssize_t slen;

    [...]

  onError:
    if (pend != NULL)
        *pend = (char *)str;
    Py_XDECREF(z);
    slen = strlen(orig_str) < 200 ? strlen(orig_str) : 200; <<<< If this path is taken,
                                                        orig_str is pointing to the
                                                        unterminated string, resulting in
                                                        strlen reading off the end of the
                                                        buffer.
    strobj = PyUnicode_FromStringAndSize(orig_str, slen); <<<< The incorrect length is
                                                          then used to create a Python
                                                          string.
    if (strobj == NULL)
        return NULL;
    PyErr_Format(PyExc_ValueError,
                 "invalid literal for int() with base %d: %.200R",
                 base, strobj);
    Py_DECREF(strobj);
    return NULL;
}
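
To make the failure mode concrete, here is a small Python model of the error path above. It is only a sketch under stated assumptions: `memory` is a hypothetical stand-in for the heap, `c_strlen` imitates C's strlen(), and the `SECRET` bytes are invented to represent whatever happens to sit after the buffer.

```python
def c_strlen(memory: bytes, start: int) -> int:
    """Imitate C strlen(): count bytes from `start` up to the first NUL."""
    # bytes.index raises ValueError if no NUL exists, loosely mirroring a
    # scan that runs off the end of mapped memory.
    return memory.index(0, start) - start


# Hypothetical heap layout: the 16-byte unterminated buffer, then unrelated
# adjacent data, then the first NUL byte the scan happens to find.
buffer_len = 0x10
memory = b"A" * buffer_len + b"SECRET" + b"\x00"

# Model of: slen = strlen(orig_str) < 200 ? strlen(orig_str) : 200;
scanned = c_strlen(memory, 0)
slen = scanned if scanned < 200 else 200

# Bytes beyond the real buffer that would end up in the error message.
leaked = memory[buffer_len:slen]
print(slen, leaked)
```

In this model the computed length exceeds the real buffer length, so the error message would include the adjacent bytes. When the memory after the buffer is not mapped, as in the debugger session in this report, the same scan faults instead.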

A script that reproduces the issue is as follows:

import array
int(array.array("B",b"A"*0x10))

And it produces the following exception:

0:000> p
eax=00000000 ebx=5dbc4699 ecx=00000000 edx=00000000 esi=07ad6b00 edi=00000000
eip=5da07f7e esp=00e4f8f8 ebp=00e4f934 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
python35!PyNumber_Long+0x20e:
5da07f7e 6a0a            push    0Ah
0:000> db @@(view.buf)
096fefe8  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
096feff8  c0 c0 c0 c0 d0 d0 d0 d0-?? ?? ?? ?? ?? ?? ?? ??  ........????????
096ff008  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff018  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff028  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff038  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff048  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff058  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
0:000> g
Breakpoint 3 hit
eax=07aed7b0 ebx=0000000a ecx=07aed7a0 edx=07aed000 esi=096fefe8 edi=096fefe8
eip=5da3a55e esp=00e4f870 ebp=00e4f8c4 iopl=0         nv up ei pl nz ac po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000212
python35!PyLong_FromString+0x4ce:
5da3a55e 8b74244c        mov     esi,dword ptr [esp+4Ch] ss:002b:00e4f8bc=096fefe8
0:000> g
(648.e5c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=07aed7d0 ebx=0000000a ecx=096ff000 edx=096fefe9 esi=096fefe8 edi=096fefe8
eip=5da3a567 esp=00e4f870 ebp=00e4f8c4 iopl=0         nv up ei ng nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010282
python35!PyLong_FromString+0x4d7:
5da3a567 8a01            mov     al,byte ptr [ecx]          ds:002b:096ff000=??
0:000> db ecx-0x10
096feff0  41 41 41 41 41 41 41 41-c0 c0 c0 c0 d0 d0 d0 d0  AAAAAAAA........
096ff000  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff010  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff020  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff030  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff040  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff050  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
096ff060  ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ??  ????????????????
0:000> !analyze -v -nodb
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************


FAULTING_IP: 
python35!PyLong_FromString+4d7 [c:\build\cpython\objects\longobject.c @ 2293]
5da3a567 8a01            mov     al,byte ptr [ecx]

EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 5da3a567 (python35!PyLong_FromString+0x000004d7)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 096ff000
Attempt to read from address 096ff000

CONTEXT:  00000000 -- (.cxr 0x0;r)
eax=07aed7d0 ebx=0000000a ecx=096ff000 edx=096fefe9 esi=096fefe8 edi=096fefe8
eip=5da3a567 esp=00e4f870 ebp=00e4f8c4 iopl=0         nv up ei ng nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010282
python35!PyLong_FromString+0x4d7:
5da3a567 8a01            mov     al,byte ptr [ecx]          ds:002b:096ff000=??

FAULTING_THREAD:  00000e5c

DEFAULT_BUCKET_ID:  INVALID_POINTER_READ

PROCESS_NAME:  python.exe

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_PARAMETER1:  00000000

EXCEPTION_PARAMETER2:  096ff000

READ_ADDRESS:  096ff000 

FOLLOWUP_IP: 
python35!PyLong_FromString+4d7 [c:\build\cpython\objects\longobject.c @ 2293]
5da3a567 8a01            mov     al,byte ptr [ecx]

NTGLOBALFLAG:  2000000

APPLICATION_VERIFIER_FLAGS:  0

APP:  python.exe

ANALYSIS_VERSION: 6.3.9600.17029 (debuggers(dbg).140219-1702) x86fre

PRIMARY_PROBLEM_CLASS:  INVALID_POINTER_READ

BUGCHECK_STR:  APPLICATION_FAULT_INVALID_POINTER_READ

LAST_CONTROL_TRANSFER:  from 5da3a60d to 5da3a567

STACK_TEXT:  
00e4f8c4 5da3a60d 096fefe8 00e4f8e0 0000000a python35!PyLong_FromString+0x4d7
00e4f8e4 5da07f8b 096fefe8 00000010 0000000a python35!_PyLong_FromBytes+0x1d
00e4f934 5da3e2cb 07ad6b00 5dc30b98 5dc30b98 python35!PyNumber_Long+0x21b
00e4f958 5da54e08 5dc30b98 07ace630 00000000 python35!long_new+0xab
00e4f978 5da0947d 5dc30b98 07ace630 00000000 python35!type_call+0x38
00e4f994 5daa49cc 5dc30b98 07ace630 00000000 python35!PyObject_Call+0x6d
00e4f9c0 5daa449c 00000001 07ace630 00000083 python35!do_call+0x11c
00e4f9f0 5daa18d8 0662eab0 00000000 00000040 python35!call_function+0x36c
00e4fa68 5daa339f 0662eab0 00000000 092cfff0 python35!PyEval_EvalFrameEx+0x2318
00e4fab4 5dada142 0664ff58 00000000 00000000 python35!_PyEval_EvalCodeWithName+0x82f
00e4faf0 5dad9fd5 0664ff58 0664ff58 00e4fbbc python35!run_mod+0x42
00e4fb1c 5dad904a 07320fc8 07ae7cf0 00000101 python35!PyRun_FileExFlags+0x85
00e4fb60 5d9af037 07320fc8 07ae7cf0 00000001 python35!PyRun_SimpleFileExFlags+0x20a
00e4fb8c 5d9af973 00e4fbbc 65492100 65492108 python35!run_file+0xe7
00e4fc2c 653e72e5 00e4fc80 1c4b143f 00000002 python35!Py_Main+0x913
00e4fc80 76323744 7f444000 76323720 dbcbc723 ucrtbase!_initterm+0x85
00e4fc94 7789a064 7f444000 24309cee 00000000 KERNEL32!BaseThreadInitThunk+0x24
00e4fcdc 7789a02f ffffffff 778bd7e4 00000000 ntdll!__RtlUserThreadStart+0x2f
00e4fcec 00000000 1c4b14f7 7f444000 00000000 ntdll!_RtlUserThreadStart+0x1b


STACK_COMMAND:  .cxr 0x0 ; kb

FAULTING_SOURCE_LINE:  c:\build\cpython\objects\longobject.c

FAULTING_SOURCE_FILE:  c:\build\cpython\objects\longobject.c

FAULTING_SOURCE_LINE_NUMBER:  2293

FAULTING_SOURCE_CODE:  
  2289:   onError:
  2290:     if (pend != NULL)
  2291:         *pend = (char *)str;
  2292:     Py_XDECREF(z);
> 2293:     slen = strlen(orig_str) < 200 ? strlen(orig_str) : 200;
  2294:     strobj = PyUnicode_FromStringAndSize(orig_str, slen);
  2295:     if (strobj == NULL)
  2296:         return NULL;
  2297:     PyErr_Format(PyExc_ValueError,
  2298:                  "invalid literal for int() with base %d: %.200R",


SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  python35!PyLong_FromString+4d7

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: python35

IMAGE_NAME:  python35.dll

DEBUG_FLR_IMAGE_TIMESTAMP:  5598ccc2

FAILURE_BUCKET_ID:  INVALID_POINTER_READ_c0000005_python35.dll!PyLong_FromString

BUCKET_ID:  APPLICATION_FAULT_INVALID_POINTER_READ_python35!PyLong_FromString+4d7

ANALYSIS_SOURCE:  UM

FAILURE_ID_HASH_STRING:  um:invalid_pointer_read_c0000005_python35.dll!pylong_fromstring

FAILURE_ID_HASH:  {e857bdce-f7a2-f3d9-b507-e98e25bcd084}

Followup: MachineOwner
---------

To fix the issue, it is recommended that PyNumber_Long() check the type of argument o after a successful PyObject_GetBuffer() call to determine whether the buffer is null-terminated. A proposed patch is attached.
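
The goal of any such fix is that a strlen()-style scan never runs past the length reported by the buffer. The sketch below is a hedged Python model of one way to guarantee that, namely copying exactly that many bytes into a buffer that ends in NUL. It is not the attached patch, and the helper names and the `SECRET` bytes are hypothetical.

```python
def terminated_copy(memory: bytes, start: int, length: int) -> bytes:
    """Copy exactly `length` bytes and append a NUL terminator."""
    return memory[start:start + length] + b"\x00"


def error_text_length(terminated: bytes, limit: int = 200) -> int:
    """Model of `strlen(s) < 200 ? strlen(s) : 200` on a terminated buffer."""
    return min(terminated.index(0), limit)


# Hypothetical layout: a 16-byte unterminated buffer followed by other data.
memory = b"A" * 0x10 + b"SECRET" + b"\x00"

safe = terminated_copy(memory, 0, 0x10)
slen = error_text_length(safe)
shown = safe[:slen]   # what the error message would be built from
print(slen, shown)
```

Because the scan now stops at the terminator appended to the copy, the length used for the error message can never exceed the buffer's reported length.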
msg248102 - Author: John Leitch (JohnLeitch) Date: 2015-08-06 03:19
Attaching repro.
msg254053 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-11-04 13:55
Merged with issue24802.
History
Date User Action Args
2015-11-04 13:55:56 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg254053; stage: patch review -> resolved
2015-08-06 11:47:38 eric.smith set nosy: + eric.smith
2015-08-06 04:25:50 serhiy.storchaka set versions: + Python 2.7, Python 3.4, Python 3.6
2015-08-06 04:24:40 serhiy.storchaka set nosy: + serhiy.storchaka; assignee: serhiy.storchaka; components: + Interpreter Core; type: security -> crash; stage: patch review
2015-08-06 03:19:49 JohnLeitch set files: + PyNumber_Long_Buffer_Over-read.py; messages: + msg248102
2015-08-06 03:18:43 JohnLeitch create