
Various segfaults with dict #72132

Closed
tehybel mannequin opened this issue Sep 2, 2016 · 32 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments


tehybel mannequin commented Sep 2, 2016

BPO 27945
Nosy @birkenfeld, @rhettinger, @larryhastings, @tiran, @benjaminp, @ned-deily, @methane, @serhiy-storchaka, @Vgr255, @tim-mitchell
PRs
  • bpo-27945: Fixed various segfaults with dict. #1657
  • [3.6] bpo-27945: Fixed various segfaults with dict. (GH-1657) #1677
  • [3.5] bpo-27945: Fixed various segfaults with dict. (GH-1657) #1678
  • [2.7] bpo-27945: Fixed various segfaults with dict. (GH-1657) #1681
  • [3.4] bpo-27945: Fixed various segfaults with dict. (GH-1657) (GH-1678) #2248
  • [3.3] bpo-27945: Fixed various segfaults with dict. (GH-1657) (GH-1678) #2396
Files
  • 0001-Issue-27945-fix-PyDict_MergeFromSeq2-use-after-free.patch
  • 0001-Issue-27945-fix-dictitems_contains-use-after-free.patch
  • 0001-Issue-27945-fix-dictiter_iternextitem-use-after-free.patch
  • 0001-Issue-27945-Fixed-segfaults-in-dict.fromkeys-when-it.patch
  • 27945-dict-segv-py27.patch
  • 27945-dict-segv-py36.patch
  • 27945-dict-segv-py37-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2017-07-26.03:08:42.879>
    created_at = <Date 2016-09-02.21:33:28.588>
    labels = ['interpreter-core', '3.7', 'type-crash']
    title = 'Various segfaults with dict'
    updated_at = <Date 2019-05-10.18:07:27.519>
    user = 'https://bugs.python.org/tehybel'

    bugs.python.org fields:

    activity = <Date 2019-05-10.18:07:27.519>
    actor = 'ned.deily'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2017-07-26.03:08:42.879>
    closer = 'ned.deily'
    components = ['Interpreter Core']
    creation = <Date 2016-09-02.21:33:28.588>
    creator = 'tehybel'
    dependencies = []
    files = ['44579', '44582', '44585', '44587', '45500', '45549', '45550']
    hgrepos = []
    issue_num = 27945
    keywords = ['patch']
    message_count = 32.0
    messages = ['274275', '275349', '275897', '275901', '275919', '275924', '275951', '275957', '275960', '280931', '280932', '280937', '280940', '280943', '281034', '281035', '281037', '281039', '281228', '281229', '281328', '281944', '281946', '294022', '294024', '294025', '294035', '294037', '297456', '297474', '298159', '299197']
    nosy_count = 12.0
    nosy_names = ['georg.brandl', 'rhettinger', 'larry', 'christian.heimes', 'benjamin.peterson', 'ned.deily', 'duaneg', 'methane', 'serhiy.storchaka', 'abarry', 'tehybel', 'Tim Mitchell']
    pr_nums = ['1657', '1677', '1678', '1681', '2248', '2396']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue27945'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4', 'Python 3.5', 'Python 3.6', 'Python 3.7']


    tehybel mannequin commented Sep 2, 2016

    Here I'll describe five distinct issues I found. Common to them all is that they
    reside in the built-in dictionary object.

    Four of them are use-after-frees and one is an array-out-of-bounds indexing bug.

    All of the described functions reside in /Objects/dictobject.c.

    Issue 1: use-after-free when initializing a dictionary

    Initialization of a dictionary happens via the function dict_init which calls
    dict_update_common. From there, PyDict_MergeFromSeq2 may be called, and that is
    where this issue resides.

    In PyDict_MergeFromSeq2 we retrieve a sequence of size 2 with this line:

    	fast = PySequence_Fast(item, "");

    After checking its size, we take out a key and value:

    	key = PySequence_Fast_GET_ITEM(fast, 0);
    	value = PySequence_Fast_GET_ITEM(fast, 1);

    Then we call PyDict_GetItem. This can call back into Python code if the key has a
    __hash__ method. From there the "item" sequence could get modified, resulting
    in "key" or "value" being used after having been freed.

    Here's a PoC:

    ---

    class X:
        def __hash__(self):
            pair[:] = []
            return 13
    
    pair = [X(), 123]
    dict([pair])

    It crashes while trying to use freed memory as a PyObject:

    (gdb) run ./poc24.py
    Program received signal SIGSEGV, Segmentation fault.
    0x000000000048fe25 in insertdict (mp=mp@entry=0x7ffff6d5c4b8, key=key@entry=0x7ffff6d52538, hash=0xd,
    value=value@entry=0x8d1ac0 <small_ints+6144>) at Objects/dictobject.c:831
    831 MAINTAIN_TRACKING(mp, key, value);
    (gdb) print *key
    $26 = {_ob_next = 0xdbdbdbdbdbdbdbdb, _ob_prev = 0xdbdbdbdbdbdbdbdb, ob_refcnt = 0xdbdbdbdbdbdbdbdb,
    ob_type = 0xdbdbdbdbdbdbdbdb}
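    For reference, on an interpreter that includes the fix from this issue, the same
    PoC should run to completion instead of crashing, because the key and value are
    kept alive across the __hash__ callback. A minimal sketch, assuming a fixed build:

    ```python
    class X:
        def __hash__(self):
            # Emptying the pair mid-hash used to free the borrowed
            # key/value inside PyDict_MergeFromSeq2.
            pair[:] = []
            return 13

    pair = [X(), 123]
    result = dict([pair])  # segfaulted before the fix

    # The key and value were fetched (and are now owned) before the
    # mutation, so the entry is still inserted.
    assert len(result) == 1
    assert list(result.values()) == [123]
    ```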

    Issue 2: use-after-free in dictitems_contains

    In the function dictitems_contains we call PyDict_GetItem to look up a value in
    the dictionary:

    	found = PyDict_GetItem((PyObject *)dv->dv_dict, key);

    However, this "found" reference is borrowed. We then go ahead and compare it:

    return PyObject_RichCompareBool(value, found, Py_EQ);
    

    But PyObject_RichCompareBool could call back into Python code and e.g. release
    the GIL. As a result, the dictionary may be mutated. Thus "found" could get
    freed.

    Then, inside PyObject_RichCompareBool (actually in do_richcompare), the "found"
    variable gets used after being freed.

    PoC:

    ---

    class X:
        def __eq__(self, other):
            d.clear()
            return NotImplemented
    
    d = {0: set()}
    (0, X()) in d.items()

    Result:

    (gdb) run ./poc25.py
    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004a03b6 in do_richcompare (v=v@entry=0x7ffff6d52468, w=w@entry=0x7ffff6ddf7c8, op=op@entry=0x2)
    at Objects/object.c:673
    673 if (!checked_reverse_op && (f = w->ob_type->tp_richcompare) != NULL) {
    (gdb) print w->ob_type
    $26 = (struct _typeobject *) 0xdbdbdbdbdbdbdbdb
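    On a fixed interpreter, the same membership test should simply return False
    instead of crashing, since X() == set() falls through to the identity
    comparison. A sketch, assuming a fixed build:

    ```python
    class X:
        def __eq__(self, other):
            d.clear()               # mutate the dict mid-comparison
            return NotImplemented   # force the reflected comparison

    d = {0: set()}
    ok = (0, X()) in d.items()      # segfaulted before the fix

    # Neither side implements the comparison, so Py_EQ falls back to
    # identity and the test is simply False; "found" stays alive.
    assert ok is False
    ```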

    Issue 3: use-after-free in dict_equal

    In the function dict_equal, we call the "lookdict" function via
    b->ma_keys->dk_lookup to look up a value:

    if ((b->ma_keys->dk_lookup)(b, key, ep->me_hash, &vaddr) == NULL)
    

    This value's address is stored into the "vaddr" variable and the value is
    fetched into the "bval" variable:

    bval = *vaddr;
    

    Then we call Py_DECREF(key) which can call back into Python code. This could
    release the GIL and mutate dictionary b. Therefore "bval" could become freed at
    this point. We then proceed to use "bval":

    	cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);

    This results in a use-after-free.

    PoC:

    ---

    class X():
        def __del__(self): 
            dict_b.clear()
        def __eq__(self, other):
            dict_a.clear()
            return True
        def __hash__(self): 
            return 13
            
    dict_a = {X(): 0}
    dict_b = {X(): X()}
    dict_a == dict_b

    Result:

    (gdb) run ./poc26.py
    Program received signal SIGSEGV, Segmentation fault.
    PyType_IsSubtype (a=0xdbdbdbdbdbdbdbdb, b=0x87ec60 <PyLong_Type>)
    at Objects/typeobject.c:1343
    1343 mro = a->tp_mro;
    (gdb) print a
    $59 = (PyTypeObject *) 0xdbdbdbdbdbdbdbdb
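    The underlying mechanism is worth spelling out: in CPython, dropping the last
    reference runs __del__ synchronously, so even a bare Py_DECREF can execute
    arbitrary Python code. A pure-Python illustration of that timing (the names
    here are illustrative, not from the report):

    ```python
    events = []

    class Noisy:
        def __del__(self):
            events.append("del ran")   # arbitrary code at refcount-zero time

    d = {1: Noisy()}
    d[1] = None   # replacing the value drops the last reference

    # The finalizer ran synchronously, inside the assignment; this is
    # exactly the window that dict_equal's Py_DECREF(key) opens.
    assert events == ["del ran"]
    ```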

    Issue 4: use-after-free in _PyDict_FromKeys

    The function _PyDict_FromKeys takes an iterable as argument. If the iterable is
    a dict, _PyDict_FromKeys loops over it like this:

    	while (_PyDict_Next(iterable, &pos, &key, &oldvalue, &hash)) {
    		if (insertdict(mp, key, hash, value)) {
    			...
    		}
    	}

    However if we look at the comment for PyDict_Next, we see this:

     * CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
     * mutates the dict.
    

    But insertdict can call into Python code which might mutate the dict. In that
    case we get a use-after-free of the "key" variable.

    Here's a PoC:

    ---

    class X(int):
        def __hash__(self):
            return 13 
        def __eq__(self, other):
            if len(d) > 1:
                d.clear()
            return False
    
    d = {}
    d = {X(1): 1, X(2): 2}
    x = {}.fromkeys(d)

    And the result:

    (gdb) run ./poc27.py
    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000435122 in visit_decref (op=0x7ffff6d5ca68, data=0x0) at Modules/gcmodule.c:373
    373 if (PyObject_IS_GC(op)) {
    (gdb) print *op
    $115 = {_ob_next = 0xdbdbdbdbdbdbdbdb, _ob_prev = 0xdbdbdbdbdbdbdbdb, ob_refcnt = 0xdbdbdbdbdbdbdbdb,
    ob_type = 0xdbdbdbdbdbdbdbdb}

    An almost identical issue also exists further down in the function when calling
    _PySet_NextEntry. To see this crash, just change "d" to be a set in the PoC
    above:

    	d = set()
    	d = set([X(1), X(2)])

    This likewise crashes with a use-after-free.

    (Note: if you grep for PyDict_Next you will find more similar cases, although
    many are in obscure modules or deprecated functions. I'm not sure those are
    worth fixing? E.g. here's a crasher for BaseException_setstate which also calls
    PyDict_Next:

    ---

    class X(str):
        def __hash__(self):
            d.clear()
            return 13
    
    d = {}
    d[X()] = X()
    
    e = Exception()
    e.__setstate__(d)

    end note.)
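    With insertdict() guarded (the approach eventually merged), the dict.fromkeys
    PoC above should run to completion. A sketch of the expected behavior,
    assuming a fixed CPython build; the resulting contents follow from the
    collision handling described above:

    ```python
    class X(int):
        def __hash__(self):
            return 13            # force a collision so __eq__ runs on insert
        def __eq__(self, other):
            if len(d) > 1:
                d.clear()        # mutate the source dict mid-fromkeys
            return False

    d = {}                       # must exist before X.__eq__ can run
    d = {X(1): 1, X(2): 2}
    x = {}.fromkeys(d)           # segfaulted before the fix

    # The colliding insert ran __eq__, which emptied the source dict,
    # but the already-fetched keys survive in the result.
    assert len(d) == 0
    assert len(x) == 2
    ```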

    Issue 5: out-of-bounds indexing in dictiter_iternextitem

    The function dictiter_iternextitem is used to iterate over a dictionary's items.
    dictiter_iternextitem is careful to check that the dictionary did not change
    size during iteration. However after performing this check, it calls Py_DECREF:

    	Py_DECREF(PyTuple_GET_ITEM(result, 0));
    	Py_DECREF(PyTuple_GET_ITEM(result, 1));

    This can execute Python code and mutate the dict. If that happens, the index "i"
    previously computed by dictiter_iternextitem could become invalid. It would then
    index out of bounds with this line:

    	key = d->ma_keys->dk_entries[i].me_key;

    Furthermore the "value_ptr" variable would have gone stale, too. Taking the
    "value" variable out of it uses memory that has been freed:

    value = *value_ptr;
    

    Here's a PoC which crashes with the "value" variable being an arbitrary pointer:

    ---

    class X(int):
        def __del__(self):
            d.clear()
        
    d = {i: X(i) for i in range(8)}
        
    for result in d.items():
        if result[0] == 2:
            d[2] = None # free d[2] --> X(2).__del__ is called

    The result:

    (gdb) run ./poc29.py
    Program received signal SIGSEGV, Segmentation fault.
    dictiter_iternextitem (di=0x7ffff6d49cd8) at Objects/dictobject.c:3187
    3187 Py_INCREF(key);
    (gdb) print value
    $12 = (PyObject *) 0x7b7b7b7b7b7b7b7b
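    On a fixed interpreter the same pattern should fail safely: the iterator
    notices the size change on the next step and raises RuntimeError rather than
    indexing freed memory. A sketch, assuming a fixed CPython build:

    ```python
    class X(int):
        def __del__(self):
            d.clear()   # runs when the loop drops its (key, value) tuple

    d = {i: X(i) for i in range(8)}

    error = None
    try:
        for result in d.items():
            if result[0] == 2:
                d[2] = None   # old X(2) is freed once `result` is rebound
    except RuntimeError as exc:
        error = exc

    # clear() shrank the dict mid-iteration; the iterator reports it
    # instead of reading a stale entry.
    assert isinstance(error, RuntimeError)
    ```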

    @tehybel tehybel mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 2, 2016
    @Vgr255 Vgr255 mannequin changed the title five dictobject issues Various segfaults with dict Sep 2, 2016
    @Vgr255 Vgr255 mannequin added the type-crash A hard crash of the interpreter, possibly with a core dump label Sep 2, 2016

    Vgr255 mannequin commented Sep 9, 2016

    Ping. The built-in dict was considerably changed in bpo-27350; do any of these issues still persist?


    duaneg mannequin commented Sep 12, 2016

    I cannot reproduce the segfaults for the first four issues; however, valgrind still reports problems for all but the second. The fifth (last) one still segfaults.

    I have a patch for the fifth issue. The other remaining issues are all reporting the same invalid read at exit. I'm not sure whether that is the same issue reported here or something else that they all trigger:

    Invalid read of size 8
    at 0x43C5FA: delete_garbage (gcmodule.c:868)
    by 0x43C5FA: collect (gcmodule.c:1019)
    by 0x43D2F0: _PyGC_CollectNoFail (gcmodule.c:1623)
    by 0x55D95E: PyImport_Cleanup (import.c:420)
    by 0x4224E5: Py_FinalizeEx.part.3 (pylifecycle.c:612)
    by 0x43B0CB: Py_Main (main.c:798)
    by 0x41DC71: main (python.c:69)


    duaneg mannequin commented Sep 12, 2016

    Apologies: compiling Python with --with-pydebug, all of these issues are reproducible on head after all. Furthermore, while my patch fixes the reported crash, it still crashes on exit:

    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000437193 in visit_decref (op=0x7ffff68a4c98, data=0x0) at Modules/gcmodule.c:374
    374 if (PyObject_IS_GC(op)) {
    (gdb) bt
    #0 0x0000000000437193 in visit_decref (op=0x7ffff68a4c98, data=0x0) at Modules/gcmodule.c:374
    #1 0x00000000004a9112 in tupletraverse (o=0x7ffff68a49f8, visit=0x43716d <visit_decref>, arg=0x0) at Objects/tupleobject.c:571
    #2 0x000000000043690a in subtract_refs (containers=containers@entry=0x87cda0 <generations+64>) at Modules/gcmodule.c:399
    #3 0x0000000000437ac3 in collect (generation=generation@entry=2, n_collected=n_collected@entry=0x7fffffffd838, n_uncollectable=n_uncollectable@entry=0x7fffffffd840,
    nofail=nofail@entry=0) at Modules/gcmodule.c:956
    #4 0x0000000000437f57 in collect_with_callback (generation=generation@entry=2) at Modules/gcmodule.c:1128
    #5 0x00000000004383a6 in PyGC_Collect () at Modules/gcmodule.c:1592
    #6 0x00000000004383cf in _PyGC_CollectIfEnabled () at Modules/gcmodule.c:1605
    #7 0x0000000000420c76 in Py_FinalizeEx () at Python/pylifecycle.c:603
    #8 0x000000000043682b in Py_Main (argc=argc@entry=2, argv=argv@entry=0x90d010) at Modules/main.c:798
    #9 0x000000000041d153 in main (argc=2, argv=0x7fffffffdae8) at ./Programs/python.c:69


    duaneg mannequin commented Sep 12, 2016

    Fix for the first issue: with this fix there is no segfault or valgrind issue reported during execution or on exit.


    duaneg mannequin commented Sep 12, 2016

    Fix for the second issue: with this fix there is no segfault or valgrind issue reported during execution or on exit.


    duaneg mannequin commented Sep 12, 2016

    Ah, my first fix (for the fifth issue) was incomplete. Please see attached patch which I think correctly fixes the problem.

    @serhiy-storchaka serhiy-storchaka self-assigned this Sep 12, 2016

    tim-mitchell mannequin commented Sep 12, 2016

    Here is my patch for parts 3 and 4.
    The core issue for part 4 appears to be that dk_lookup can call arbitrary Python code, which may free the key.
    dk_lookup is also used in _PyDict_LoadGlobal; I'm not sure whether this bug can occur there.


    duaneg mannequin commented Sep 12, 2016

    Note that I think most or all of these issues apply to 2.7, and while I didn't do a proper check, I think the fixes also apply.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Oct 9, 2016

    methane commented Nov 16, 2016

    0001-Issue-27945-fix-PyDict_MergeFromSeq2-use-after-free.patch:

    LGTM. I've checked that it can be applied to the 2.7 and 3.5 branches and passes
    make quicktest.


    methane commented Nov 16, 2016

    0001-Issue-27945-fix-dictitems_contains-use-after-free.patch

    LGTM.
    This patch can be applied to 2.7 and 3.5 without conflicting with the previous patch.
    It passes make quicktest.


    methane commented Nov 16, 2016

    0001-Issue-27945-fix-dictiter_iternextitem-use-after-free.patch LGTM and is OK too.
    But 0001-Issue-27945-Fixed-segfaults-in-dict.fromkeys-when-it.patch causes a conflict.

    I want to commit the first three patches.
    For other reviewers, here is a patch merging those three. Please review it.

    @serhiy-storchaka

    I worry about performance. Additional increments/decrements of reference counters can add significant overhead in critical places of the Python interpreter. Before committing any patches we should measure the performance loss.


    methane commented Nov 16, 2016

    OK, I'll run the benchmark this week.
    But the three patches don't seem to affect performance-critical APIs.


    methane commented Nov 17, 2016

    I ran performance 0.5.0 on Python 3.5.
    Since it took a long time even without the -b all option, I haven't run it for Python 2.7 yet.

    On Python 3.5:

    $ ./venv/cpython3.5-846d5b1f0b61/bin/python -m perf compare_to py35-master.json py35-patched.json  -G
    Slower (14):
    - logging_silent: 2.92 us +- 0.15 us -> 3.05 us +- 0.18 us: 1.04x slower
    - pickle: 93.2 us +- 4.7 us -> 95.8 us +- 5.0 us: 1.03x slower
    - python_startup: 72.6 ms +- 4.7 ms -> 74.6 ms +- 4.8 ms: 1.03x slower
    - meteor_contest: 766 ms +- 13 ms -> 779 ms +- 15 ms: 1.02x slower
    - scimark_sor: 2.00 sec +- 0.04 sec -> 2.04 sec +- 0.03 sec: 1.02x slower
    - fannkuch: 3.86 sec +- 0.05 sec -> 3.91 sec +- 0.04 sec: 1.01x slower
    - python_startup_no_site: 37.4 ms +- 2.4 ms -> 38.0 ms +- 2.6 ms: 1.01x slower
    - regex_dna: 958 ms +- 13 ms -> 973 ms +- 16 ms: 1.01x slower
    - regex_compile: 1.41 sec +- 0.02 sec -> 1.43 sec +- 0.02 sec: 1.01x slower
    - raytrace: 5.46 sec +- 0.06 sec -> 5.52 sec +- 0.07 sec: 1.01x slower
    - scimark_lu: 1.88 sec +- 0.09 sec -> 1.90 sec +- 0.12 sec: 1.01x slower
    - nbody: 887 ms +- 19 ms -> 895 ms +- 21 ms: 1.01x slower
    - 2to3: 3.01 sec +- 0.03 sec -> 3.03 sec +- 0.04 sec: 1.01x slower
    - scimark_fft: 2.52 sec +- 0.05 sec -> 2.54 sec +- 0.04 sec: 1.01x slower

    Faster (16):

    • scimark_sparse_mat_mult: 35.0 ms +- 1.8 ms -> 32.7 ms +- 2.0 ms: 1.07x faster
    • xml_etree_process: 1.13 sec +- 0.03 sec -> 1.07 sec +- 0.03 sec: 1.06x faster
    • xml_etree_generate: 1.35 sec +- 0.03 sec -> 1.29 sec +- 0.03 sec: 1.05x faster
    • mako: 154 ms +- 8 ms -> 149 ms +- 7 ms: 1.04x faster
    • telco: 87.4 ms +- 3.7 ms -> 84.2 ms +- 4.1 ms: 1.04x faster
    • tornado_http: 1.75 sec +- 0.03 sec -> 1.70 sec +- 0.03 sec: 1.03x faster
    • call_simple: 45.5 ms +- 2.1 ms -> 44.3 ms +- 1.4 ms: 1.03x faster
    • sympy_integrate: 202 ms +- 9 ms -> 198 ms +- 7 ms: 1.02x faster
    • sympy_str: 2.08 sec +- 0.06 sec -> 2.04 sec +- 0.06 sec: 1.02x faster
    • django_template: 1.65 sec +- 0.03 sec -> 1.61 sec +- 0.03 sec: 1.02x faster
    • call_method_unknown: 65.1 ms +- 0.9 ms -> 64.3 ms +- 0.9 ms: 1.01x faster
    • xml_etree_parse: 1.34 sec +- 0.04 sec -> 1.32 sec +- 0.04 sec: 1.01x faster
    • sympy_sum: 1.01 sec +- 0.03 sec -> 998 ms +- 35 ms: 1.01x faster
    • call_method_slots: 59.3 ms +- 0.8 ms -> 58.6 ms +- 1.0 ms: 1.01x faster
    • float: 1.24 sec +- 0.03 sec -> 1.23 sec +- 0.02 sec: 1.01x faster
    • chaos: 1.20 sec +- 0.02 sec -> 1.19 sec +- 0.02 sec: 1.01x faster

    Benchmark hidden because not significant (31): call_method, chameleon, crypto_pyaes, deltablue, dulwich_log, genshi_text, genshi_xml, go, hexiom, html5lib, json_dumps, json_loads, logging_format, logging_simple, nqueens, pathlib, pickle_dict, pickle_list, pickle_pure_python, pidigits, regex_effbot, regex_v8, richards, scimark_monte_carlo, spectral_norm, sympy_expand, unpack_sequence, unpickle, unpickle_list, unpickle_pure_python, xml_etree_iterparse


    methane commented Nov 17, 2016

    The only patch that affects a hot loop is:

    --- a/Objects/dictobject.c	Tue Nov 15 21:21:35 2016 -0500
    +++ b/Objects/dictobject.c	Wed Nov 16 11:40:51 2016 +0000
    @@ -1550,11 +1550,18 @@ PyDict_MergeFromSeq2(PyObject *d, PyObje
             /* Update/merge with this (key, value) pair. */
             key = PySequence_Fast_GET_ITEM(fast, 0);
             value = PySequence_Fast_GET_ITEM(fast, 1);
    +        Py_INCREF(key);
    +        Py_INCREF(value);
             if (override || PyDict_GetItem(d, key) == NULL) {
                 int status = PyDict_SetItem(d, key, value);
    -            if (status < 0)
    +            if (status < 0) {
    +                Py_DECREF(key);
    +                Py_DECREF(value);
                     goto Fail;
    +            }
             }
    +        Py_DECREF(key);
    +        Py_DECREF(value);
             Py_DECREF(fast);
             Py_DECREF(item);
         }

    Microbenchmark for it:

    $ ~/local/py35/bin/master -m perf timeit --rigorous -t --python ~/local/py35/bin/patched --compare-to ~/local/py35/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict.fromkeys(L)'
    Benchmark master
    ================

    .........................................Total duration: 21.2 sec
    Start date: 2016-11-17 12:21:39
    End date: 2016-11-17 12:22:13
    Raw sample minimum: 109 ms
    Raw sample maximum: 144 ms

    Number of runs: 41
    Total number of samples: 120
    Number of samples per run: 3
    Number of warmups per run: 1
    Loop iterations per sample: 32

    Minimum: 3.41 ms (-13%)
    Median +- std dev: 3.92 ms +- 0.19 ms
    Mean +- std dev: 3.92 ms +- 0.19 ms
    Maximum: 4.50 ms (+15%)

    Median +- std dev: 3.92 ms +- 0.19 ms

    Benchmark patched
    =================

    .........................................Total duration: 21.3 sec
    Start date: 2016-11-17 12:22:13
    End date: 2016-11-17 12:22:47
    Raw sample minimum: 108 ms
    Raw sample maximum: 152 ms

    Number of runs: 41
    Total number of samples: 120
    Number of samples per run: 3
    Number of warmups per run: 1
    Loop iterations per sample: 32

    Minimum: 3.39 ms (-14%)
    Median +- std dev: 3.92 ms +- 0.23 ms
    Mean +- std dev: 3.95 ms +- 0.23 ms
    Maximum: 4.74 ms (+21%)

    Median +- std dev: 3.92 ms +- 0.23 ms

    Compare
    =======

    Median +- std dev: [master] 3.92 ms +- 0.19 ms -> [patched] 3.92 ms +- 0.23 ms: 1.00x slower
    Not significant!


    methane commented Nov 17, 2016

    I'm sorry; dict.fromkeys() doesn't use PyDict_MergeFromSeq2().

    This may be a microbenchmark for the worst case:

    $ ~/local/py35/bin/master -m perf timeit --rigorous  --python ~/local/py35/bin/patched --compare-to ~/local/py35/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
    master: ......................................... 2.06 ms +- 0.11 ms
    patched: ......................................... 2.16 ms +- 0.09 ms

    Median +- std dev: [master] 2.06 ms +- 0.11 ms -> [patched] 2.16 ms +- 0.09 ms: 1.05x slower

    $ ~/local/py27/bin/master -m perf timeit --rigorous  --python ~/local/py27/bin/patched --compare-to ~/local/py27/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
    master: ......................................... 1.48 ms +- 0.06 ms
    patched: ......................................... 1.57 ms +- 0.09 ms

    Median +- std dev: [master] 1.48 ms +- 0.06 ms -> [patched] 1.57 ms +- 0.09 ms: 1.06x slower


    methane commented Nov 17, 2016

    I modified the patch to avoid the incref/decref when the pair is not a list, because a tuple is common in that case.
    But I can't get back the original performance.

    (python 2.7 is modified version of patch)

    inada-n@test1:~/work/bench$ ~/local/py27/bin/master -m perf timeit --rigorous --python ~/local/py27/bin/python2.7 --compare-to ~/local/py27/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
    master: ......................................... 1.47 ms +- 0.06 ms
    python2.7: ......................................... 1.55 ms +- 0.06 ms

    Median +- std dev: [master] 1.47 ms +- 0.06 ms -> [python2.7] 1.55 ms +- 0.06 ms: 1.05x slower
    inada-n@test1:~/work/bench$ ~/local/py27/bin/master -m perf timeit --rigorous --python ~/local/py27/bin/python2.7 --compare-to ~/local/py27/bin/patched -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
    patched: ......................................... 1.56 ms +- 0.08 ms
    python2.7: ......................................... 1.55 ms +- 0.09 ms

    Median +- std dev: [patched] 1.56 ms +- 0.08 ms -> [python2.7] 1.55 ms +- 0.09 ms: 1.01x faster
    Not significant!

    @serhiy-storchaka

    Here is a consolidated patch for review.

    @serhiy-storchaka

    The change to dict_equal() LGTM. It doesn't add an overhead.

    For dictiter_iternextitem() I propose another change. It doesn't add an overhead.

    There are bugs in the patch for _PyDict_FromKeys().

    The change to dictitems_contains() adds an overhead, but it is small and seems unavoidable.

    I wonder whether it is worth using PySequence_Tuple() instead of PySequence_Fast() in PyDict_MergeFromSeq2(). This would add the cost of creating a new tuple if the items are not tuples, but would save increfing/decrefing the key and the value in the common case.

    I have doubts about insertdict(). The additional incref duplicates increfs in dict_merge(). Is it worth moving it outside of insertdict()?

    I have doubts about _PyDict_FromKeys(). It seems to me that it is safe to use _PyDict_Next() in a loop that mutates the dict (I haven't checked _PySet_Next()). With a guarded insertdict() the additional check is not needed (and it doesn't help against clearing the dict inside insertdict()).

    In the updated patch I fixed some bugs and used a different solution for dictiter_iternextitem().


    methane commented Nov 21, 2016

    LGTM.

    Performance on Azure VM (AMD Opteron(tm) Processor 4171 HE):

    $ ~/local/py36/bin/patched -m perf compare_to master.json patched.json -G
    Slower (10):
    - spectral_norm: 915 ms +- 17 ms -> 967 ms +- 25 ms: 1.06x slower
    - nbody: 774 ms +- 28 ms -> 805 ms +- 22 ms: 1.04x slower
    - regex_dna: 965 ms +- 18 ms -> 993 ms +- 22 ms: 1.03x slower
    - go: 1.93 sec +- 0.05 sec -> 1.99 sec +- 0.06 sec: 1.03x slower
    - python_startup_no_site: 29.7 ms +- 2.0 ms -> 30.5 ms +- 2.0 ms: 1.03x slower
    - xml_etree_parse: 1.02 sec +- 0.03 sec -> 1.04 sec +- 0.04 sec: 1.02x slower
    - python_startup: 52.7 ms +- 3.6 ms -> 53.7 ms +- 3.6 ms: 1.02x slower
    - xml_etree_generate: 962 ms +- 24 ms -> 978 ms +- 25 ms: 1.02x slower
    - pickle_dict: 188 us +- 8 us -> 191 us +- 8 us: 1.02x slower
    - scimark_fft: 2.19 sec +- 0.04 sec -> 2.20 sec +- 0.05 sec: 1.00x slower

    Faster (26):

    • fannkuch: 3.76 sec +- 0.07 sec -> 3.55 sec +- 0.09 sec: 1.06x faster
    • call_method_slots: 59.4 ms +- 9.0 ms -> 56.6 ms +- 3.4 ms: 1.05x faster
    • sqlite_synth: 27.6 us +- 1.4 us -> 26.4 us +- 1.2 us: 1.05x faster
    • nqueens: 852 ms +- 19 ms -> 813 ms +- 20 ms: 1.05x faster
    • django_template: 1.35 sec +- 0.04 sec -> 1.29 sec +- 0.03 sec: 1.05x faster
    • unpack_sequence: 407 ns +- 16 ns -> 390 ns +- 16 ns: 1.04x faster
    • call_method: 60.6 ms +- 1.4 ms -> 58.1 ms +- 1.6 ms: 1.04x faster
    • xml_etree_iterparse: 918 ms +- 29 ms -> 881 ms +- 32 ms: 1.04x faster
    • unpickle_pure_python: 3.44 ms +- 0.14 ms -> 3.31 ms +- 0.19 ms: 1.04x faster
    • richards: 608 ms +- 20 ms -> 585 ms +- 15 ms: 1.04x faster
    • chaos: 921 ms +- 22 ms -> 891 ms +- 24 ms: 1.03x faster
    • mako: 189 ms +- 8 ms -> 183 ms +- 8 ms: 1.03x faster
    • meteor_contest: 699 ms +- 23 ms -> 677 ms +- 26 ms: 1.03x faster
    • regex_compile: 1.52 sec +- 0.03 sec -> 1.48 sec +- 0.04 sec: 1.03x faster
    • chameleon: 101 ms +- 5 ms -> 97.8 ms +- 4.0 ms: 1.03x faster
    • regex_v8: 153 ms +- 7 ms -> 149 ms +- 6 ms: 1.03x faster
    • 2to3: 2.59 sec +- 0.05 sec -> 2.52 sec +- 0.05 sec: 1.03x faster
    • crypto_pyaes: 797 ms +- 20 ms -> 776 ms +- 18 ms: 1.03x faster
    • genshi_xml: 653 ms +- 23 ms -> 637 ms +- 27 ms: 1.03x faster
    • pickle_pure_python: 4.54 ms +- 0.19 ms -> 4.42 ms +- 0.20 ms: 1.03x faster
    • regex_effbot: 18.4 ms +- 0.7 ms -> 18.1 ms +- 0.6 ms: 1.02x faster
    • unpickle: 107 us +- 4 us -> 105 us +- 5 us: 1.02x faster
    • deltablue: 61.2 ms +- 3.4 ms -> 60.0 ms +- 3.5 ms: 1.02x faster
    • raytrace: 4.34 sec +- 0.14 sec -> 4.26 sec +- 0.10 sec: 1.02x faster
    • telco: 63.9 ms +- 3.1 ms -> 62.9 ms +- 3.2 ms: 1.02x faster
    • pidigits: 965 ms +- 14 ms -> 955 ms +- 20 ms: 1.01x faster

    Benchmark hidden because not significant (28): call_method_unknown, call_simple, dulwich_log, float, genshi_text, hexiom, html5lib, json_dumps, json_loads, logging_format, logging_silent, logging_simple, pathlib, pickle, pickle_list, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_integrate, sympy_str, sympy_sum, tornado_http, unpickle_list, xml_etree_process

    @ned-deily

    Where do we stand on this issue? At the moment, 3.6.0 is going to be released without these fixes.

    @serhiy-storchaka

    This issue is severe, but I don't consider it release-critical for 3.6.0. The patch fixes segfaults, but it can add unneeded overhead, and dict performance is critical for the Python core. The segfaults are not new. I'm trying to minimize the overhead of the changes, but this shouldn't be done in haste.

    @serhiy-storchaka

    New changeset 753bca3 by Serhiy Storchaka in branch 'master':
    bpo-27945: Fixed various segfaults with dict. (bpo-1657)
    753bca3

    @serhiy-storchaka

    New changeset 564398a by Serhiy Storchaka in branch '3.6':
    [3.6] bpo-27945: Fixed various segfaults with dict. (GH-1657) (bpo-1677)
    564398a

    @serhiy-storchaka

    New changeset 2f7f533 by Serhiy Storchaka in branch '3.5':
    [3.5] bpo-27945: Fixed various segfaults with dict. (GH-1657) (bpo-1678)
    2f7f533

    @serhiy-storchaka

    New changeset e6a0b59 by Serhiy Storchaka in branch '2.7':
    [2.7] bpo-27945: Fixed various segfaults with dict. (GH-1657) (bpo-1681)
    e6a0b59

    @serhiy-storchaka

    Thank you for your patches, Duane and Tim. And thank you, tehybel, for the very detailed report that allowed writing these patches.

    I excluded the changes to dict.fromkeys() since they look unnecessary for this issue after fixing insertdict(). There are other reasons for adding those changes (they make the behavior with exact dicts and sets and their subclasses more consistent), but that is another issue. There are also reasons for not adding them.

    We also need to check whether the set implementation is vulnerable to similar issues.

    @ned-deily

    Since Serhiy created backport PRs for 3.4 and 3.3, I'm reopening the issue and marking it as Release Blocker (for those releases) so we don't lose track of them and agree they meet the criteria for security-fix-only releases.

    @serhiy-storchaka

    For the context see bpo-30484. If the segfault is caused by garbage collection, this is perhaps not an exploitable vulnerability, but a severe bug that can affect any multithreaded application and doesn't have a workaround.

    @larryhastings

    New changeset f734479 by larryhastings (Serhiy Storchaka) in branch '3.4':
    [3.4] [3.5] bpo-27945: Fixed various segfaults with dict. (GH-1657) (GH-1678) (bpo-2248)
    f734479

    @ned-deily

    New changeset 8fbdab5 by Ned Deily (Serhiy Storchaka) in branch '3.3':
    [3.3] [3.5] bpo-27945: Fixed various segfaults with dict. (GH-1657) (GH-1678) (bpo-2396)
    8fbdab5

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022