SIGSEGV using json.tool: highly nested OrderedDict #69582

Closed
nagisa mannequin opened this issue Oct 13, 2015 · 15 comments
Assignee: serhiy-storchaka
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), type-crash (A hard crash of the interpreter, possibly with a core dump)

Comments

@nagisa (Mannequin) commented Oct 13, 2015

BPO 25395
Nosy @rhettinger, @vstinner, @ericsnowcurrently, @vadmium, @serhiy-storchaka
Files
  • ast.json
  • odict-trashcan.patch
  • odict-trashcan.v2.patch
  • odict-trashcan.v3.patch
  • odict-trashcan.v4.patch
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-11-01.14:16:12.034>
    created_at = <Date 2015-10-13.19:39:41.723>
    labels = ['interpreter-core', 'type-crash']
    title = 'SIGSEGV using json.tool: highly nested OrderedDict'
    updated_at = <Date 2015-11-01.14:16:12.033>
    user = 'https://bugs.python.org/nagisa'

    bugs.python.org fields:

    activity = <Date 2015-11-01.14:16:12.033>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-11-01.14:16:12.034>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2015-10-13.19:39:41.723>
    creator = 'nagisa'
    dependencies = []
    files = ['40773', '40804', '40807', '40808', '40822']
    hgrepos = []
    issue_num = 25395
    keywords = ['patch']
    message_count = 15.0
    messages = ['252954', '252959', '252960', '252961', '252998', '253143', '253145', '253148', '253150', '253151', '253220', '253237', '253652', '253845', '253846']
    nosy_count = 7.0
    nosy_names = ['rhettinger', 'vstinner', 'python-dev', 'eric.snow', 'martin.panter', 'serhiy.storchaka', 'nagisa']
    pr_nums = []
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue25395'
    versions = ['Python 3.5', 'Python 3.6']

    @nagisa (Mannequin, author) commented Oct 13, 2015

    cat attachment | python -m json.tool

    reliably makes Python crash with SIGSEGV on Arch Linux.

    $ python --version
    Python 3.5.0
    $ uname -a
    Linux kumabox 4.2.2-1-ARCH #1 SMP PREEMPT Tue Sep 29 22:21:33 CEST 2015 x86_64 GNU/Linux

    Does not fail on 2.7.10.

    @nagisa (Mannequin) added the "stdlib (Python modules in the Lib dir)" and "type-crash (A hard crash of the interpreter, possibly with a core dump)" labels on Oct 13, 2015
    @serhiy-storchaka (Member) commented:

    Doesn't crash on 3.4.3; crashes on 3.5.0. The crash is reproduced with the pure-Python implementation of json, so it is not related to json itself.

    @vstinner (Member) commented:

    Can you post a GDB traceback?

    @serhiy-storchaka (Member) commented:

    3.6.0a0 (de982d8b7b15)

    #0  0x080e62a1 in _PyTrash_thread_destroy_chain () at Objects/object.c:2010
    #1  0x080bf8b3 in frame_dealloc (f=f@entry=0x83f06e4) at Objects/frameobject.c:462
    #2  0x080e3eef in _Py_Dealloc (op=op@entry=0x83f06e4) at Objects/object.c:1786
    #3  0x081909c1 in fast_function (func=func@entry=0xb77f4154, pp_stack=pp_stack@entry=0xbfffe6ac, n=n@entry=0, na=na@entry=0, nk=nk@entry=0) at Python/ceval.c:4752
    #4  0x0819102a in call_function (pp_stack=pp_stack@entry=0xbfffe6ac, oparg=oparg@entry=0) at Python/ceval.c:4677
    #5  0x0818e1f7 in PyEval_EvalFrameEx (f=f@entry=0xb795f034, throwflag=throwflag@entry=0) at Python/ceval.c:3181
    #6  0x081906e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0xb77f00f8, globals=globals@entry=0xb798f734, locals=locals@entry=0xb798f734, args=args@entry=0x0, 
        argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, kwdefs=kwdefs@entry=0x0, 
        closure=closure@entry=0x0, name=name@entry=0x0, qualname=qualname@entry=0x0) at Python/ceval.c:3962
    #7  0x0819080a in PyEval_EvalCodeEx (_co=_co@entry=0xb77f00f8, globals=globals@entry=0xb798f734, locals=locals@entry=0xb798f734, args=args@entry=0x0, 
        argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, kwdefs=kwdefs@entry=0x0, 
        closure=closure@entry=0x0) at Python/ceval.c:3983
    #8  0x0819086d in PyEval_EvalCode (co=co@entry=0xb77f00f8, globals=globals@entry=0xb798f734, locals=locals@entry=0xb798f734) at Python/ceval.c:777
    #9  0x08180503 in builtin_exec_impl (module=module@entry=0xb7a1e534, source=0xb77f00f8, globals=0xb798f734, locals=0xb798f734) at Python/bltinmodule.c:942
    #10 0x08180688 in builtin_exec (module=module@entry=0xb7a1e534, args=args@entry=0xb78b784c) at Python/clinic/bltinmodule.c.h:275
    #11 0x080e025c in PyCFunction_Call (func=func@entry=0xb7a1ecb4, args=args@entry=0xb78b784c, kwds=kwds@entry=0x0) at Objects/methodobject.c:109
    #12 0x08190f18 in call_function (pp_stack=pp_stack@entry=0xbfffe97c, oparg=oparg@entry=2) at Python/ceval.c:4651
    #13 0x0818e1f7 in PyEval_EvalFrameEx (f=f@entry=0xb79d4cac, throwflag=throwflag@entry=0) at Python/ceval.c:3181
    #14 0x081906e7 in _PyEval_EvalCodeWithName (_co=0xb78b9028, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0xb7a00e84, argcount=argcount@entry=5, 
        kws=kws@entry=0xb7a00e98, kwcount=kwcount@entry=0, defs=defs@entry=0xb78ba288, defcount=5, kwdefs=kwdefs@entry=0x0, closure=closure@entry=0x0, 
        name=name@entry=0xb78b83e8, qualname=qualname@entry=0xb78b83e8) at Python/ceval.c:3962
    #15 0x08190a52 in fast_function (func=func@entry=0xb7925994, pp_stack=pp_stack@entry=0xbfffeb3c, n=n@entry=5, na=na@entry=5, nk=nk@entry=0) at Python/ceval.c:4760
    #16 0x0819102a in call_function (pp_stack=pp_stack@entry=0xbfffeb3c, oparg=oparg@entry=5) at Python/ceval.c:4677
    #17 0x0818e1f7 in PyEval_EvalFrameEx (f=f@entry=0xb7a00d1c, throwflag=throwflag@entry=0) at Python/ceval.c:3181
    #18 0x081906e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0xb78b9298, globals=globals@entry=0xb78b17dc, locals=locals@entry=0x0, args=args@entry=0xb79b83c8, 
        argcount=argcount@entry=2, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0xb78b7eb8, defcount=defcount@entry=1, kwdefs=kwdefs@entry=0x0, 
        closure=closure@entry=0x0, name=name@entry=0x0, qualname=qualname@entry=0x0) at Python/ceval.c:3962
    #19 0x0819080a in PyEval_EvalCodeEx (_co=0xb78b9298, globals=globals@entry=0xb78b17dc, locals=locals@entry=0x0, args=args@entry=0xb79b83c8, argcount=2, 
        kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0xb78b7eb8, defcount=defcount@entry=1, kwdefs=0x0, closure=0x0) at Python/ceval.c:3983
    #20 0x0820be82 in function_call (func=0xb7841454, arg=0xb79b83b4, kw=0x0) at Objects/funcobject.c:632
    #21 0x080939ce in PyObject_Call (func=func@entry=0xb7841454, arg=arg@entry=0xb79b83b4, kw=kw@entry=0x0) at Objects/abstract.c:2147
    #22 0x0807614a in RunModule (modname=modname@entry=0x8330308 L"json.tool", set_argv0=set_argv0@entry=1) at Modules/main.c:208
    #23 0x08076f56 in Py_Main (argc=argc@entry=5, argv=argv@entry=0x832f010) at Modules/main.c:709
    #24 0x0805edb4 in main (argc=5, argv=0xbfffee84) at ./Programs/python.c:69

    @serhiy-storchaka (Member) commented:

    This issue is likely related to the C implementation of OrderedDict: json.tool doesn't crash with --sort-keys, and it doesn't crash with the Python implementation of OrderedDict.

    The patch for the similar bpo-25406 doesn't fix this issue.

    @vadmium (Member) commented Oct 18, 2015

    The following simplified code produces the crash:

    from collections import OrderedDict

    obj = []
    for _ in range(33):            # 33 nested OrderedDicts ...
        obj = OrderedDict(((None, obj),))
    for _ in range(17):            # ... wrapped in 17 nested lists
        obj = [obj]
    print("Still alive, crash happens at interpreter finalization")

    This crashes at the same line as in Serhiy’s backtrace. In _PyTrash_thread_destroy_chain(), Py_TYPE(op) is a bad pointer (0xdbdbdbdbdbdbdbdb); I have enabled --with-pydebug.
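
    For context, a rough paraphrase of the trashcan macros in this era of CPython (Include/object.h in 3.5; a simplified sketch, not the exact source): once deallocations nest more than PyTrash_UNWIND_LEVEL deep, the deallocator body is skipped and the object is parked on a per-thread delete-later list, and when the outermost deallocation unwinds, _PyTrash_thread_destroy_chain() walks that list and calls Py_TYPE(op)->tp_dealloc(op) on each parked object, which is where the bad pointer shows up.

        /* Rough paraphrase of Python 3.5's trashcan macros; simplified sketch,
         * not the exact source. */
        #define PyTrash_UNWIND_LEVEL 50

        #define Py_TRASHCAN_SAFE_BEGIN(op) \
            do { \
                PyThreadState *_tstate = PyThreadState_GET(); \
                if (_tstate->trash_delete_nesting < PyTrash_UNWIND_LEVEL) { \
                    ++_tstate->trash_delete_nesting;
                    /* ... the body of the deallocator runs here ... */
        #define Py_TRASHCAN_SAFE_END(op) \
                    --_tstate->trash_delete_nesting; \
                    if (_tstate->trash_delete_later && _tstate->trash_delete_nesting <= 0) \
                        /* later calls Py_TYPE(op)->tp_dealloc on each parked object */ \
                        _PyTrash_thread_destroy_chain(); \
                } \
                else \
                    /* nested too deep: skip the body and park op for later */ \
                    _PyTrash_thread_deposit_object((PyObject *)op); \
            } while (0);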

    @vadmium added the "interpreter-core (Objects, Python, Grammar, and Parser dirs)" label and removed the "stdlib (Python modules in the Lib dir)" label on Oct 18, 2015
    @vadmium changed the title from "SIGSEGV using json.tool" to "SIGSEGV using json.tool: highly nested OrderedDict" on Oct 18, 2015
    @vadmium (Member) commented Oct 18, 2015

    Here is a patch that seems to fix the problem for me. Can someone who knows more about tp_dealloc() methods and the Py_TRASHCAN_SAFE stuff have a look? All I know is that the body of the trashcan code can be deferred until an outer nested body has finished, so I thought it might be best to defer that PyDict_Type.tp_dealloc() call as well.

    Is it worth packaging up my simplified code in a test case, perhaps in an assert_python_ok() child process?

    @serhiy-storchaka (Member) commented:

    I think there is no need to run the test in a subprocess. You can trigger the crash by "del obj".

    from collections import OrderedDict
    from test import support    # for support.gc_collect()

    obj = None
    for _ in range(1000):
        obj = OrderedDict([(None, obj)])

    del obj
    support.gc_collect()

    Unfortunately, this test crashes with the patch.

    Fatal Python error: Objects/odictobject.c:1443 object at 0xb749ddfc has negative ref count -1

    Current thread 0xb7580700 (most recent call first):
    File "<stdin>", line 1 in <module>
    Aborted (core dumped)

    @vadmium (Member) commented Oct 18, 2015

    Here is a new patch that uses Py_CLEAR() rather than Py_XDECREF(), which seems to cure the negative reference count problem. The change was only a guess based on looking at namespaceobject.c and the Py_CLEAR() documentation, so it would be good for someone else to review it carefully.

    @serhiy-storchaka (Member) commented:

    Using Py_CLEAR() fixes the symptoms in this test, but the real issue is that the deallocation code is executed twice for some objects (for every 25th OrderedDict).

    PyDict_Type.tp_dealloc() can deposit an object in the _PyTrash_delete_later list if trash_delete_nesting exceeds PyTrash_UNWIND_LEVEL. Later, Py_TYPE(op)->tp_dealloc is called for the objects in the _PyTrash_delete_later list, but that is odict_dealloc, which has already been called.

    One way to avoid the repeated deallocation is to change the type of the object before calling PyDict_Type.tp_dealloc(). This looks like a hack, but a correct one. Maybe there is a more straightforward solution.
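
    A minimal C sketch of the type-swap hack being described, roughly the approach discussed for odict-trashcan.v3.patch (an illustration, not the committed fix; the odict-specific cleanup is elided):

        static void
        odict_dealloc(PyODictObject *self)
        {
            PyObject_GC_UnTrack(self);

            /* ... release odict-specific state (nodes, instance dict, weakrefs) ... */

            /* Pretend to be a plain dict: if PyDict_Type.tp_dealloc() parks the
             * object on the trashcan's delete-later list, the trashcan will later
             * call dict_dealloc() instead of re-entering odict_dealloc().
             * (Py_TYPE(op) is assignable in this era of CPython.) */
            Py_TYPE(self) = &PyDict_Type;
            PyDict_Type.tp_dealloc((PyObject *)self);
        }

    As noted in the following comments, faking the type this way has its own problem: the base deallocator may then treat the object as an ordinary dict, for example by putting it on the dict free list.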

    @ericsnowcurrently (Member) commented:

    Thanks for solving this!

    odict-trashcan.v3.patch LGTM

    @serhiy-storchaka (Member) commented:

    There is a problem with odict-trashcan.v3.patch. PyDict_Type.tp_dealloc() can put the object on the free list if its type is dict. Since odict_dealloc() fakes the object's type, it allows a deallocated OrderedDict to be reused later as a dict. But an OrderedDict object is larger than a plain dict, so this creates a leak.

    odict-trashcan.v2.patch is not good either; I found a new test that crashes with it.

    Here is a patch that solves all known issues with OrderedDict deallocation. Please correct my wording in the comment if needed. A similar bad scenario is described in the comment at the end of subtype_dealloc() in Objects/typeobject.c, but this patch uses a different method to solve it.

    @vadmium (Member) commented Oct 29, 2015

    Left a comment about a minor English grammar problem.

    The existing comment Serhiy mentioned was added way back in 2003 for bpo-668433. It appears to use the same underlying technique, reverting the nesting level before calling the base class dealloc. One paragraph talks about the extra increment used there compared to Serhiy’s method:

    '''
    But now it's possible that a chain of objects consisting solely of objects whose deallocator is subtype_dealloc() will defeat the trashcan mechanism completely: the decremented level means that the effective level never reaches the limit. Therefore, we *increment* the level *before* entering the trashcan block, and matchingly decrement it after leaving. This means the trashcan code will trigger a little early, but that's no big deal.
    '''

    I think we may not have to worry about this for ordered dict, assuming that PyDict_Type.tp_dealloc() can only recursively invoke odict_dealloc() from within its own trashcan block.

    So as far as I can tell, this patch is good to apply, unless someone with more knowledge of garbage collection has any comments.
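
    A minimal sketch of the nesting-level technique under discussion, i.e. reverting the trashcan nesting level around the base-class dealloc call (field names assume Python 3.5's per-thread trashcan state; an illustration of the idea, not the verbatim committed patch):

        static void
        odict_dealloc(PyODictObject *self)
        {
            PyThreadState *tstate = PyThreadState_GET();

            PyObject_GC_UnTrack(self);
            Py_TRASHCAN_SAFE_BEGIN(self)

            /* ... release odict-specific state ... */

            /* Step outside our own trashcan level so the nested trashcan inside
             * PyDict_Type.tp_dealloc() cannot reach PyTrash_UNWIND_LEVEL and park
             * the half-destroyed object on the delete-later list, which is what
             * made odict_dealloc() run a second time. */
            --tstate->trash_delete_nesting;
            PyDict_Type.tp_dealloc((PyObject *)self);
            ++tstate->trash_delete_nesting;

            Py_TRASHCAN_SAFE_END(self)
        }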

    @serhiy-storchaka self-assigned this on Nov 1, 2015
    @python-dev (Mannequin) commented Nov 1, 2015

    New changeset ce8c850a52d4 by Serhiy Storchaka in branch '3.5':
    Issue bpo-25395: Fixed crash when highly nested OrderedDict structures were
    https://hg.python.org/cpython/rev/ce8c850a52d4

    New changeset bd6bfa5fe203 by Serhiy Storchaka in branch 'default':
    Issue bpo-25395: Fixed crash when highly nested OrderedDict structures were
    https://hg.python.org/cpython/rev/bd6bfa5fe203

    @serhiy-storchaka (Member) commented:

    Thank you, Martin, for your review and for the initial investigation.

    @ezio-melotti transferred this issue from another repository on Apr 10, 2022