Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segfault due to null pointer in tuple #70998

Closed
random832 mannequin opened this issue Apr 21, 2016 · 16 comments
Closed

segfault due to null pointer in tuple #70998

random832 mannequin opened this issue Apr 21, 2016 · 16 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@random832
Copy link
Mannequin

random832 mannequin commented Apr 21, 2016

BPO 26811
Nosy @brettcannon, @rhettinger, @ncoghlan, @vstinner, @larryhastings, @ericsnowcurrently, @serhiy-storchaka, @zhangyangyu
PRs
  • bpo-30156: Fix a crash in debug build when PYTHONDUMPREFS is set. #1272
  • Files
  • fnt.py: Script to find the bad tuple and try to access it
  • property_descr_get_gc_untrack.patch
  • property_cached_args_set_size_0.patch
  • property_descr_get_gc_untrack_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-05-05.13:22:36.740>
    created_at = <Date 2016-04-21.00:20:15.665>
    labels = ['interpreter-core', 'type-crash']
    title = 'segfault due to null pointer in tuple'
    updated_at = <Date 2017-04-24.18:46:48.431>
    user = 'https://bugs.python.org/random832'

    bugs.python.org fields:

    activity = <Date 2017-04-24.18:46:48.431>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-05-05.13:22:36.740>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2016-04-21.00:20:15.665>
    creator = 'random832'
    dependencies = []
    files = ['42545', '42546', '42547', '42577']
    hgrepos = []
    issue_num = 26811
    keywords = ['patch']
    message_count = 16.0
    messages = ['263866', '263882', '263885', '263886', '263887', '263888', '263889', '263892', '263893', '263894', '263895', '263900', '263903', '263904', '264060', '264846']
    nosy_count = 10.0
    nosy_names = ['brett.cannon', 'rhettinger', 'ncoghlan', 'vstinner', 'larry', 'python-dev', 'eric.snow', 'serhiy.storchaka', 'random832', 'xiang.zhang']
    pr_nums = ['1272']
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue26811'
    versions = ['Python 3.6']

    @random832
    Copy link
    Mannequin Author

    random832 mannequin commented Apr 21, 2016

    I was writing something that iterates over all objects in gc.get_objects(). I don't know precisely where the offending tuple is coming from, but nearby objects in gc.get_objects() appear to be related to frozen_importlib.

    A tuple exists somewhere with a null pointer in it. Attempting to iterate over it or access its item results in a segmentation fault.

    I confirmed this on both Linux and OSX.

    Python 3.6.0a0 (default:778ccbe3cf74, Apr 20 2016, 20:17:38)

    @random832 random832 mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Apr 21, 2016
    @serhiy-storchaka serhiy-storchaka added the type-crash A hard crash of the interpreter, possibly with a core dump label Apr 21, 2016
    @zhangyangyu
    Copy link
    Member

    hg bisect tells me changeset 95830:661cdbd617b8 introduces this behaviour.

    @zhangyangyu zhangyangyu removed the type-crash A hard crash of the interpreter, possibly with a core dump label Apr 21, 2016
    @rhettinger rhettinger self-assigned this Apr 21, 2016
    @rhettinger
    Copy link
    Contributor

    Also, see https://hg.python.org/cpython/rev/5dbf3d932a59

    Serhiy, I'll thinking the whole optimization ought to be reverted as being too tricky and error-prone for the small benefit. A less aggressive approach would be to have the tuple save a valid object (such as None) instead of NULL. What do you think?

    @serhiy-storchaka
    Copy link
    Member

    661cdbd617b8 can cause problems with recursion and with cached getters. I suggest to revert it.

    @rhettinger
    Copy link
    Contributor

    I concur. Please do the honors and revert it from 3.5 and 3.6.

    @serhiy-storchaka
    Copy link
    Member

    Ah, yes, 5dbf3d932a59 partially fixed this optimization. Saving None in a tuple (as I believe done in itertools/zip/enumerate optimization) should fix this issue. I agree that the code is too tricky (as well as the code in itertools/zip/enumerate). It would be better to make tuple creation faster instead of adding special caches in separate functions (see also bpo-23507). But for now this optimization makes attribute access in named tuples 30% faster, and this is good argument for keeping it.

    @serhiy-storchaka
    Copy link
    Member

    Sorry, I'm late with my comments. All my posts are written without seeing your previous message.

    @vstinner
    Copy link
    Member

    Serhiy: "Ah, yes, 5dbf3d932a59 partially fixed this optimization."

    Since we already talking about hacks, another hack is to hide this private tuple from the GC using _PyObject_GC_UNTRACK(): see attached property_descr_get_gc_untrack.patch.

    Note: I dislike my own patch :-D

    Serhiy: "661cdbd617b8 can cause problems with recursion and with cached getters. I suggest to revert it."

    Even if I don't understand the issue, I concur that the micro-optimization must be reverted. A micro-optimization must not introduce any behaviour change nor segfault :-)

    Raymond Hettinger: "Serhiy, I'll thinking the whole optimization ought to be reverted as being too tricky and error-prone for the small benefit. A less aggressive approach would be to have the tuple save a valid object (such as None) instead of NULL. What do you think?"

    PyEval_EvalCodeEx() accepts a C array of PyObject* for function arguments, it's used by fast_function() in Pythn/ceval.c. Maybe a new C function can be designed to call a function: it would uses PyEval_EvalCodeEx() to call functions implemented in Python, or create the tuple for functions implemented in C.

    It looks like the issue bpo-23910 which introduced the optimization was created for functions implemented in C. So I don't know it's acceptable. But it would be safer and more generic, no?

    @vstinner
    Copy link
    Member

    Maybe a new C function can be designed to call a function: it would uses PyEval_EvalCodeEx() to call functions implemented in Python, or create the tuple for functions implemented in C.

    I would expect C code to be more optimized that Python functions, but for the specific case of function calls: Python are more optimized with the fast-path fast_function()!

    I recall vaguely that Larry Hasting shared me his idea of passing the Python stack to C functions to avoid the creation of a tuple.

    Maybe we can add a new API for C functions accepted a C array of PyObject*, a number of positional arguments and a number of (key, value) pairs for keyword arguments. Something similar to PyEval_EvalCodeEx() but for C functions. It means that we should add a new flavor of PyArg_ParseTuple/PyArg_ParseTupleAndKeywords to accepts this format.

    Larry's idea was to use the argument clinic to handle that for us.

    @serhiy-storchaka
    Copy link
    Member

    References:

    bpo-23910 -- added original optimization (661cdbd617b8).
    bpo-24276 -- it was significant rewritten (5dbf3d932a59).

    My suggestion in msg263886 to revert the optimization was related to the original code. Now, after reminding the background, I think it is worth to keep the optimization and try to fix the code.

    Making the cached tuple to save None adds additional complexity and overhead (about 5%). Here is a patch that uses different trick. The size of saved tuple is set to 0.

    @serhiy-storchaka serhiy-storchaka added the type-crash A hard crash of the interpreter, possibly with a core dump label Apr 21, 2016
    @vstinner
    Copy link
    Member

    I suggest to remove the micro-optimization from Python 3.5 for safety.

    I'm ok to experiment a new safer implementation on Python 3.6 ;-) We have more time to fix the code in Python 3.6 if new issues are found. Setting the tuple size to zero looks simple and safe, but the overall hack deserves a comment to explain:

    • why you use a cached tuple
    • why the reference count can be different than 2: recursive calls
    • why do you change the tuple size

    @vstinner
    Copy link
    Member

    I created the issue bpo-26814: "Add a new _PyObject_CallStack() function which avoids the creation of a tuple or dict for arguments".

    @serhiy-storchaka
    Copy link
    Member

    I suggest to remove the micro-optimization from Python 3.5 for safety.

    Then we should remove the promise from What's New In Python 3.5. That will cause a regression in namedtuple attributes access about 30%.

    @vstinner
    Copy link
    Member

    Oh, I see "The property() getter calls are up to 25% faster.
    (Contributed by Joe Jevnik in bpo-23910.)" in
    https://docs.python.org/dev/whatsnew/3.5.html#optimizations

    Hum... This is embarrassing :-/ Ok, let's keep it, but we must fix it ;-)

    @serhiy-storchaka
    Copy link
    Member

    May be using _PyObject_GC_UNTRACK() is more correct than setting the size to 0 or setting the item to None. But property_descr_get_gc_untrack.patch makes the code a little slower. Here is optimized version of Victor's patch.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 4, 2016

    New changeset a98ef122d73d by Serhiy Storchaka in branch '3.5':
    Issue bpo-26811: gc.get_objects() no longer contains a broken tuple with NULL
    https://hg.python.org/cpython/rev/a98ef122d73d

    New changeset 3fe1c7ad3b58 by Serhiy Storchaka in branch 'default':
    Issue bpo-26811: gc.get_objects() no longer contains a broken tuple with NULL
    https://hg.python.org/cpython/rev/3fe1c7ad3b58

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants