Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking all tp_dealloc with Py_TPFLAGS_HAVE_GC #75278

Closed
methane opened this issue Aug 1, 2017 · 27 comments
Closed

Checking all tp_dealloc with Py_TPFLAGS_HAVE_GC #75278

methane opened this issue Aug 1, 2017 · 27 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@methane
Copy link
Member

methane commented Aug 1, 2017

BPO 31095
Nosy @vstinner, @larryhastings, @methane, @berkerpeksag, @serhiy-storchaka, @thehesiod
PRs
  • bpo-31095: fix potential crash during GC. #2974
  • [3.6] bpo-31095: fix potential crash during GC #3195
  • [3.5] bpo-31095: fix potential crash during GC #3196
  • [2.7] bpo-31095: Fix potential crash during GC #3197
  • Files
  • lru_cache_segv.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-28.22:47:25.467>
    created_at = <Date 2017-08-01.05:49:49.875>
    labels = ['extension-modules', 'interpreter-core', '3.7', 'type-crash']
    title = 'Checking all tp_dealloc with Py_TPFLAGS_HAVE_GC'
    updated_at = <Date 2017-10-28.22:47:25.465>
    user = 'https://github.com/methane'

    bugs.python.org fields:

    activity = <Date 2017-10-28.22:47:25.465>
    actor = 'berker.peksag'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-28.22:47:25.467>
    closer = 'berker.peksag'
    components = ['Extension Modules', 'Interpreter Core']
    creation = <Date 2017-08-01.05:49:49.875>
    creator = 'methane'
    dependencies = []
    files = ['47055']
    hgrepos = []
    issue_num = 31095
    keywords = []
    message_count = 27.0
    messages = ['299600', '299601', '299604', '299605', '299608', '299611', '299612', '299613', '299614', '299615', '300621', '300668', '300669', '300683', '300712', '300714', '300715', '300728', '300772', '300776', '300778', '301203', '301204', '301208', '303080', '303113', '305169']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'larry', 'methane', 'berker.peksag', 'serhiy.storchaka', 'thehesiod']
    pr_nums = ['2974', '3195', '3196', '3197']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue31095'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    like #47215, most types with Py_TPFLAGS_HAVE_GC
    should call PyObject_GC_UnTrack() at top of the tp_dealloc.

    For example, I found lru_cache doesn't call it and I can produce
    segmentation fault.

    I'll check other types too.

    @methane methane added 3.7 (EOL) end of life extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Aug 1, 2017
    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Aug 1, 2017

    should the base method which calls tp_dealloc do this? Maybe can kill all birds with one stone.

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    # collection module

    dequeiter_dealloc doesn't call Untrack(), but it's safe because it only frees deque
    and deque_dealloc calls Untrack()

            dequeiter_dealloc(dequeiterobject *dio)
            {
                Py_XDECREF(dio->deque);

    defdict_dealloc doesn't call Untrack(). And it can cause segfault:

            from collections import defaultdict
            import gc
    
            class Evil:
                def __del__(self):
                    gc.collect()
                def __call__(self):
                    return 42
    
            def main():
                d = defaultdict(Evil())
    
            for i in range(1000):
                print(i)
                main()

    # _elementtree module

    elementiter_dealloc() calls UnTrack(), but it seems too late?

            static void
            elementiter_dealloc(ElementIterObject *it)
            {
                Py_ssize_t i = it->parent_stack_used;
                it->parent_stack_used = 0;
                while (i--)
                    Py_XDECREF(it->parent_stack[i].parent);
                PyMem_Free(it->parent_stack);
    
                Py_XDECREF(it->sought_tag);
                Py_XDECREF(it->root_element);
    
                PyObject_GC_UnTrack(it);
                PyObject_GC_Del(it);
            }

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    dequeiter_dealloc doesn't call Untrack(), but it's safe because it only frees deque
    and deque_dealloc calls Untrack()

    It may be not true, while I don't have exploit yet.

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    # _json module

    scanner_dealloc()
    encoder_dealloc()

    # _struct module

    unpackiter_dealloc

    # _ssl module

    context_dealloc()

    # Objects/

    setiter_dealloc
    dictiter_dealloc
    dictview_dealloc

    # Parser/

    ast_dealloc

    @serhiy-storchaka serhiy-storchaka added the type-crash A hard crash of the interpreter, possibly with a core dump label Aug 1, 2017
    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Aug 1, 2017

    I suggest any places that don't need the calls should have comments so that future reviewers know why.

    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Aug 1, 2017

    actually another idea: could the PR for this also update https://docs.python.org/2/c-api/typeobj.html#c.PyTypeObject.tp_dealloc to mention about these macros and when they should be used? That, along with all the other locations correctly calling these macros, and having comments when they're not needed hopefully should prevent this from happening again.

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    should the base method which calls tp_dealloc do this? Maybe can kill all birds with one stone.

    It may be possible, but unclear.
    Object finalization process is very complicated.

    I agree that UnTrack object as soon as refcnt=0, and Track only when
    resurrected seems clearer.
    But changing such basic object system needs carefully designed by expert.

    @methane
    Copy link
    Member Author

    methane commented Aug 1, 2017

    Docs/extending/newtypes.rst and Docs/include/noddy3.c should be updated too.
    But I'm not good English writer. I need help.

    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Aug 1, 2017

    omg I just realized I need the default dict one too, great investigation work!

    @vstinner
    Copy link
    Member

    "like #47215, most types with Py_TPFLAGS_HAVE_GC should call PyObject_GC_UnTrack() at top of the tp_dealloc."

    Hum, I wasn't aware of that. Writing correctly code for the Python garbage collector is very complex :-/

    Maybe it would help to have a short comment, maybe with a link to this issue, on each PyObject_GC_UnTrack().

    At the first read, I saw the newly added PyObject_GC_UnTrack() calls as duplicate, and so useless. For example, PyObject_Del() already untracks the object, so it doesn't seem to be needed to explicitly call PyObject_GC_UnTrack() "just before".

    @methane
    Copy link
    Member Author

    methane commented Aug 22, 2017

    Maybe it would help to have a short comment, maybe with a link to this issue, on each PyObject_GC_UnTrack().

    Is this looks good to you?

      /* UnTrack is needed before calling any callbacks (bpo-31095) */
      PyObject_GC_UnTrack(self);

    @methane
    Copy link
    Member Author

    methane commented Aug 22, 2017

    BTW, should this backported to Python 3.5?

    @vstinner
    Copy link
    Member

    /* UnTrack is needed before calling any callbacks (bpo-31095) */

    LGTM. I suggest /* bpo-31095: UnTrack is needed before calling any callbacks */ which is a little bit shorter, but it's up to you ;-)

    @larryhastings
    Copy link
    Contributor

    Victor, what do you think, does this need a 3.5 backport? I'm inclined to say yes.

    @vstinner
    Copy link
    Member

    Larry Hastings: "Victor, what do you think, does this need a 3.5 backport? I'm inclined to say yes."

    Naoki has to design an evil object which triggers explicitly the garbage collector to get a crash. He found the bug by reading the code. I don't remind anyone complaining about the bug. So I don't think that it's a major bug, as was bpo-21435 which was *easy* to trigger using asyncio.

    So no, I don't think that this issue desevers a backport.

    But it's just my opinion, feel free to backport to 3.5 if you consider that the bug is critical enough ;-)

    @larryhastings
    Copy link
    Contributor

    I thought crashing bugs were generally considered security bugs. With a reliable crashing bug exploiting a reasonable module (e.g. not ctypes) you can crash Python instances in hosting services. If those instances are shared with other users (e.g. Google App Engine) you can cause a temporary DOS. At least, that's the theory as I understood it...!

    @vstinner
    Copy link
    Member

    In my experience, it's not that hard to crash CPython (without ctypes) if you know internals or just if you look into Lib/test/crashers/ :-)

    I agree that it's a good practice to fix crashes when there is a simple known script to crash Python. The question is more where is the limit between security and bug fixes.

    To be honest, the change is very safe and is very short.

    @larry: It's up to you, you are the release manager :-)

    If we consider that the discussed bugs impact the security, I will ask for backports to 2.7, 3.3 and 3.4 as well.

    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Aug 24, 2017

    my vote is yes due to the defaultdict issue. We were hitting this in our prod env

    @methane
    Copy link
    Member Author

    methane commented Aug 24, 2017

    New changeset a6296d3 by INADA Naoki in branch 'master':
    bpo-31095: fix potential crash during GC (GH-2974)
    a6296d3

    @methane
    Copy link
    Member Author

    methane commented Aug 24, 2017

    I opened backport PR for 3.6, 2.7 and 3.5.

    @methane
    Copy link
    Member Author

    methane commented Sep 4, 2017

    New changeset 2eea952 by INADA Naoki in branch '3.6':
    bpo-31095: fix potential crash during GC (GH-3195)
    2eea952

    @methane
    Copy link
    Member Author

    methane commented Sep 4, 2017

    New changeset 4cde4bd by INADA Naoki in branch '2.7':
    bpo-31095: Fix potential crash during GC (GH-3197)
    4cde4bd

    @vstinner
    Copy link
    Member

    vstinner commented Sep 4, 2017

    Thank you for fixes Naoki!

    @larryhastings
    Copy link
    Contributor

    New changeset 0fcc033 by larryhastings (INADA Naoki) in branch '3.5':
    bpo-31095: fix potential crash during GC (GH-2974) (bpo-3196)
    0fcc033

    @vstinner
    Copy link
    Member

    Should we backport the fix to Python 3.3 and 3.4 as well?

    I don't think so.

    @berkerpeksag
    Copy link
    Member

    Should we backport the fix to Python 3.3 and 3.4 as well?

    I don't think so.

    I agree with Victor. Closing this as all PRs have been merged. Thank you, all (especially for the documentation update!)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants