Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix various crashes exposed through mro() customization #66924

Closed
abusalimov mannequin opened this issue Oct 26, 2014 · 12 comments
Closed

Fix various crashes exposed through mro() customization #66924

abusalimov mannequin opened this issue Oct 26, 2014 · 12 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@abusalimov
Copy link
Mannequin

abusalimov mannequin commented Oct 26, 2014

BPO 22735
Nosy @malemburg, @tim-one, @pitrou, @benjaminp
Files
  • mro-crashers-fix-v1.diff: (whole patch) fix mro() issues, crasher tests included
  • 0001-minor-test_mro-test-case-stub-and-helpers.patch: [PATCH 01/15] (minor) test_mro: test case stub and helpers
  • 0002-test-crashers-NULL-old_mro-and-over-decref-on-reent.patch: [PATCH 02/15] (test) (crashers) NULL old_mro and over-decref on reent
  • 0003-minor-mro_internal-extract-mro_invoke-and-mro_check.patch: [PATCH 03/15] (minor) mro_internal: extract mro_invoke and mro_check
  • 0004-minor-type_set_bases-move-undoing-logic-to-the-end.patch: [PATCH 04/15] (minor) type_set_bases: move undoing logic to the end
  • 0005-minor-type_set_bases-extract-add_all_subclasses.patch: [PATCH 05/15] (minor) type_set_bases: extract add_all_subclasses
  • 0006-minor-mro_subclasses-loop-over-a-list-of-subclasses.patch: [PATCH 06/15] (minor) mro_subclasses: loop over a list of subclasses
  • 0007-fix-handle-tp_mro-overwritten-through-reentrancy.patch: [PATCH 07/15] (fix) handle tp_mro overwritten through reentrancy
  • 0008-test-crashers-tp_base-tp_subclasses-inherit-cycles.patch: [PATCH 08/15] (test) (crashers) tp_base/tp_subclasses inherit-cycles
  • 0009-minor-PyType_IsSubtype-extract-a-check-using-tp_base.patch: [PATCH 09/15] (minor) PyType_IsSubtype: extract a check using tp_base
  • 0010-fix-type_set_bases-inherit-cycles-and-reent-checks.patch: [PATCH 10/15] (fix) type_set_bases: inherit-cycles and reent checks
  • 0011-test-behavior-error-on-extending-an-incomplete-type.patch: [PATCH 11/15] (test) (behavior) error on extending an incomplete type
  • 0012-minor-mro_implementation-pmerge-refactory.patch: [PATCH 12/15] (minor) mro_implementation, pmerge: refactory
  • 0013-fix-mro_implementation-check-base-tp_mro-is-not-NULL.patch: [PATCH 13/15] (fix) mro_implementation: check base->tp_mro is not NULL
  • 0014-test-crasher-attr-lookup-on-super-with-uninitialized.patch: [PATCH 14/15] (test) (crasher) attr lookup on super with uninitialized type
  • 0015-fix-super_getattro-check-type-tp_mro-and-refactory.patch: [PATCH 15/15] (fix) super_getattro: check type->tp_mro and refactory
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/benjaminp'
    closed_at = <Date 2015-02-14.17:32:53.274>
    created_at = <Date 2014-10-26.21:06:21.602>
    labels = ['interpreter-core', 'type-crash']
    title = 'Fix various crashes exposed through mro() customization'
    updated_at = <Date 2015-02-14.17:32:53.273>
    user = 'https://bugs.python.org/abusalimov'

    bugs.python.org fields:

    activity = <Date 2015-02-14.17:32:53.273>
    actor = 'berker.peksag'
    assignee = 'benjamin.peterson'
    closed = True
    closed_date = <Date 2015-02-14.17:32:53.274>
    closer = 'berker.peksag'
    components = ['Interpreter Core']
    creation = <Date 2014-10-26.21:06:21.602>
    creator = 'abusalimov'
    dependencies = []
    files = ['37025', '37026', '37027', '37028', '37029', '37030', '37031', '37032', '37033', '37034', '37035', '37036', '37037', '37038', '37039', '37040']
    hgrepos = []
    issue_num = 22735
    keywords = ['patch']
    message_count = 12.0
    messages = ['230044', '230045', '230046', '230560', '230729', '230823', '232573', '232628', '235445', '235461', '235484', '235486']
    nosy_count = 6.0
    nosy_names = ['lemburg', 'tim.peters', 'pitrou', 'benjamin.peterson', 'python-dev', 'abusalimov']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue22735'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4', 'Python 3.5']

    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Oct 26, 2014

    The attached patch fixes several bugs revealed by providing a custom mro(). Most of these bugs are reentrancy issues (mainly, cls.__bases__ assignment within mro() which causes incorrect refcounting), but there are also some issues when using incomplete types with uninitialized MRO (dereferencing NULL when extending such types, attribute lookup on super, etc.). The patch is made against the default branch (py3k), however all these bugs exist in Python 2 as well.

    I also tried to break the patch into smaller pieces (commits, in fact) to ease reviewing: a common pattern is test->minor->fix series. The patch set is an output of git format-patch, and most patches have a detailed commit message describing a change inside.

    Adding memory mgmt and object model guys to nosy list.

    @abusalimov abusalimov mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Oct 26, 2014
    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Oct 26, 2014

    This is a patch with most significant changes, please review it carefully.

    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Oct 26, 2014

    Just in case, the previous message about reviewing is about [PATCH 07/15] (fix) handle tp_mro overwritten through reentrancy

    @benjaminp
    Copy link
    Contributor

    Thank you for the patchset. I wonder if we should just forbid most reentrancy i.e. set a flag on the type when it's being constructed and not let type_set_bases be called then. Setting __bases__ from within mro() doesn't seem very useful outside of producing pathologies.

    @pitrou
    Copy link
    Member

    pitrou commented Nov 6, 2014

    I think forbidding reentrancy would indeed be a good idea.

    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Nov 7, 2014

    Thank you for your replies.

    I don't think forbidding reentrancy is a good idea, and I'm against it for the following reasons.

    First of all, naive prohibition of type_set_bases within mro() on a class doesn't save from mro() reentrancy:

        def mro(cls):
            # cls is B,
            # B extends A
        A.__bases__ = ...  # What to do?
    

    Changing bases of a class provokes MRO of all its subclasses to be updated as well. Therefore, one would also need to forbid changing __bases__ of any direct or indirect base of the class up to the 'object', and this doesn't feel natural at all. Why should mro() of some specialized class B prevent changing bases of its general parent class A?

    Otherwise, there must be a check in mro_subclasses (mro_hierarchy in the patch), which will not recurse into a class flagged as being inside mro(). But in this case a running mro(B) will be unable to notice a change of MRO of the parent class A, in case if mro() involves some non-trivial logic, and A.__bases__ assignment occurs somewhere deep in the call stack and is done by a code from some unknown black-box module. Deferring mro() recalculation upon exiting from the outermost mro() in such case doesn't seem to be a good solution either, especially concerning its (most likely) tricky implementation.

        def mro(cls):
            # some complicated calculation based of MROs of bases:
            parent_mros = [base.__mro__ for base in cls.__bases__]
    
            # The line below does this:
            #   parent.__bases__ = ...
            # (deep-deep-deep in the call stack)
            third_party.do_mro(cls)
        # Either the line above raises an error,
        # or parent_mros becomes invalid.
    

    Another example. Setting __bases__ from inside mro() may be useful for changing class behavior on the fly using a proxy base class (e.g. for debugging, logging, testing frameworks). The only place to do this except mro() is __new__, but in that case it is only possible to fix up bases at the moment of class creation. That is, a tricky class that change its __bases__ would break such framework by overriding a proxy base, which is necessary for this framework to function properly.

        def mro(cls):
            if cls or one of cls.__bases__ is not a proxy class:
                class Proxy(*cls.__bases__):
                    ...
                cls.__bases__ = (Proxy,)  # reenter
    
            return type.mro(cls)

    In other words, there should be a loophole to alter __bases__ upon assignment, and I suppose mro() is a good option.

    Also, __bases__ assignment is known as the only reliable way to invoke MRO recalculation (http://stackoverflow.com/a/20832588/545027). Forbidding it from the inside mro() makes it not so obvious and reliable.

    Actually, I encountered one of these bugs while working on a real metaclass providing an auto-inheriting attributes feature needed for a custom DSL built on top of Python. In a nutshell, if B extends A, then make B.V extend A.V automatically (https://github.com/abusalimov/mybuild/blob/6c7c89521b856c798b46732501adb5e06dec7e03/util/inherit.py, still work in progress).

    I know, some of these use cases may sound a bit unreal (or even crazy), but neither me nor you can imagine all scenarios, that Python programmers could ever invent. Actually, there could already exist some: it is possible to workaround most of reentrancy issues through holding a reference to old_mro from inside a Python code. That is, backward compatibility comes into play too.

    Finally, why do forbid something that was not prohibited before? I think of it as a feature with some issues to fix, not to remove at all. After all, there is a fix provided, it is more or less straightforward (I hope so), with tests showing a desired behavior. The desired behavior is also more or less obvious: take a __mro__ calculated by mro() invoked due to the very last __bases__ assignment, regardless reentrancy (the innermost one in such case).

    Summarizing, that is why I believe that reentrancy should not be forbidden. Furthermore, I considered that way, and I'm pretty sure it is a wrong one. It implies too many corner cases, it has a non-obvious behavior from the point of view of a Python code, and a possible implementation doesn't seem to be more simple or robust than it is now.

    I would be happy to hear opposite arguments, and if I convinced you, get a feedback on the patch, of course. I could also prepare a backport patch fixing 2.7 line as well.

    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Dec 12, 2014

    ping?

    @benjaminp
    Copy link
    Contributor

    I will try to look eventually.

    @benjaminp benjaminp self-assigned this Dec 14, 2014
    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Feb 5, 2015

    I feel a bit uneasy, but... any news on this?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 6, 2015

    New changeset 6384c0cd3b2d by Benjamin Peterson in branch '3.4':
    fix many custom mro() edge cases and improve code quality (bpo-22735)
    https://hg.python.org/cpython/rev/6384c0cd3b2d

    New changeset 75fd0bd89eef by Benjamin Peterson in branch 'default':
    merge 3.4 (bpo-22735)
    https://hg.python.org/cpython/rev/75fd0bd89eef

    @benjaminp
    Copy link
    Contributor

    Thanks for the very high quality patch. The way it was split made it much easier to review.

    @abusalimov
    Copy link
    Mannequin Author

    abusalimov mannequin commented Feb 6, 2015

    You're welcome, and thank you too for reviewing and merging it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants