Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyXxx_Check(x) trusts x->ob_type->tp_mro #41630

Closed
arigo mannequin opened this issue Feb 27, 2005 · 13 comments
Closed

PyXxx_Check(x) trusts x->ob_type->tp_mro #41630

arigo mannequin opened this issue Feb 27, 2005 · 13 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@arigo
Copy link
Mannequin

arigo mannequin commented Feb 27, 2005

BPO 1153075
Nosy @mwhudson, @arigo
Files
  • bug.py: Segfaulting example
  • bug2.py: more nastiness
  • user-mro-argh.diff: mwh's patch Support "bpo-" in Misc/NEWS #1
  • python-subtyping.pdf: mwh's sillyness Support "bpo-" in Misc/NEWS #1
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/arigo'
    closed_at = <Date 2005-12-29.17:08:14.000>
    created_at = <Date 2005-02-27.21:55:55.000>
    labels = ['interpreter-core']
    title = 'PyXxx_Check(x) trusts x->ob_type->tp_mro'
    updated_at = <Date 2005-12-29.17:08:14.000>
    user = 'https://github.com/arigo'

    bugs.python.org fields:

    activity = <Date 2005-12-29.17:08:14.000>
    actor = 'arigo'
    assignee = 'arigo'
    closed = True
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2005-02-27.21:55:55.000>
    creator = 'arigo'
    dependencies = []
    files = ['1609', '1610', '1611', '1612']
    hgrepos = []
    issue_num = 1153075
    keywords = []
    message_count = 13.0
    messages = ['24386', '24387', '24388', '24389', '24390', '24391', '24392', '24393', '24394', '24395', '24396', '24397', '24398']
    nosy_count = 3.0
    nosy_names = ['mwh', 'arigo', 'jepler']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1153075'
    versions = []

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Feb 27, 2005

    The functions PyInt_Check(), PyString_Check(),
    PyList_Check() etc. are used all over the core to check
    which typecasts are safe, from PyObject* to the various
    PyXxxObject*.

    But the macros themselves are implemented by
    inspecting the "tp_mro" tuple of the incoming object's
    type. As the latter can be completely controlled by the
    user, an object can pretend to inherit from anything and
    pass the PyXxx_Check() checks of its choice, even if
    its memory layout is actually completely wrong.

    See attached example.

    @arigo arigo mannequin closed this as completed Feb 27, 2005
    @arigo arigo mannequin self-assigned this Feb 27, 2005
    @arigo arigo mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 27, 2005
    @arigo arigo mannequin closed this as completed Feb 27, 2005
    @arigo arigo mannequin self-assigned this Feb 27, 2005
    @arigo arigo mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 27, 2005
    @jepler
    Copy link
    Mannequin

    jepler mannequin commented Mar 1, 2005

    Logged In: YES
    user_id=2772

    Not sure if this is relevant, but the example given didn't
    crash 2.2:
    $ python2.2 bug.py
    Traceback (most recent call last):
      File "bug.py", line 9, in ?
        x = X()
    TypeError: dict.__new__(X) is not safe, use object.__new__()

    @mwhudson
    Copy link

    mwhudson commented Mar 1, 2005

    Logged In: YES
    user_id=6656

    I wonder if Guido owes Armin a beer (read typeobject.c around line 5230).
    Well, not really, but it's that code that is making the difference.

    However, reversing the order of dict and object is sufficient to make 2.2
    crash (I presume, it crashes CVS HEAD with the __new__ short-circuiting
    code chopped out which gives the error jelper observes on the original).

    I wonder what to do about this. Removing the ability to customize the mro
    from user code is one obvious approach -- I don't know how much code
    uses this feature though.

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Mar 2, 2005

    Logged In: YES
    user_id=4771

    To solve the problem, as hinted in the title of this tracker, I
    think that PyXxx_Check() should simply not trust any mro.
    What PyInt_Check() really means at the C level is to check if
    an object memory layout is compatible with PyIntObject. This
    is easy to figure out by walking the "solid base" chain,
    tp_base.

    As a side note, PyPy gives the same error as Python 2.2.
    However, both in PyPy and in Python 2.2, you can still build
    an instance of the strange class X as follows:

    >> x = object.__new__(X)

    Still, all the methods of x are resolved via the dict type. In
    PyPy we get a clean TypeError because the methods thus
    found don't apply to non-dict objects. In Python 2.2 you can
    crash the interpreter just as in more recent Python releases,
    e.g. with x[5]=6.

    @mwhudson
    Copy link

    mwhudson commented Mar 2, 2005

    Logged In: YES
    user_id=6656

    Hmm, yeah, that works. It wasn't totally clear to me that the user couldn't
    maliciously influence tp_base, but I don't think they can...

    A helper function is presumably in order, unless PyType_IsSubtype should
    be changed.

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Mar 3, 2005

    Logged In: YES
    user_id=4771

    The PyXxx_Check() macros are #defined with
    PyObject_TypeCheck(). Should we change all these #defines to
    use a new PyObject_LayoutCheck() or something, and document
    to C extension writers that they should also switch to the
    new function? More generally, should we review *all* usages
    of PyObject_TypeCheck() and decide if they really meant that
    or PyObject_LayoutCheck()? Do we care about types that are
    solid subclasses of 'int' but don't have it in their MRO?
    Should we just forget the whole idea and sanity-check the
    user-defined mro, which after all would just add another
    strange undocumented condition to typeobject.c? :-)

    @mwhudson
    Copy link

    mwhudson commented Mar 3, 2005

    Logged In: YES
    user_id=6656

    Certainly some more checking of what the user-defined MRO contains
    would be good -- check the attached, which dumps core at class definition
    time...

    @mwhudson
    Copy link

    mwhudson commented Mar 6, 2005

    Logged In: YES
    user_id=6656

    I think the attached fixes all this.

    What it does: check the return value of PySequence_Tuple (duh). Check
    that the returned sequence contains only types or old-style classes.
    Checks that the solid_base of each type in the returned sequence is a
    supertype of the solid_base of type being built. Adds a few tests.

    The error messages are a bit inscrutable. I find it hard to care.

    Assigning to Armin for the first look. Might want to get Guido to look at it
    too.

    When Armin and I talked about this on IRC we found a few oddities in the
    general area, but I don't know if it's worth opening a new bug report for
    them...

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Mar 14, 2005

    Logged In: YES
    user_id=4771

    I tried to convince myself that this check always accepts the
    default mro, but there are more implicit assumptions around
    than I can handle in a quick review...

    Instead, I modified the patch so that a debug build of Python
    always does the check in mro_internal(). This results in a
    shallow problem: the basestring type has tp_basicsize==0,
    which makes the computation of its solid_base trigger an
    assertion in extra_ivars(). If I hack around this problem, then
    I cannot quickly find an example of built-in mro that triggers a
    TypeError, but it doesn't mean there isn't one...

    @mwhudson
    Copy link

    Logged In: YES
    user_id=6656

    Well, emprically that can't happen because the patch passes test_descr,
    and far sillier things happen in that file than in real life.

    More theoretically... I'll think about it :)

    @mwhudson
    Copy link

    Logged In: YES
    user_id=6656

    Please see the attached PDF for a proof of ... well, something in the area.

    I think this is what typeobject.c should be doing and what it's *trying* to
    do. I'm less confident that this is what it's actually doing (particularly wrt
    the subclassing two subclasses of str thing).

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Dec 14, 2005

    Logged In: YES
    user_id=4771

    I'm getting confused.

    First note that the problem I mentioned about extra_ivars() triggering an assert() can only occur at start-up. Indeed, PyType_Ready() copies the tp_base->tp_basicsize if the subtype's tp_basicsize is 0. So we can probably ignore the problem -- no extra_ivars() call would trigger the assert after start-up.

    The check that your patch adds is that solid_base(A) must be a superclass of solid_base(T) for each A in the mro of T. For built-in mros, the A's are exactly the superclasses of T. So we have to prove [*]: if A is a superclass of T, then solid_base(A) is a superclass of solid_base(T).

    Irrespective of what extra_ivars() does, solid_base(T) is either exactly T, or solid_base(T->tp_base). Moreover, when a type T is created with immediate bases B1...Bn, we check that all solid_bases of B1...Bn form a linear order (i.e. are subclasses of each other), and we set T->tp_base to the first Bi such that solid_base(Bi) is a subclass of all the other solid_base(Bj).

    Let's approach [] above by induction on the set of types in which A and T are chosen. This set of types can be ordered by creation time: T1, T2, T3, ... where classes with a higher index can only inherit from classes with lower indices. Assume that [] holds if A and T are chosen among T1...Tn-1. Let's prove it if A and/or T can also be Tn. The only non-trivial case is if A is a strict superclass of T=Tn.

    As above, let write B1...Bn for the immediate bases of T, with T->tp_base=Bi.

    Pick j such that A is a superclass of Bj. By induction solid_base(A) is a superclass of solid_base(Bj) which is a superclass of solid_base(Bi=T->tp_base). Now solid_base(T) is either T or solid_base(Bi). In both cases, solid_base(T) is a subclass of solid_base(A). This concludes the proof -- or I am missing something :-)

    In other words, I don't think the patch can break anything, so it should definitely go in.

    @arigo
    Copy link
    Mannequin Author

    arigo mannequin commented Dec 29, 2005

    Logged In: YES
    user_id=4771

    Checked in as r41845.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant