Special method lookup fails on uninitialized types #71093

Closed
ztane mannequin opened this issue May 2, 2016 · 26 comments
Labels: 3.7 (EOL) end of life, interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)

Comments

@ztane
Mannequin

ztane mannequin commented May 2, 2016

BPO 26906
Nosy @gvanrossum, @terryjreedy, @ericvsmith, @serhiy-storchaka, @ztane, @orenmn, @iritkatriel
Files
  • init_method_descr_types.patch
  • init_types-2.7.patch
  • init_type_in_pytype_lookup.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2021-10-18.21:23:30.922>
    created_at = <Date 2016-05-02.13:03:48.822>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = 'Special method lookup fails on uninitialized types'
    updated_at = <Date 2021-10-18.21:23:30.921>
    user = 'https://github.com/ztane'

    bugs.python.org fields:

    activity = <Date 2021-10-18.21:23:30.921>
    actor = 'iritkatriel'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2021-10-18.21:23:30.922>
    closer = 'iritkatriel'
    components = ['Interpreter Core']
    creation = <Date 2016-05-02.13:03:48.822>
    creator = 'ztane'
    dependencies = []
    files = ['42685', '42716', '42718']
    hgrepos = []
    issue_num = 26906
    keywords = []
    message_count = 26.0
    messages = ['264647', '264648', '264651', '264730', '264816', '264828', '264840', '264843', '264844', '264912', '264920', '264990', '265009', '265010', '265018', '265023', '265058', '265073', '265081', '277875', '277912', '277913', '278247', '278287', '278290', '404223']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'terry.reedy', 'eric.smith', 'python-dev', 'serhiy.storchaka', 'ztane', 'Oren Milman', 'iritkatriel']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26906'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']

    @ztane
    Mannequin Author

    ztane mannequin commented May 2, 2016

    This is an annoying heisenbug; it seems that some objects cannot be formatted until you explicitly do obj.__format__. For example, object.__reduce__ behaves like this:

        Python 2.7.10 (default, Oct 14 2015, 16:09:02)
        [GCC 5.2.1 20151010] on linux2
        Type "help", "copyright", "credits" or "license" for more information.
        >>> format(object.__reduce__)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: Type method_descriptor doesn't define __format__
        >>> format(object.__reduce__)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: Type method_descriptor doesn't define __format__
        >>> object.__reduce__.__format__
        <built-in method __format__ of method_descriptor object at 0x7f67563ed0e0>
        >>> format(object.__reduce__)
        "<method '__reduce__' of 'object' objects>"

    I can replicate this in 2.7.9, 2.7.10, and 2.7.11 on Ubuntu and Debian. It works on Windows Python, in 2.6.6, and in every Python 3 I've tried, though I've heard of it failing on Python 3 as well.

    @ztane ztane mannequin changed the title from "__reduce__ format" to "format(object.__reduce__) fails intermittently" May 2, 2016
    @ztane
    Mannequin Author

    ztane mannequin commented May 2, 2016

    s/explicitly do/explicitly access/

    @ericvsmith ericvsmith added the type-bug An unexpected behavior, bug, or error label May 2, 2016
    @serhiy-storchaka
    Member

    The proposed patch makes the method descriptor types explicitly initialized, as in 3.x.

    @serhiy-storchaka serhiy-storchaka added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label May 2, 2016
    @serhiy-storchaka
    Member

    There is a similar issue on 3.x:

    >>> import array
    >>> it = iter(array.array('i'))
    >>> format(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Type arrayiterator doesn't define __format__
    >>> type(it).__format__
    <method '__format__' of 'object' objects>
    >>> format(it)
    '<arrayiterator object at 0xb703f4ec>'

    @serhiy-storchaka
    Member

    A number of other types are not initialized until you request an attribute.

    Here is a larger patch for 2.7 that explicitly initializes 38 types.

    @serhiy-storchaka
    Member

    An alternative is to just call PyType_Ready from _PyType_Lookup if type->tp_mro is NULL.

    Here is a patch against 2.7 that restores the solution from bpo-551412, but returns NULL if type->tp_mro is still NULL after calling PyType_Ready. I found one place in the tests where this happens (CIOTest.test_IOBase_finalize in test_io).

    @gvanrossum
    Member

    Serhiy, I'm happy to help, but I'm not sure what you're asking me to do. Decide between different patches? I can't even repro the issue.

    @serhiy-storchaka
    Member

    Added a check for Py_TPFLAGS_READYING to prevent recursive calling.

    @serhiy-storchaka
    Member

    The problem is that format() fails for instances of some classes because the type is still not initialized. The simplest example is the list iterator.

    >>> format(iter([]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Type listiterator doesn't define __format__

    After forcing type initialization (for example, by getting any attribute of the type), format() starts working.

    >>> type(iter([])).foo
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'listiterator' has no attribute 'foo'
    >>> format(iter([]))
    '<listiterator object at 0xb708d0ec>'

    I'm afraid that format() is just one example, and that there are other functions or operators that fail or behave incorrectly if the type has not been initialized.

    init_types-2.7.patch adds explicit initialization of 38 types (I didn't check that all of them need this, but I suppose they do). This is a large patch, and I'm not sure that it covers all types.

    The other way is to initialize the type in _PyType_Lookup itself if it is not yet initialized. This is a simple change, but a comment in _PyType_Lookup makes me cautious. I found that this solution was already applied as an attempt to fix bpo-551412, but then reverted. Since you seem to be the most knowledgeable about this code, I'm asking you what was wrong with this approach and how we can fix it.

    Python 3.x also suffers from this bug, but it reproduces with fewer types. For example, it does not reproduce for the list iterator. I don't know why.
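
    The behaviour described in this comment can be modelled in a few lines of pure Python. This is only a toy sketch, not CPython's implementation: `ToyType`, `ready`, `lookup`, `getattr_on_type`, and `toy_format` are hypothetical names standing in for a static type object, PyType_Ready, the old _PyType_Lookup, attribute access on the type, and format().

```python
class ToyType:
    """Toy stand-in for a statically allocated C type (hypothetical model)."""

    def __init__(self, name, methods, base_methods):
        self.name = name
        self.methods = methods            # methods defined on the type itself
        self.base_methods = base_methods  # what would be inherited from object
        self.mro = None                   # models tp_mro == NULL before PyType_Ready

    def ready(self):
        # Models PyType_Ready: build the MRO so inherited methods become visible.
        self.mro = [self.methods, self.base_methods]

    def lookup(self, name):
        # Models the old _PyType_Lookup: give up if the type is not initialized.
        if self.mro is None:
            return None
        for namespace in self.mro:
            if name in namespace:
                return namespace[name]
        return None

    def getattr_on_type(self, name):
        # Models attribute access on the type, which initializes it on demand.
        if self.mro is None:
            self.ready()
        found = self.lookup(name)
        if found is None:
            raise AttributeError(name)
        return found


def toy_format(tp):
    # Models format(): a special-method lookup that does not initialize the type.
    method = tp.lookup("__format__")
    if method is None:
        raise TypeError(f"Type {tp.name} doesn't define __format__")
    return method()


listiterator = ToyType(
    "listiterator", {}, {"__format__": lambda: "<listiterator object>"}
)

try:
    toy_format(listiterator)
    first_attempt = "ok"
except TypeError as exc:
    first_attempt = str(exc)

try:
    listiterator.getattr_on_type("foo")  # fails, but initializes the type
except AttributeError:
    pass

second_attempt = toy_format(listiterator)
```

    In this model the first call fails and the second succeeds only because the failed attribute access ran `ready()` as a side effect, which mirrors the interpreter sessions shown above.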

    @ztane
    Mannequin Author

    ztane mannequin commented May 5, 2016

    I can reproduce the bug in 3.5.0+ Ubuntu with list iterator, if I execute python with -S:

        % python3.5 -S
        Python 3.5.0+ (default, Oct 11 2015, 09:05:38) 
        [GCC 5.2.1 20151010] on linux
        >>> format(iter([]))
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: Type list_iterator doesn't define __format__

    Thus here it depends on the stuff that site does or doesn't do. Iterating over a list iterator does *not* trigger the initialization. Printing it doesn't help either, or anything else that does not touch the non-magic attributes. I am not even sure what the site.py and such are doing to the list iterator class to trigger the initialization.

    @ztane
    Mannequin Author

    ztane mannequin commented May 5, 2016

    As for the other things failing: I was trying to find out which of the magic method operations fail, and for that I tried to get the dir of a list iterator. Well...

        % python3.5 -S          
        Python 3.5.0+ (default, Oct 11 2015, 09:05:38) 
        [GCC 5.2.1 20151010] on linux
        >>> dir(iter([]))
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        TypeError: object does not provide __dir__

    @gvanrossum
    Member

    Sadly it's been a very long time since I wrote that code and I don't recall
    much about it. I presume there was a good reason not to do it in
    _PyType_Lookup(), but who knows -- maybe the original approach was just too
    naive and nobody cared? I'm not excited by a patch that does this for 38
    types -- invariably there will be another type that still surfaces the same
    bug.

    @terryjreedy
    Member

    Is there a way to have format() try to force the initialization, by explicitly doing the equivalent of obj.__format__, at least for types, instead of raising the TypeError?

    @gvanrossum
    Member

    But the problem isn't limited to format()... Why would format() be special?
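
    As a small illustration of why format() is not special: several built-in operations resolve special methods on the type rather than on the instance, so all of them depend on the type being initialized. The sketch below uses a hypothetical class `Plain` and only shows the type-level lookup on a current Python; it does not reproduce the bug itself.

```python
import operator


class Plain:
    pass


obj = Plain()
# Instance attributes with special names are ignored by implicit lookup,
# because the interpreter looks these names up on type(obj) instead.
obj.__format__ = lambda spec: "from instance"
obj.__length_hint__ = lambda: 99
obj.__dir__ = lambda: ["from instance"]

formatted = format(obj)              # uses object.__format__ found via the type
hint = operator.length_hint(obj, 5)  # no type-level __length_hint__, so the default
names = dir(obj)                     # uses object.__dir__ found via the type
```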

    @serhiy-storchaka
    Member

    There is one test (ClassPropertiesAndMethods.test_mutable_bases_with_failing_mro in test_descr) that crashes with the code from bpo-551412 because _PyType_Lookup() is called recursively from PyType_Ready(). Is this the reason? My patch prevents recursive calls.

    Here is a minimal example (for Python 3):

    class M(type):
        def mro(self):
            hasattr(self, 'foo')
            return type.mro(self)
    
    class C(metaclass=M):
        pass

    When class C is created, C.mro() is called while C is still not ready. Resolving an attribute calls _PyType_Lookup(), which calls PyType_Ready(), which calls mro(), and so on.
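
    The re-entrancy guard can be sketched with a toy model. This is an assumption-laden illustration rather than the actual patch: `ToyType`, `mro_hook`, and `nosy_mro` are hypothetical, and the `readying` attribute stands in for the Py_TPFLAGS_READYING flag.

```python
class ToyType:
    """Toy model of a type whose initialization may call back into lookup."""

    def __init__(self, namespace, mro_hook=None):
        self.namespace = namespace
        self.mro = None           # models tp_mro == NULL
        self.readying = False     # models the Py_TPFLAGS_READYING flag
        self.mro_hook = mro_hook  # models a metaclass mro() override
        self.ready_calls = 0

    def ready(self):
        self.ready_calls += 1
        self.readying = True
        try:
            if self.mro_hook is not None:
                # The hook may look up attributes on the half-built type.
                self.mro_hook(self)
            self.mro = [self.namespace]
        finally:
            self.readying = False

    def lookup(self, name):
        if self.mro is None:
            # Initialize on demand, but never re-enter ready() while it runs.
            if not self.readying:
                self.ready()
            if self.mro is None:
                return None
        for namespace in self.mro:
            if name in namespace:
                return namespace[name]
        return None


seen_during_mro = []


def nosy_mro(tp):
    # Plays the role of hasattr(self, 'foo') inside M.mro().
    seen_during_mro.append(tp.lookup("foo"))


C = ToyType({"foo": 1}, mro_hook=nosy_mro)
result = C.lookup("foo")
```

    Without the `readying` check, the lookup made inside the hook would call `ready()` again and recurse without bound; with it, the inner lookup simply reports that nothing was found.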

    @gvanrossum
    Member

    Probably.

    @ztane
    Mannequin Author

    ztane mannequin commented May 7, 2016

    I am not an expert on PyType internals, so I am wondering why is the PyType_Ready'ing done implicitly at all?

    @gvanrossum
    Member

    Because the data structure that defines a type is just data, and at some
    point PyType_Ready() must be called. The question is how to do this, given
    that nobody can (or needs to) produce a definitive list of all types.

    @ztane
    Mannequin Author

    ztane mannequin commented May 7, 2016

    Would it be possible to make the debug build abort on any use of a PyType that has not been readied, including instantiating it? Then, instead of changing all static linkages to externs (as in Serhiy's first patch), one could add per-compilation-unit initialization functions that are called from objects.c; that would make it easier to use a preprocessor flag to turn the very existence of certain types in a compilation unit on and off.

    Likewise, the C-API docs for PyType_Ready should perhaps say "This must be called on all type objects to finish their initialization." instead of "should".

    @serhiy-storchaka
    Member

    A similar bug was just introduced in bpo-21124.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Oct 2, 2016
    @serhiy-storchaka
    Member

    Yet another similar bug: bpo-11702.

    @gvanrossum
    Member

    Serhiy -- please do what you think we should do. At this point I'm open to just about anything, but I don't feel comfortable creating or reviewing patches any more.

    @serhiy-storchaka
    Member

    Yet another demonstration of this bug:

    $ ./python -IS
    >>> import operator
    >>> operator.length_hint(iter("abc"))
    0
    >>> import collections.abc
    >>> operator.length_hint(iter("abc"))
    3
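
    The silent 0 in this session, rather than an exception, follows from how operator.length_hint treats a missing `__length_hint__`: it returns its default, which is 0 unless one is passed. A minimal sketch on a current Python, using the hypothetical classes `NoHint` and `WithHint`:

```python
import operator


class NoHint:
    """Iterable with neither __len__ nor __length_hint__."""

    def __iter__(self):
        return iter("abc")


class WithHint(NoHint):
    def __length_hint__(self):
        return 3


missing = operator.length_hint(NoHint())                # falls back to 0
missing_with_default = operator.length_hint(NoHint(), 10)
present = operator.length_hint(WithHint())
builtin = operator.length_hint(iter("abc"))             # str iterator provides a hint
```

    So a failed special-method lookup on an uninitialized type is indistinguishable here from a type that genuinely has no hint, which is what makes this variant of the bug easy to miss.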

    @serhiy-storchaka serhiy-storchaka changed the title from "format(object.__reduce__) fails intermittently" to "Special method lookup fails on unitialized types" Oct 8, 2016
    @python-dev
    Mannequin

    python-dev mannequin commented Oct 8, 2016

    New changeset bbaf6c928526 by Serhiy Storchaka in branch '3.5':
    Issue bpo-26906: Resolving special methods of uninitialized type now causes
    https://hg.python.org/cpython/rev/bbaf6c928526

    New changeset 3119f08802a5 by Serhiy Storchaka in branch '2.7':
    Issue bpo-26906: Resolving special methods of uninitialized type now causes
    https://hg.python.org/cpython/rev/3119f08802a5

    New changeset 888a26fac9d2 by Serhiy Storchaka in branch '3.6':
    Issue bpo-26906: Resolving special methods of uninitialized type now causes
    https://hg.python.org/cpython/rev/888a26fac9d2

    New changeset d24f1467a297 by Serhiy Storchaka in branch 'default':
    Issue bpo-26906: Resolving special methods of uninitialized type now causes
    https://hg.python.org/cpython/rev/d24f1467a297

    @orenmn
    Mannequin

    orenmn mannequin commented Oct 8, 2016

    (Just to save time for anyone interested)
    The last demonstration of the bug Serhiy mentioned is caused by the following (this was true only until Serhiy's patch earlier today):
    - Before importing collections.abc, str_iterator is not initialized, which means:
      * Its tp_mro is NULL.
      * _PyType_Lookup returns NULL when called to look up __length_hint__ in str_iterator (as part of operator.length_hint).
    - On import, collections.abc also does 'Iterator.register(str_iterator)', which leads to the following call chain:
      ABCMeta.register(Iterator, str_iterator)
      => issubclass(str_iterator, Iterator)
      => PyObject_IsSubclass(str_iterator, Iterator)
      => Iterator.__subclasscheck__(Iterator, str_iterator)
      => Iterator.__subclasshook__(str_iterator)
      => collections.abc._check_methods(str_iterator, '__iter__', '__next__')
    And _check_methods first does 'mro = C.__mro__', which ultimately calls type_getattro (which calls PyType_Ready in case tp_dict is NULL).
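
    For readers who have not looked at that helper, here is a simplified sketch of the kind of check it performs. The function `check_methods` below is a local stand-in written for illustration, not an import of the private helper; the point is only that its first statement reads `C.__mro__`, an ordinary attribute access on the type.

```python
from collections.abc import Iterator


def check_methods(C, *methods):
    # Reading C.__mro__ is a plain attribute access on the type object;
    # per the explanation above, this is what ends up initializing it.
    mro = C.__mro__
    for method in methods:
        for B in mro:
            if method in B.__dict__:
                if B.__dict__[method] is None:
                    return NotImplemented
                break
        else:
            # No class in the MRO defines this method.
            return NotImplemented
    return True


str_iterator = type(iter("abc"))
has_iterator_methods = check_methods(str_iterator, "__iter__", "__next__")
int_is_iterable = check_methods(int, "__iter__")
registered = issubclass(str_iterator, Iterator)
```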

    Anyway, with regard to the disconcerting comment:
    /* If mro is NULL, the type is either not yet initialized
    by PyType_Ready(), or already cleared by type_clear().
    Either way the safest thing to do is to return NULL. */
    Sorry for the newbie question, but why not add a Py_TPFLAGS_CLEARED flag to tp_flags?
    Then we could assert in _PyType_Lookup (and maybe also in other places that call PyType_Ready, such as type_getattro) that Py_TPFLAGS_CLEARED is not set.

    I realize adding such a flag is really a big deal, but maybe it's worth catching sneaky bugs caused by Python's equivalent of Use-After-Free bugs?

    @serhiy-storchaka serhiy-storchaka changed the title from "Special method lookup fails on unitialized types" to "Special method lookup fails on uninitialized types" Oct 8, 2016
    @iritkatriel
    Member

    All of the examples for Python 3 are working now:

    >>> import array
    >>> it = iter(array.array('i'))
    >>> format(it)
    '<array.arrayiterator object at 0x10598f7a0>'
    >>> format(iter([]))
    '<list_iterator object at 0x10598f890>'
    >>> import operator
    >>> operator.length_hint(iter("abc"))
    3

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022