Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework C types initialization #87936

Closed
vstinner opened this issue Apr 7, 2021 · 14 comments
Closed

Rework C types initialization #87936

vstinner opened this issue Apr 7, 2021 · 14 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Copy link
Member

vstinner commented Apr 7, 2021

BPO 43770
Nosy @vstinner, @corona10
PRs
  • bpo-43770: Sort types in _PyTypes_Init() #25263
  • bpo-43770: _PyTypes_Init() inits more static types #25265
  • bpo-43770: _PyTypes_Init() inits _PyAnextAwaitable_Type #25266
  • bpo-43770: Inherit default tp_getattro and tp_setattro #25275
  • bpo-43770: Refactor type_new() function #25325
  • bpo-43770: Refactor PyType_Ready() function #25336
  • bpo-43770: Reorder type_ready() #25373
  • bpo-43770: Cleanup type_ready() #25388
  • bpo-43770: Cleanup PyModuleDef_Init() #26879
  • bpo-43770: PyObject_SetAttr() asserts that the type is ready #26880
  • bpo-43770: PyObject_GetAttr() calls PyType_Ready() #26881
  • bpo-43770: Cleanup _PyObject_GetMethod() #26946
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-21.22:08:50.540>
    created_at = <Date 2021-04-07.21:48:22.168>
    labels = ['interpreter-core', '3.10']
    title = 'Rework C types initialization'
    updated_at = <Date 2021-09-21.22:08:50.540>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2021-09-21.22:08:50.540>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-21.22:08:50.540>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2021-04-07.21:48:22.168>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43770
    keywords = ['patch']
    message_count = 14.0
    messages = ['390484', '390490', '390493', '390506', '390510', '390519', '390742', '390799', '390955', '390962', '390964', '396420', '396807', '402383']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'corona10']
    pr_nums = ['25263', '25265', '25266', '25275', '25325', '25336', '25373', '25388', '26879', '26880', '26881', '26946']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue43770'
    versions = ['Python 3.10']

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 7, 2021

    Currently, PyType_Ready() is called late on many static types, which can lead to bugs. For example, PyObject_SetAttr() can fail if the type of the object is not ready yet. PyType_Read() is responsible to initialize tp_getattro and tp_setattro members of the PyTypeObject structure.

    We must explicitly initialize all static types before using them.

    Moreover, many static types initialize explicitly :

      tp_getattro = PyObject_GenericGetAttr

    and:

      tp_setattro = PyObject_GenericSetAttr

    whereas it's the default implementation. They can be omitted.

    I created this issue as a placeholder for multiple changes to modify how types implemented in C are initialized.

    @vstinner vstinner added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Apr 7, 2021
    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 7, 2021

    New changeset a518099 by Victor Stinner in branch 'master':
    bpo-43770: Sort types in _PyTypes_Init() (GH-25263)
    a518099

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 7, 2021

    New changeset df5dc1c by Victor Stinner in branch 'master':
    bpo-43770: _PyTypes_Init() inits more static types (GH-25265)
    df5dc1c

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 8, 2021

    New changeset b98eba5 by Victor Stinner in branch 'master':
    bpo-43770: _PyTypes_Init() inits _PyAnextAwaitable_Type (GH-25266)
    b98eba5

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 8, 2021

    I tried to put assertions in PyObject_GetAttr(), _PyObject_LookupAttr(), _PyObject_GetMethod() and PyObject_SetAttr() to ensure that the type is ready. It breaks many tests, examples:

    test_buffer test_pickle test_picklebuffer test_pickletools
    test_structmembers
    

    For example, Modules/_testbuffer.c does not initialize its ndarray type.

    Example of crash:
    -----------------
    Objects/object.c:892: PyObject_GetAttr: Assertion "tp->tp_flags & (1UL << 12)" failed
    Enable tracemalloc to get the memory block allocation traceback

    object address : 0x7f4d89119a00
    object refcount : 4
    object type : 0x8768c0
    object type name: type
    object repr : <class 'ndarray'>

    Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
    Python runtime state: initialized

    Current thread 0x00007f4d96eba740 (most recent call first):
    File "/home/vstinner/python/master/Lib/test/test_picklebuffer.py", line 80 in test_ndarray_2d
    (...)

    Extension modules: _testcapi, _testbuffer (total: 2)
    -----------------

    Another example of crash:
    -----------------
    Objects/object.c:892: PyObject_GetAttr: Assertion "tp->tp_flags & (1UL << 12)" failed
    Enable tracemalloc to get the memory block allocation traceback

    object address : 0x7f174ec9aa00
    object refcount : 3
    object type : 0x8768c0
    object type name: type
    object repr : <class 'ndarray'>

    Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
    Python runtime state: initialized

    Current thread 0x00007f175ca42740 (most recent call first):
    File "/home/vstinner/python/master/Lib/test/pickletester.py", line 304 in __reduce_ex__
    (...)

    Extension modules: _testcapi, _testbuffer, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg.lapack_lite, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator (total: 16)
    -----------------

    The C API tries to make the type ready automatically. For example, PyObject_Hash() calls PyType_Ready() if needed and then retry tp_hash:

    Py_hash_t
    PyObject_Hash(PyObject *v)
    {
        PyTypeObject *tp = Py_TYPE(v);
        if (tp->tp_hash != NULL)
            return (*tp->tp_hash)(v);
        /* To keep to the general practice that inheriting
         * solely from object in C code should work without
         * an explicit call to PyType_Ready, we implicitly call
         * PyType_Ready here and then check the tp_hash slot again
         */
        if (tp->tp_dict == NULL) {
            if (PyType_Ready(tp) < 0)
                return -1;
            if (tp->tp_hash != NULL)
                return (*tp->tp_hash)(v);
        }
        /* Otherwise, the object can't be hashed */
        return PyObject_HashNotImplemented(v);
    }

    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 8, 2021

    PR 25275: this is a subtle difference if PyTypeObject.tp_setattro is set statically to PyObject_GenericSetAttr() or if it's inherited by PyType_Ready().

    Reference (master)::

    • BaseException.__dict__['__setattr__'] = <slot wrapper '__setattr__' of 'BaseException' objects>
    • BaseException.__setattr__ = <slot wrapper '__setattr__' of 'BaseException' objects>

    With the PR:

    • no '__setattr___' in BaseException.__dict__
    • BaseException.__setattr__ = <slot wrapper '__setattr__' of 'object' objects>

    Because of that, doctest.DocTestFinder().find(builtins) returns less items, and so test_doctest fails.

    @vstinner
    Copy link
    Member Author

    New changeset ecf14e6 by Victor Stinner in branch 'master':
    bpo-43770: Refactor type_new() function (GH-25325)
    ecf14e6

    @vstinner
    Copy link
    Member Author

    New changeset 53114ff by Victor Stinner in branch 'master':
    bpo-43770: Refactor PyType_Ready() function (GH-25336)
    53114ff

    @vstinner
    Copy link
    Member Author

    New changeset 65f058e by Victor Stinner in branch 'master':
    bpo-43770: Reorder type_ready() (GH-25373)
    65f058e

    @vstinner
    Copy link
    Member Author

    New changeset a328d73 by Victor Stinner in branch 'master':
    bpo-43770: Cleanup type_ready() (GH-25388)
    a328d73

    @vstinner
    Copy link
    Member Author

    If I modify type_ready() to call type_ready_inherit() before type_ready_fill_dict(), some types create a between between their C slot (tp_init) and the Python API in tp_dict (tp_dict["__init__"]).

    Example with importlib:

    class FileLoader(...):
        def __init__(...):
            ...

    => FileLoader.tp_init = slot_tp_init

    class SourceFileLoader(FileLoader):
        ...

    When PyType_Ready() is called on SourceFileLoader, we get:

    • SourceFileLoader.tp_base = FileLoader
    • SourceFileLoader.tp_init = NULL
    • SourceFileLoader.tp_dict has no "__init__" key

    When inherit_slots() is called, SourceFileLoader.tp_init is set to slot_tp_init().

    When add_operators() is called, SourceFileLoader.tp_dict["__init__"] is set to PyDescr_NewWrapper(slot_tp_init).

    Problem: we a loop! tp_dict["__init__"] => slot_tp_init => tp_dict["__init__"]

    @vstinner
    Copy link
    Member Author

    New changeset 2a396d6 by Victor Stinner in branch 'main':
    bpo-43770: Cleanup PyModuleDef_Init() (GH-26879)
    2a396d6

    @vstinner
    Copy link
    Member Author

    vstinner commented Jul 1, 2021

    New changeset dd3adc0 by Victor Stinner in branch 'main':
    bpo-43770: Cleanup _PyObject_GetMethod() (GH-26946)
    dd3adc0

    @vstinner
    Copy link
    Member Author

    The code is way more complicated than what I expected. I hope that my work on this issue will help future developers who will try to understand the code. But I prefer to stop the refactoring at this point. I pushed enough changes ;-)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant