[C API] Hide static types from the limited C API #84781

Closed
vstinner opened this issue May 11, 2020 · 12 comments
Comments

@vstinner
Member

BPO 40601
Nosy @gvanrossum, @vstinner, @encukou, @shihai1991, @erlend-aasland, @h-vetinari, @JunyiXie
PRs
  • [WIP] bpo-40601: Add functions to get builtin types #24146
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-05-11.22:23:25.790>
    labels = ['expert-subinterpreters', 'expert-C-API', '3.9']
    title = '[C API] Hide static types from the limited C API'
    updated_at = <Date 2022-01-05.17:59:11.136>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-01-05.17:59:11.136>
    actor = 'erlendaasland'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['C API', 'Subinterpreters']
    creation = <Date 2020-05-11.22:23:25.790>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40601
    keywords = ['patch']
    message_count = 11.0
    messages = ['368667', '368668', '368676', '368731', '369999', '384539', '386140', '388541', '388545', '388546', '389474']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'vstinner', 'petr.viktorin', 'shihai1991', 'erlendaasland', 'h-vetinari', 'JunyiXie']
    pr_nums = ['24146']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue40601'
    versions = ['Python 3.9']

    @vstinner
    Member Author

    Statically allocated types prevent getting a per-interpreter GIL: bpo-40512. These types are currently shared by all interpreters.

    Eric Snow proposed the idea of creating a heap-allocated type in subinterpreters. But we should take care of direct usage of the statically allocated type.

    For example, Objects/longobject.c defines "PyTypeObject PyLong_Type = {...};". This type is exposed in the limited C API (!) in Include/longobject.h:

    PyAPI_DATA(PyTypeObject) PyLong_Type;

    It's used by macros such as:

    #define PyLong_CheckExact(op) Py_IS_TYPE(op, &PyLong_Type)

    I don't think that these types are directly accessed in C extensions built with the limited C API. My expectation is that the type is only exposed for "CheckExact" macros.

    Currently, 100 statically allocated types are declared in Python header files:

    $ grep -F '(PyTypeObject)' Include/ -R
    Include/cpython/fileobject.h:PyAPI_DATA(PyTypeObject) PyStdPrinter_Type;
    (...)
    Include/object.h:PyAPI_DATA(PyTypeObject) PySuper_Type; /* built-in 'super' */
    Include/methodobject.h:PyAPI_DATA(PyTypeObject) PyCFunction_Type;

    Most of them seem to be exposed in the limited C API.

    I propose to break the limited C API backward compatibility on purpose by removing these type definitions from the limited C API.

    For "CheckExact" macros, we can continue to provide them in the limited C API but as function calls. So a built C extension would no longer access the type directly, but only make function calls.
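
A minimal sketch of the "CheckExact as a function call" idea, using toy stand-ins for PyTypeObject and PyObject rather than the real CPython structures. The names long_type, float_type, make_toy_int and make_toy_float are hypothetical and exist only for illustration. The point it tries to show is that the type object can stay private to the library while the check remains available to extensions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (assumption: the real PyTypeObject and PyObject
   have many more fields). */
typedef struct { const char *tp_name; } PyTypeObject;
typedef struct { PyTypeObject *ob_type; } PyObject;

/* "libpython" side: the type objects are private to this file. */
static PyTypeObject long_type = { "int" };
static PyTypeObject float_type = { "float" };

/* Exported as a function, so a compiled extension never embeds
   the address of long_type. */
int PyLong_CheckExact(PyObject *op)
{
    assert(op != NULL);
    return op->ob_type == &long_type;
}

/* Hypothetical constructors, used only to build example objects. */
PyObject make_toy_int(void)
{
    PyObject o = { &long_type };
    return o;
}

PyObject make_toy_float(void)
{
    PyObject o = { &float_type };
    return o;
}
```

With this shape, an extension only calls PyLong_CheckExact(); which type object the check compares against is decided inside the library, so it could in principle differ per interpreter. The cost is a function call where the macro did an inline pointer comparison.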

    vstinner added the "3.9 only security fixes" and "topic-C-API" labels on May 11, 2020
    @vstinner
    Member Author

    See also bpo-40077: "Convert static types to PyType_FromSpec()".

    @vstinner
    Member Author

    > I propose to break the limited C API backward compatibility on purpose by removing these type definitions from the limited C API.

    Hum. How would a C extension subclass the Python int type (PyLong_Type) if it's no longer exposed? One option is to add one function per type, like:

    PyObject* Py_GetLongType(void);

    It would return a *strong reference* to the type (PyLong_Type).

    Another option is to get the type from the builtins module or the builtins dictionary (PyInterpreterState.builtins). But there is no simple C function to get a builtin object. It requires many calls, error handling, etc. Maybe a generic helper like the following function would help:

    PyObject *Py_GetBuiltin(const char *name);

    Note: PyEval_GetBuiltins() exposes the builtins of the *current frame*, which may not be what you expect.

    Currently, Py_GetBuiltin(name) is not needed since basically *all* Python builtins are *directly* exposed in the C API...
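
A rough sketch of the two accessor shapes discussed above, written against a toy model rather than the real C API. ToyInterpreter, Toy_GetBuiltin and Toy_GetLongType are hypothetical names; the toy takes the interpreter as an explicit argument, whereas the proposed Py_GetBuiltin(name) would presumably use the current interpreter. It only illustrates the "strong reference, looked up per interpreter" contract.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy object with only a reference count and a name (assumption). */
typedef struct { long ob_refcnt; const char *name; } PyObject;

/* Toy per-interpreter builtins table; the real builtins is a dict. */
typedef struct {
    PyObject *entries[4];
    size_t count;
} ToyInterpreter;

/* Hypothetical helper: look up a builtin by name in one interpreter
   and return a strong reference, or NULL if the name is unknown. */
PyObject *Toy_GetBuiltin(ToyInterpreter *interp, const char *name)
{
    assert(interp != NULL && name != NULL);
    for (size_t i = 0; i < interp->count; i++) {
        PyObject *obj = interp->entries[i];
        if (strcmp(obj->name, name) == 0) {
            obj->ob_refcnt++;   /* the caller owns this reference */
            return obj;
        }
    }
    return NULL;
}

/* Hypothetical per-type accessor built on the generic helper. */
PyObject *Toy_GetLongType(ToyInterpreter *interp)
{
    return Toy_GetBuiltin(interp, "int");
}
```

Because the lookup goes through the interpreter, two interpreters can hand out two distinct "int" type objects, which is what direct use of a static PyLong_Type symbol cannot express.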

    @encukou
    Member

    encukou commented May 12, 2020

    > For example, Objects/longobject.c defines "PyTypeObject PyLong_Type = {...};". This type is exposed in the limited C API (!)

    Technically, it is not, see https://www.python.org/dev/peps/pep-0384/#structures
    Structures like PyLong_Type are *not* part of the limited API.

    > I propose to break the limited C API backward compatibility on purpose by removing these type definitions from the limited C API.

    That could only be done in Python 4.0, or if we started C-API 4.0. But I don't think it's necessary here.

    @vstinner
    Member Author

    > Technically, it is not, see https://www.python.org/dev/peps/pep-0384/#structures
    > Structures like PyLong_Type are *not* part of the limited API.

    The symbol is exported by libpython:

    $ objdump -T /lib64/libpython3.8.so.1.0|grep PyLong_Type
    000000000030de00 g    DO .data	00000000000001a0  Base        PyLong_Type

    A C extension can use a reference to PyLong_Type.

    > I don't think it's necessary here.

    Did you read my rationale (first message)? Do you mean that per-interpreter GIL is not worth it?

    --

    A first step would be to expose "CheckExact" macros as function calls in the limited C API.

    @vstinner
    Member Author

    vstinner commented Jan 6, 2021

    PC/python3dll.c exports 66 types in the stable ABI:

    Py_GenericAliasType
    PyObject_Type
    _PyWeakref_CallableProxyType
    _PyWeakref_ProxyType
    _PyWeakref_RefType
    PyBaseObject_Type
    PyBool_Type
    PyByteArray_Type
    PyByteArrayIter_Type
    PyBytes_Type
    PyBytesIter_Type
    PyCallIter_Type
    PyCapsule_Type
    PyCFunction_Type
    PyClassMethodDescr_Type
    PyComplex_Type
    PyDict_Type
    PyDictItems_Type
    PyDictIterItem_Type
    PyDictIterKey_Type
    PyDictIterValue_Type
    PyDictKeys_Type
    PyDictProxy_Type
    PyDictValues_Type
    PyEllipsis_Type
    PyEnum_Type
    PyExc_TypeError
    PyFilter_Type
    PyFloat_Type
    PyFrozenSet_Type
    PyGetSetDescr_Type
    PyList_Type
    PyListIter_Type
    PyListRevIter_Type
    PyLong_Type
    PyLongRangeIter_Type
    PyMap_Type
    PyMemberDescr_Type
    PyMemoryView_Type
    PyMethodDescr_Type
    PyModule_Type
    PyModuleDef_Type
    PyNullImporter_Type
    PyODict_Type
    PyODictItems_Type
    PyODictIter_Type
    PyODictKeys_Type
    PyODictValues_Type
    PyProperty_Type
    PyRange_Type
    PyRangeIter_Type
    PyReversed_Type
    PySeqIter_Type
    PySet_Type
    PySetIter_Type
    PySlice_Type
    PySortWrapper_Type
    PySuper_Type
    PyTraceBack_Type
    PyTuple_Type
    PyTupleIter_Type
    PyType_Type
    PyUnicode_Type
    PyUnicodeIter_Type
    PyWrapperDescr_Type
    PyZip_Type

    @encukou
    Member

    encukou commented Feb 2, 2021

    Sorry, I lost this bug in my TODO list :(

    > I don't think it's necessary here.

    > Did you read my rationale (first message)? Do you mean that per-interpreter GIL is not worth it?

    Right, I mean that it is not worth breaking the C-API for all existing modules.
    Instead, I think that it can be done as an addition: only modules that don't use things like these static types would be allowed in subinterpreters that have their own GIL.

    @JunyiXie
    Mannequin

    JunyiXie mannequin commented Mar 12, 2021

    It seems that there has been no further progress on moving static types to the heap. This makes it impossible to run subinterpreters in parallel. Are there any plans to try other solutions to the problem?

    In my project, I tried to solve this problem. It works; we verified it on millions of devices.

    1. In typeobject.c, add a lock to ensure that the functions that modify a type are thread-safe.
    2. Make the PyCFunction and descriptor objects of the type never be released. (They are used frequently when loading methods/attributes, so locking affects performance.)

    Can this change be submitted to cpython?

    @vstinner
    Member Author

    The Steering Council asked for a PEP to explain why static types should be converted to heap types.

    @vstinner
    Member Author

    I plan to write such a PEP soon.

    @gvanrossum
    Member

    FWIW I have an idea that would allow code using e.g. &PyList_Type to continue to work, and even remain ABI compatible (though only in the main interpreter).

    // In some header file

    PyAPI_FUNC(PyHeapTypeObject *) PyList_GetType(void);
    
    #define PyList_Type (PyList_GetType()->ht_type)

    For the main interpreter we could make this return the address of PyList_Type.

    ezio-melotti transferred this issue from another repository on Apr 10, 2022
    @vstinner
    Member Author

    > I plan to write such a PEP soon.

    Sadly, I worked on other topics in the meantime and I left sub-interpreters aside. PEP 687 was accepted, which is a step forward, even if it doesn't propose any solution for this specific problem.

    Until I can come up with an idea which doesn't break the API, I prefer to close the issue. If someone wants to experiment with Guido's idea, please go ahead and propose a PR or even write a PEP.

    I'm not sure that going through PyHeapTypeObject.ht_type is needed; the macro can just be: #define PyLong_Type (*PyLong_GetType()). Example:

    #include <stdio.h>
    
    typedef struct {
        const char *name;
    } PyTypeObject;
    
    PyTypeObject PyLong_Type = {.name = "int"};
    
    PyTypeObject* PyLong_GetType(void) {
        return &PyLong_Type;
    }
    
    #define PyLong_Type_MACRO (*PyLong_GetType())
    
    int main(void)
    {
        // direct access to the statically allocated type
        printf("PyLong_Type.name = %s\n", PyLong_Type.name);
        // at the API level, the PyLong_Type_MACRO expression has type PyTypeObject
        printf("PyLong_Type.name = %s\n", PyLong_Type_MACRO.name);
        return 0;
    }
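
The example above can be pushed one step further to sketch how the macro could resolve to a different type object per interpreter while still returning the static type's address in the main interpreter. This is a toy model under stated assumptions: ToyInterpreter, current_interp, static_long_type and Toy_SetCurrentInterpreter are hypothetical names, and real interpreter state lookup is not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *name; } PyTypeObject;

/* The statically allocated type, under a hypothetical private name
   so that the macro below can take over the PyLong_Type spelling. */
static PyTypeObject static_long_type = { "int" };

/* Toy interpreter state: a NULL long_type means "use the static
   type", which models the main interpreter. */
typedef struct { PyTypeObject *long_type; } ToyInterpreter;

static ToyInterpreter main_interp = { NULL };
static ToyInterpreter *current_interp = &main_interp;

/* Hypothetical: select the interpreter the caller is running in. */
void Toy_SetCurrentInterpreter(ToyInterpreter *interp)
{
    assert(interp != NULL);
    current_interp = interp;
}

PyTypeObject *PyLong_GetType(void)
{
    if (current_interp->long_type != NULL) {
        return current_interp->long_type;
    }
    return &static_long_type;
}

/* Code that writes &PyLong_Type or PyLong_Type.name still compiles. */
#define PyLong_Type (*PyLong_GetType())
```

One limitation of this shape: since PyLong_Type now expands to a function call, &PyLong_Type is no longer a constant expression, so it cannot be used in a static initializer at file scope.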
