classification
Title: [C API] Revisit usage of the PyCapsule C API with multi-phase initialization API
Type: Stage:
Components: C API Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: corona10, petr.viktorin, shihai1991, vstinner
Priority: normal Keywords:

Created on 2020-09-16 14:07 by vstinner, last changed 2020-09-17 15:56 by shihai1991.

Messages (2)
msg376992 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-09-16 14:07
More and more extension modules are converted to the multi-phase initialization API (PEP 489) in bpo-1635741.

The problem is that these extensions have not adapted their usage of the PyCapsule C API, so capsule objects are not well isolated between module instances.

For example, the pyexpat extension uses "static struct PyExpat_CAPI capi;". It was fine when it was not possible to create more than one instance of the extension module. But with PR 22222 (currently under review), it becomes possible to have multiple extension module instances. Each module instance creates its own capsule object, but all capsule objects point to the same unique "struct PyExpat_CAPI" instance.

For the specific case of pyexpat in its current implementation, reusing the same "struct PyExpat_CAPI" instance is ok-ish, since the value is the same for all module instances. But this design sounds fragile.

It would be safer to allocate a "struct PyExpat_CAPI" on the heap in each module instance, and release it with a PyCapsule destructor function (3rd parameter of PyCapsule_New()).

The _ctypes module does that:
---
        void *space = PyMem_Calloc(2, sizeof(int));
        if (space == NULL)
            return NULL;
        errobj = PyCapsule_New(space, CTYPES_CAPSULE_NAME_PYMEM, pymem_destructor);
---

with:
---
static void pymem_destructor(PyObject *ptr)
{
    void *p = PyCapsule_GetPointer(ptr, CTYPES_CAPSULE_NAME_PYMEM);
    if (p) {
        PyMem_Free(p);
    }
}
---


The PyCapsule API is used by multiple extension modules:

* _ctypes: allocates memory on the heap and uses a destructor to release it

* _curses: static variable, PyInit__curses() sets PyCurses_API[0] to &PyCursesWindow_Type (static type)

* _datetime: static variable, PyInit__datetime() creates a timezone object and stores it into CAPI.TimeZone_UTC

* _decimal: static variable

* _socket: static variable, PyInit__socket() sets PySocketModuleAPI.error to PyExc_OSError, and sets PySocketModuleAPI.timeout_error to _socket.timeout (a new exception object)

* pyexpat: static variable

* unicodedata: static variable

* posix: nt._add_dll_directory() creates a PyCapsule using the AddDllDirectory() result as the pointer value
The _datetime module overrides the 


--

See also the PEP 630 "Isolating Extension Modules".
msg376995 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-09-16 14:19
> unicodedata: static variable

Right now, it's ok-ish. Mohamed Koubaa is working on PR 22145 to pass a module state into internal functions, and so the PyCapsule pointer must be different in each module instance.
History
Date                 User           Action  Args
2020-09-17 15:56:22  shihai1991     set     nosy: + shihai1991
2020-09-17 02:09:38  corona10       set     nosy: + corona10
2020-09-16 14:19:48  vstinner       set     messages: + msg376995
2020-09-16 14:09:20  petr.viktorin  set     nosy: + petr.viktorin
2020-09-16 14:07:49  vstinner       create