classification
Title: [C API] Revisit usage of the PyCapsule C API with multi-phase initialization API
Type: Stage: resolved
Components: C API Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, corona10, erlendaasland, petr.viktorin, pkerling, shihai1991, vstinner
Priority: normal Keywords: patch

Created on 2020-09-16 14:07 by vstinner, last changed 2021-01-25 09:46 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24061 merged shihai1991, 2021-01-02 15:42
PR 24096 merged shihai1991, 2021-01-04 15:24
PR 24117 merged erlendaasland, 2021-01-05 14:01
PR 24126 merged erlendaasland, 2021-01-05 21:18
PR 24128 merged erlendaasland, 2021-01-05 22:06
PR 24186 merged shihai1991, 2021-01-10 08:51
Messages (19)
msg376992 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-09-16 14:07
More and more extension modules are converted to the multi-phase initialization API (PEP 489) in bpo-1635741.

The problem is that the usage of the PyCapsule C API is not adapted in these extensions, to isolate well capsule objects from other module instances.

For example, the pyexpat extension uses "static struct PyExpat_CAPI capi;". It was fine when it was not possible to create more than one instance of the extension module. But with PR 22222 (currently under review), it becomes possible to have multiple extension module instances. Each module instance creates its own capsule object, but all capsule object points to the same unique "struct PyExpat_CAPI" instance.

For the specific case of the pyexpat in its current implementation, reusing the same "struct PyExpat_CAPI" instance is ok-ish, since the value is the same for all module instances. But this design sounds fragile.

It would be safer to allocate a "struct PyExpat_CAPI" on the heap memory in each module instance, and use a PyCapsule destructor function (3rd parameter of PyCapsule_New()).

The _ctypes does that:
---
        void *space = PyMem_Calloc(2, sizeof(int));
        if (space == NULL)
            return NULL;
        errobj = PyCapsule_New(space, CTYPES_CAPSULE_NAME_PYMEM, pymem_destructor);
---

with:
---
static void pymem_destructor(PyObject *ptr)
{
    void *p = PyCapsule_GetPointer(ptr, CTYPES_CAPSULE_NAME_PYMEM);
    if (p) {
        PyMem_Free(p);
    }
}
---


The PyCapsule API is used by multiple extension modules:

* _ctypes: allocate memory on the heap and uses a destructor to release it

* _curses: static variable, PyInit__curses() sets PyCurses_API[0] to &PyCursesWindow_Type (static type)

* _datetime: static variable, PyInit__datetime() creates a timezone object and stores it into CAPI.TimeZone_UTC

* _decimal: static variable

* _socket: static variable, PyInit__socket() sets PySocketModuleAPI.error to PyExc_OSError, and sets PySocketModuleAPI.timeout_error to _socket.timeout (a new exception object)

* pyexpat: static varaible

* unicodedata: static variable

* posix: nt._add_dll_directory() creates a PyCapsule using AddDllDirectory() result as a the pointer value
The _datetime module overrides the 


--

See also the PEP 630 "Isolating Extension Modules".
msg376995 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-09-16 14:19
> unicodedata: static variable

Right now, it's ok-ish. Mohamed Koubaa is working on PR 22145 to pass a module state into internal functions, and so the PyCapsule pointer must be different in each module instance.
msg379104 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2020-10-20 08:47
Also note that the capsule generally needs to hold references to the objects in exposes, and not rely on the module object to keep things alive. (Module objects with multi-phase init are not unique nor immortal.)
_ctypes is an exception, since it doesn't have ant PyObject*. But in most others the destructor should also contain some DECREFs.
msg379733 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 04:12
> Also note that the capsule generally needs to hold references to the objects in exposes

Oh right, that's a great advice!
msg384281 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-03 15:48
New changeset 7c83eaa536d2f436ae46211ca48692f576c732f0 by Hai Shi in branch 'master':
bpo-41798: pyexpat: Allocate the expat_CAPI on the heap memory (GH-24061)
https://github.com/python/cpython/commit/7c83eaa536d2f436ae46211ca48692f576c732f0
msg384389 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-05 12:38
Hai Shi, do you mind if I have a go at the socket module CAPI capsule?
msg384392 - (view) Author: hai shi (shihai1991) * (Python triager) Date: 2021-01-05 12:59
> Hai Shi, do you mind if I have a go at the socket module CAPI capsule?

Welcome to join this bpo, you can enhance those extension modules if you like :)
msg384393 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-05 13:03
> Welcome to join this bpo, you can enhance those extension modules if you like :)

Thanks :) I'll take a look at socket first.
msg384394 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-01-05 13:12
PySocketModuleAPI.timeout_error now points to builtin TimeoutError exception.

I'm also in the process of removing PySocketModuleAPI from _ssl.c.
msg384395 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-05 13:17
> I'm also in the process of removing PySocketModuleAPI from _ssl.c.

That's the only user of the API, right? In that case, I'll target _decimal instead. Hai Shi is working on _curses and _datetime, AFAIK.
msg384429 - (view) Author: Erlend Egeberg Aasland (erlendaasland) * Date: 2021-01-05 20:17
> I'm also in the process of removing PySocketModuleAPI from _ssl.c.

BTW, do we still want to keep the socket CAPI for external users?
msg384430 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-01-05 20:21
> BTW, do we still want to keep the socket CAPI for external users?

Yes, at least for a while. socket.h is not part of the public include headers. It's still possible that the CAPI is used by an external project.
msg384494 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-06 11:47
New changeset fe9f446afe46bd716592eda9fa2af8d9f46bbce4 by Erlend Egeberg Aasland in branch 'master':
bpo-41798: Allocate _decimal extension module C API on the heap (GH-24117)
https://github.com/python/cpython/commit/fe9f446afe46bd716592eda9fa2af8d9f46bbce4
msg384537 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-06 19:43
New changeset f22b7ca1afd98a7381a9fe9e53bd6dfa2a5eba16 by Erlend Egeberg Aasland in branch 'master':
bpo-41798: Allocate _socket module C API on the heap (GH-24126)
https://github.com/python/cpython/commit/f22b7ca1afd98a7381a9fe9e53bd6dfa2a5eba16
msg384538 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-06 19:47
New changeset 1ab045933b90d1954af983cc08d1bf0bc46b6240 by Hai Shi in branch 'master':
bpo-41798: Allocate the _datetime.datetime_CAPI on the heap memory (GH-24096)
https://github.com/python/cpython/commit/1ab045933b90d1954af983cc08d1bf0bc46b6240
msg385330 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-20 11:03
New changeset 61d26394f97306ab4890f1522f26ee6d17461e2b by Erlend Egeberg Aasland in branch 'master':
bpo-41798: Allocate unicodedata CAPI on the heap (GH-24128)
https://github.com/python/cpython/commit/61d26394f97306ab4890f1522f26ee6d17461e2b
msg385486 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-22 10:07
New changeset 2f12a1b7ecc9e0cf39b4c6994473e6cb9989f81b by Hai Shi in branch 'master':
bpo-41798: Allocate the _curses._C_API on the heap memory (GH-24186)
https://github.com/python/cpython/commit/2f12a1b7ecc9e0cf39b4c6994473e6cb9989f81b
msg385533 - (view) Author: hai shi (shihai1991) * (Python triager) Date: 2021-01-23 12:49
I have checked all capi instances have been allocated in the heap memory.
So I think this bpo can be closed ;)

Thanks Erlend for your contribution.
Thanks victor, petr for your review&merge.
msg385610 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-25 09:46
hai shi: "I have checked all capi instances have been allocated in the heap memory. So I think this bpo can be closed ;)"

I also checks and I confirm that PyCapsule_New() is no longer used with a pointer pointing to static data. Well, there is one last case, in an unit test on the API itself (in _testcapi). This corner case is acceptable ;-)

Thanks all for the fixes!
History
Date User Action Args
2021-01-25 09:46:36vstinnersetmessages: + msg385610
2021-01-23 12:49:36shihai1991setstatus: open -> closed
resolution: fixed
messages: + msg385533

stage: patch review -> resolved
2021-01-22 10:07:18vstinnersetmessages: + msg385486
2021-01-20 11:03:56vstinnersetmessages: + msg385330
2021-01-10 08:51:29shihai1991setpull_requests: + pull_request23014
2021-01-06 19:47:27vstinnersetmessages: + msg384538
2021-01-06 19:43:14vstinnersetmessages: + msg384537
2021-01-06 11:47:31vstinnersetmessages: + msg384494
2021-01-05 22:06:43erlendaaslandsetpull_requests: + pull_request22958
2021-01-05 21:18:40erlendaaslandsetpull_requests: + pull_request22956
2021-01-05 20:21:14christian.heimessetmessages: + msg384430
2021-01-05 20:17:12erlendaaslandsetmessages: + msg384429
2021-01-05 14:01:07erlendaaslandsetpull_requests: + pull_request22947
2021-01-05 13:17:02erlendaaslandsetmessages: + msg384395
2021-01-05 13:12:21christian.heimessetnosy: + christian.heimes
messages: + msg384394
2021-01-05 13:03:43erlendaaslandsetmessages: + msg384393
2021-01-05 12:59:08shihai1991setmessages: + msg384392
2021-01-05 12:38:50erlendaaslandsetnosy: + erlendaasland
messages: + msg384389
2021-01-04 15:24:37shihai1991setpull_requests: + pull_request22927
2021-01-03 15:48:06vstinnersetmessages: + msg384281
2021-01-02 15:42:45shihai1991setkeywords: + patch
stage: patch review
pull_requests: + pull_request22895
2020-11-09 17:36:19pkerlingsetnosy: + pkerling
2020-10-27 04:12:07vstinnersetmessages: + msg379733
2020-10-20 08:47:58petr.viktorinsetmessages: + msg379104
2020-09-17 15:56:22shihai1991setnosy: + shihai1991
2020-09-17 02:09:38corona10setnosy: + corona10
2020-09-16 14:19:48vstinnersetmessages: + msg376995
2020-09-16 14:09:20petr.viktorinsetnosy: + petr.viktorin
2020-09-16 14:07:49vstinnercreate