classification
Title: [subinterpreters] PyObject statics exposed in the limited API break isolation.
Type: behavior Stage: patch review
Components: C API, Extension Modules, Interpreter Core, Subinterpreters Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, eric.snow, gvanrossum, mattip, nascheme, petr.viktorin, vstinner
Priority: normal Keywords: patch

Created on 2021-03-15 18:53 by eric.snow, last changed 2021-03-17 14:46 by eric.snow.

Pull Requests
URL Status Linked Edit
PR 24828 closed eric.snow, 2021-03-15 19:31
Messages (14)
msg388759 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-15 18:53
In the limited C-API we expose the following static PyObject variables:

* 5 singletons
* ~70 exception types
* ~70 other types

Since they are part of the limited API, they have a direct effect on the stable ABI.

The problem is that these objects should not be shared between interpreters.  There are a number of possible solutions for isolating the objects, but the big constraint is that the solution cannot break the stable ABI.
msg388764 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-15 19:24
Here are some solutions that I've considered:

1. immutable objects
   a. make the objects truly immutable/const
      * not trivial, if possible
   b. make the objects effectively immutable
      * (see GH-24828) use a really high refcount to make races irrelevant
2. per-interpreter objects
   a. replace them with macros that do a per-interpreter lookup
   b. replace them with simple placeholders and do a per-interpreter lookup internally
   c. replace them with PyObject placeholders and do a per-interpreter lookup internally

As far as I'm aware, only (1b) and (2c) are realistic and won't break the stable ABI (i.e. preserve layout).

(FWIW, I think that even with (1b) we would still have per-interpreter objects.)

-- Regarding (1b) --

See GH-24828 for an example implementation.  This includes storing some state for the objects in PyInterpreterState and doing a lookup internally.

pros:
* relatively straightforward to implement
* overlaps with other interests (see bpo-40255)
* makes the objects shareable between interpreters (making them more efficient)

cons:
* we have to ensure the objects stay immutable (a tractable problem if the solution is constrained to only the limited API objects)
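
A minimal sketch of the high-refcount idea in (1b), using a hypothetical `MockObject` instead of CPython's real `PyObject`. Every name here (`MockObject`, `MOCK_IMMORTAL_REFCNT`, `mock_incref`, `mock_decref`, `mock_is_effectively_immortal`) is invented for illustration and is not taken from GH-24828. The only point is that a count that starts very far from zero cannot plausibly be driven to zero by lost or extra updates, so the object is never deallocated:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PyObject; not CPython's real layout. */
typedef struct {
    intptr_t refcnt;
    const char *name;
} MockObject;

/* Assumed starting value: large enough that ordinary incref/decref
   traffic never gets near zero or near overflow. */
#define MOCK_IMMORTAL_REFCNT (INTPTR_MAX / 2)

static MockObject mock_none = { MOCK_IMMORTAL_REFCNT, "None" };

static void mock_incref(MockObject *op)
{
    op->refcnt++;
}

/* Returns 1 if the object would now be deallocated, 0 otherwise. */
static int mock_decref(MockObject *op)
{
    assert(op->refcnt > 0);
    return --op->refcnt == 0;
}

/* Heuristic check: the count is still far above zero. */
static int mock_is_effectively_immortal(const MockObject *op)
{
    return op->refcnt > MOCK_IMMORTAL_REFCNT / 2;
}
```

Even a badly unbalanced caller (many more decrefs than increfs) leaves the count far above zero. The sketch does not address the other mutable state of such objects, which is the "stay immutable" con listed above.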

-- Regarding (2c) --

This involves adding an API to get the per-interpreter object for a given identity value (flag, global, etc.) and then mapping the limited API objects to the corresponding per-interpreter ones.

pros:
* avoids a one-off solution
* extensions can stop using the variables directly (in favor of the new lookup API)

cons:
* effectively couples the C-API to the design (as long as the objects are in the limited API)
* many touches all over the C-API
* any future changes/additions in the C-API must deal with the objects
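
A rough sketch of what (2c) could look like, under heavy assumptions: the exported statics stay in place as placeholders (so their address and layout are unchanged), and the runtime maps each placeholder to a per-interpreter copy. `MockObject`, `MockInterp`, `mock_placeholders`, and `mock_resolve` are hypothetical names, not an existing or proposed CPython API:

```c
#include <stddef.h>

/* Hypothetical stand-in for PyObject. */
typedef struct {
    long refcnt;
    const char *name;
} MockObject;

enum { MOCK_ID_NONE, MOCK_ID_TRUE, MOCK_N_STATICS };

/* The placeholders that extensions keep linking against. */
static MockObject mock_placeholders[MOCK_N_STATICS] = {
    { 1, "None" },
    { 1, "True" },
};

/* Hypothetical per-interpreter state holding the real objects. */
typedef struct {
    MockObject objects[MOCK_N_STATICS];
} MockInterp;

static void mock_interp_init(MockInterp *interp)
{
    for (size_t i = 0; i < MOCK_N_STATICS; i++) {
        interp->objects[i] = mock_placeholders[i];  /* private copy */
    }
}

/* Map a placeholder to this interpreter's object; any other
   pointer is returned unchanged. */
static MockObject *mock_resolve(MockInterp *interp, MockObject *op)
{
    for (size_t i = 0; i < MOCK_N_STATICS; i++) {
        if (op == &mock_placeholders[i]) {
            return &interp->objects[i];
        }
    }
    return op;
}
```

The "many touches" con shows up here as the need to call something like `mock_resolve()` at every C-API entry point that may receive one of the placeholders.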
msg388765 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-15 19:28
If the stable ABI weren't an issue then we would probably:

* deprecate using the objects directly
* do something like (2a) in the meantime

It may make sense to do that for "#ifndef Py_LIMITED_API", regardless of how we handle the limited API.
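
For illustration, a toy version of that split. In CPython 3.10, `Py_None` is already a macro that expands to `(&_Py_NoneStruct)`; the sketch below mimics that with hypothetical names (`MOCK_NONE`, `MOCK_LIMITED_API`, `mock_current_interp`, `mock_get_none`) and swaps the expansion for a per-interpreter lookup when the limited API is not requested:

```c
#include <stddef.h>

/* Hypothetical stand-ins; not CPython's real types. */
typedef struct {
    long refcnt;
    const char *name;
} MockObject;

typedef struct {
    MockObject none;   /* this interpreter's own None */
} MockInterp;

/* Stand-in for "the current interpreter" of the running thread. */
static MockInterp *mock_current_interp = NULL;

/* The exported static that stable-ABI extensions are linked against. */
MockObject mock_NoneStruct = { 1, "None" };

static MockObject *mock_get_none(void)
{
    return &mock_current_interp->none;
}

#ifndef MOCK_LIMITED_API
   /* Full API: the name resolves per interpreter. */
#  define MOCK_NONE (mock_get_none())
#else
   /* Limited API: unchanged, still the shared static. */
#  define MOCK_NONE (&mock_NoneStruct)
#endif
```

Code compiled without `MOCK_LIMITED_API` picks up the per-interpreter object just by being recompiled, while already-built stable-ABI extensions keep referring to `mock_NoneStruct`.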
msg388773 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-03-15 19:56
I can never remember what "Py_LIMITED_API" stands for. If it's not defined, does that mean we have the *unlimited* API? Is that a superset or a subset of the limited API?

Regarding 1a *and* 1b, I think it would help to list the specific reasons exceptions and other types are not entirely immutable. Is it just __subclasses__ or is there other state (apart from the refcount) that's mutable and visible to the end user? (Or even if it's visible to C API users.)
msg388782 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-15 22:43
> * 5 singletons

This issue is discussed in bpo-39511 "[subinterpreters] Per-interpreter singletons (None, True, False, etc.)".

> Since they are part of the limited API, they have a direct effect on the stable ABI.

This issue is discussed in bpo-40601: "[C API] Hide static types from the limited C API".
msg388783 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-15 22:44
> I can never remember what "Py_LIMITED_API" stands for.

Include/README file is being written, have a look ;-)
https://github.com/python/cpython/pull/24884/files
msg388848 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-16 15:38
One simple solution is to explicitly state that the limited API does not support subinterpreters.  This is already implied by the fact that the multi-phase init API (PEP 489) requires subinterpreter support but is not part of the limited API.

If we establish this constraint then the problem I originally described here goes away (and we can close this issue).

(Note: I'm pretty sure this is something someone suggested to me at some point, rather than my own idea.)
msg388859 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-16 16:42
FYI, I posted to capi-sig about this:

https://mail.python.org/archives/list/capi-sig@python.org/thread/INLCGPMTYFLRTWQL7RB4MUQZ37JAFRAU/
msg388861 - (view) Author: mattip (mattip) * Date: 2021-03-16 17:12
I am confused. How can widening the usable number of functions (i.e. using the whole C-API rather than the limited API) help c-extension modules be usable in subinterpreters? Aren't the same singletons, exception types, and other types exposed in the full C-API?
msg388871 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-03-16 19:28
There seems to be much confusion here. Maybe on my side?

PEP 489 is *very much* part of the limited API.
msg388920 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-17 14:14
Eric Snow proposes that C extensions which want to be compatible with subinterpreters must use a hypothetical variant of the C API which doesn't inherit flaws of the current C API. For example, static types like "&PyLong_Type" would be excluded.

To be clear, the limited C API does expose (indirectly) "&PyLong_Type". We are talking about a new variant of the C API.

The main interpreter would continue to use its static type "&PyLong_Type", whereas each subinterpreter would get its own "int" type allocated on the heap (heap type).
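
A toy model of that arrangement (not a PoC, and not using the real C API): the main interpreter points at the static type, and every other interpreter gets its own heap-allocated copy. `MockTypeObject`, `MockInterp`, `mock_interp_init`, and `mock_interp_fini` are hypothetical names introduced only for this sketch:

```c
#include <stdlib.h>

/* Hypothetical stand-in for PyTypeObject. */
typedef struct {
    const char *tp_name;
    int is_heap;          /* 1 if allocated with malloc() */
} MockTypeObject;

typedef struct {
    int is_main;
    MockTypeObject *int_type;
} MockInterp;

/* Plays the role of the static "PyLong_Type". */
static MockTypeObject mock_static_int_type = { "int", 0 };

/* Returns 0 on success, -1 on allocation failure. */
static int mock_interp_init(MockInterp *interp, int is_main)
{
    interp->is_main = is_main;
    if (is_main) {
        interp->int_type = &mock_static_int_type;
        return 0;
    }
    MockTypeObject *tp = malloc(sizeof(*tp));
    if (tp == NULL) {
        interp->int_type = NULL;
        return -1;
    }
    *tp = mock_static_int_type;   /* start from the static definition */
    tp->is_heap = 1;
    interp->int_type = tp;
    return 0;
}

static void mock_interp_fini(MockInterp *interp)
{
    if (interp->int_type != NULL && interp->int_type->is_heap) {
        free(interp->int_type);
    }
    interp->int_type = NULL;
}
```

In this model, type identity differs between interpreters, so any code that compares against the address of the static type would only match in the main interpreter.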

Someone has to write a PoC to ensure that this idea works in practice.

In bpo-40601, I proposed that all interpreters, including the main interpreter, only use heap types: remove "&PyLong_Type" from the C API, which is a backward incompatible C API change.
msg388922 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-17 14:40
> I am confused. How can widening the usable number of functions (i.e. using
> the whole C-API rather than the limited API) help c-extension modules be
> usable in subinterpreters? Aren't the same singletons, exception types, and
> other types exposed in the full C-API?

If Py_LIMITED_API is defined then things would stay the same.  Otherwise we would replace the names with macros to do the appropriate lookup.  (That isn't the whole story since the Py*_Type names are PyTypeObject and not PyObject*.)
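
One way the parenthetical could be handled, sketched with hypothetical names (`MockTypeObject`, `MockLong_Type`, `mock_lookup_long_type`): because callers write `&PyLong_Type`, the macro has to expand to an lvalue of the struct type rather than to a pointer. Dereferencing the result of a lookup function gives such an lvalue:

```c
#include <stddef.h>

/* Hypothetical stand-in for PyTypeObject. */
typedef struct {
    const char *tp_name;
} MockTypeObject;

typedef struct {
    MockTypeObject long_type;   /* this interpreter's own "int" type */
} MockInterp;

/* Stand-in for "the current interpreter" of the running thread. */
static MockInterp *mock_current_interp = NULL;

static MockTypeObject *mock_lookup_long_type(void)
{
    return &mock_current_interp->long_type;
}

/* Expands to an lvalue, so both "&MockLong_Type" and
   "MockLong_Type.tp_name" keep compiling. */
#define MockLong_Type (*mock_lookup_long_type())
```

A known limitation of this shape: the expansion contains a function call, so `&MockLong_Type` is no longer a constant expression and cannot appear in a file-scope static initializer.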
msg388923 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-17 14:45
> PEP 489 is *very much* part of the limited API.

Gah, I missed that.  That said, I don't think it matters; I just lose an easy point in the rationale. :)
msg388924 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-03-17 14:46
FYI, I'm going to focus discussion on the capi-sig thread.
History
Date User Action Args
2021-03-17 14:46:29  eric.snow  set  messages: + msg388924
2021-03-17 14:45:59  eric.snow  set  messages: + msg388923
2021-03-17 14:40:20  eric.snow  set  messages: + msg388922
2021-03-17 14:14:59  vstinner  set  messages: + msg388920
2021-03-16 19:28:28  petr.viktorin  set  nosy: + petr.viktorin; messages: + msg388871
2021-03-16 17:12:48  mattip  set  nosy: + mattip; messages: + msg388861
2021-03-16 16:42:55  eric.snow  set  messages: + msg388859
2021-03-16 15:38:45  eric.snow  set  messages: + msg388848
2021-03-15 22:44:27  vstinner  set  messages: + msg388783
2021-03-15 22:43:31  vstinner  set  messages: + msg388782
2021-03-15 19:56:57  gvanrossum  set  nosy: + gvanrossum; messages: + msg388773
2021-03-15 19:34:18  eric.snow  set  components: + Extension Modules, Interpreter Core, Subinterpreters
2021-03-15 19:31:52  eric.snow  set  keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request23642
2021-03-15 19:31:14  eric.snow  set  nosy: + nascheme, vstinner, Mark.Shannon
2021-03-15 19:28:35  eric.snow  set  messages: + msg388765
2021-03-15 19:24:57  eric.snow  set  messages: + msg388764
2021-03-15 18:53:43  eric.snow  create