[subinterpreters] PyObject statics exposed in the limited API break isolation. #87669

Open
ericsnowcurrently opened this issue Mar 15, 2021 · 14 comments
Labels
3.10 (only security fixes) · extension-modules (C modules in the Modules dir) · interpreter-core (Objects, Python, Grammar, and Parser dirs) · topic-C-API · topic-subinterpreters · type-bug (An unexpected behavior, bug, or error)

Comments

@ericsnowcurrently
Member

BPO 43503
Nosy @gvanrossum, @nascheme, @vstinner, @encukou, @markshannon, @ericsnowcurrently, @mattip
PRs
  • bpo-43503: Make limited API objects effectively immutable. #24828
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-03-15.18:53:43.710>
    labels = ['interpreter-core', 'expert-subinterpreters', 'type-bug', '3.10', 'extension-modules', 'expert-C-API']
    title = '[subinterpreters] PyObject statics exposed in the limited API break isolation.'
    updated_at = <Date 2021-03-17.14:46:29.259>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2021-03-17.14:46:29.259>
    actor = 'eric.snow'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules', 'Interpreter Core', 'C API', 'Subinterpreters']
    creation = <Date 2021-03-15.18:53:43.710>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43503
    keywords = ['patch']
    message_count = 14.0
    messages = ['388759', '388764', '388765', '388773', '388782', '388783', '388848', '388859', '388861', '388871', '388920', '388922', '388923', '388924']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'nascheme', 'vstinner', 'petr.viktorin', 'Mark.Shannon', 'eric.snow', 'mattip']
    pr_nums = ['24828']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue43503'
    versions = ['Python 3.10']

    @ericsnowcurrently
    Member Author

    In the limited C-API we expose the following static PyObject variables:

    • 5 singletons
    • ~70 exception types
    • ~70 other types

    Since they are part of the limited API, they have a direct effect on the stable ABI.

    The problem is that these objects should not be shared between interpreters. There are a number of possible solutions for isolating the objects, but the big constraint is that the solution cannot break the stable ABI.
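To make "static PyObject variables" concrete: on CPython these are process-wide C globals exported by the runtime, so every interpreter in the process sees the same addresses. The following is a minimal sketch, assuming a CPython build where `ctypes.pythonapi` can resolve exported data symbols (the usual case, though not guaranteed on every build) and relying on the CPython detail that `id()` returns an object's address. The helper names are illustrative only.

```python
import ctypes


def static_struct_address(symbol):
    """Address of an exported struct such as PyLong_Type (a PyTypeObject)."""
    return ctypes.addressof(ctypes.c_char.in_dll(ctypes.pythonapi, symbol))


def static_pointer_value(symbol):
    """Value of an exported pointer variable such as PyExc_KeyError (a PyObject *)."""
    return ctypes.c_void_p.in_dll(ctypes.pythonapi, symbol).value


# Compare the C-level statics with the Python-level objects they back.
int_is_static = static_struct_address("PyLong_Type") == id(int)
key_error_is_static = static_pointer_value("PyExc_KeyError") == id(KeyError)
print(int_is_static, key_error_is_static)
```

Because an extension built against the stable ABI links against these symbols directly, their addresses and layout cannot change without breaking already-compiled binaries.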

    @ericsnowcurrently added the 3.10, topic-C-API, and type-bug labels on Mar 15, 2021
    @ericsnowcurrently
    Member Author

    Here are some solutions that I've considered:

    1. immutable objects
      a. make the objects truly immutable/const
      b. make the objects effectively immutable
    2. per-interpreter objects
      a. replace them with macros that do a per-interpreter lookup
      b. replace them with simple placeholders and do a per-interpreter lookup internally
      c. replace them with PyObject placeholders and do a per-interpreter lookup internally

    As far as I'm aware, only (1b) and (2c) are realistic and won't break the stable ABI (i.e. preserve layout).

    (FWIW, I think that even with (1b) we would still have per-interpreter objects.)

    -- Regarding (1b) --

    See #69016 for an example implementation. This includes storing some state for the objects in PyInterpreterState and doing a lookup internally.

    pros:

    • relatively straightforward to implement
    • overlaps with other interests (see bpo-40255)
    • makes the objects shareable between interpreters (making them more efficient)

    cons:

    • we have to ensure the objects stay immutable (a tractable problem if the solution is constrained to only the limited API objects)

    -- Regarding (2c) --

    This involves adding an API to get the per-interpreter object for a given identity value (flag, global, etc.) and then mapping the limited API objects to the corresponding per-interpreter ones.

    pros:

    • avoids a one-off solution
    • extensions can stop using the variables directly (in favor of the new lookup API)

    cons:

    • effectively couples the C-API to the design (as long as the objects are in the limited API)
    • many touches all over the C-API
    • any future changes/additions in the C-API must deal with the objects
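The mapping described for (2c) can be pictured with a toy model. This is only a sketch in pure Python under stated assumptions: `Placeholder`, `ToyInterpreter`, and `resolve` are hypothetical names, not part of any CPython API, and the real work would happen in C inside the runtime.

```python
class Placeholder:
    """Toy stand-in for a static limited-API object (hypothetical, not a CPython type)."""

    def __init__(self, ident):
        self.ident = ident


class ToyInterpreter:
    """Toy model of an interpreter that owns its own copy of each object."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def resolve(self, obj):
        """Map a shared placeholder to this interpreter's own object."""
        if not isinstance(obj, Placeholder):
            return obj  # ordinary objects pass through untouched
        if obj.ident not in self._objects:
            # Created lazily on first use; a real design might pre-populate instead.
            self._objects[obj.ident] = type(obj.ident, (Exception,), {})
        return self._objects[obj.ident]


# One process-wide placeholder, analogous to the address that extensions already link against.
TOY_KEY_ERROR = Placeholder("ToyKeyError")

main_interp = ToyInterpreter("main")
sub_interp = ToyInterpreter("sub")

same_within = main_interp.resolve(TOY_KEY_ERROR) is main_interp.resolve(TOY_KEY_ERROR)
same_across = main_interp.resolve(TOY_KEY_ERROR) is sub_interp.resolve(TOY_KEY_ERROR)
print(same_within, same_across)
```

The sketch also hints at the listed cons: every code path that receives an object has to call something like `resolve` before using it, which is why the approach touches so much of the C-API.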

    @ericsnowcurrently
    Member Author

    If the stable ABI weren't an issue then we would probably:

    • deprecate using the objects directly
    • do something like (2a) in the meantime

    It may make sense to do that for "#ifndef Py_LIMITED_API", regardless of how we handle the limited API.

    @ericsnowcurrently added the extension-modules, interpreter-core, and topic-subinterpreters labels on Mar 15, 2021
    @gvanrossum
    Member

    I can never remember what "Py_LIMITED_API" stands for. If it's not defined, does that mean we have the *unlimited* API? Is that a superset or a subset of the limited API?

    Regarding 1a *and* 1b, I think it would help to list the specific reasons exceptions and other types are not entirely immutable. Is it just __subclasses__ or is there other state (apart from the refcount) that's mutable and visible to the end user? (Or even if it's visible to C API users.)
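As a small illustration of the `__subclasses__` point, the sketch below shows one piece of user-visible state on a static built-in type that changes at runtime, even though ordinary attribute assignment on that type is rejected. It is not an exhaustive list of mutable state, and `DemoKeyError` and `demo_attr` are made-up names used only for the demonstration.

```python
before = set(KeyError.__subclasses__())


class DemoKeyError(KeyError):  # hypothetical subclass, for illustration only
    pass


# Defining the subclass changed state recorded on the static KeyError type.
added = set(KeyError.__subclasses__()) - before

# Ordinary attribute assignment on a static built-in type is rejected.
try:
    KeyError.demo_attr = 1
    assignment_rejected = False
except TypeError:
    assignment_rejected = True

print(added, assignment_rejected)
```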

    @vstinner
    Member

    > • 5 singletons

    This issue is discussed in bpo-39511 "[subinterpreters] Per-interpreter singletons (None, True, False, etc.)".

    > Since they are part of the limited API, they have a direct effect on the stable ABI.

    This issue is discussed in bpo-40601: "[C API] Hide static types from the limited C API".

    @vstinner
    Member

    > I can never remember what "Py_LIMITED_API" stands for.

    Include/README file is being written, have a look ;-)
    https://github.com/python/cpython/pull/24884/files

    @ericsnowcurrently
    Member Author

    One simple solution is to explicitly state that the limited API does not support subinterpreters. This is already implied by the fact that the multi-phase init API (PEP-489) requires subinterpreter support but is not part of the limited API.

    If we establish this constraint then the problem I originally described here goes away (and we can close this issue).

    (Note: I'm pretty sure this is something someone suggested to me at some point, rather than my own idea.)

    @ericsnowcurrently
    Member Author

    @mattip
    Contributor

    mattip commented Mar 16, 2021

    I am confused. How can widening the usable number of functions (i.e. using the whole C-API rather than the limited API) help c-extension modules be usable in subinterpreters? Aren't the same singletons, exception types, and other types exposed in the full C-API?

    @encukou
    Member

    encukou commented Mar 16, 2021

    There seems to be much confusion here. Maybe on my side?

    PEP-489 is *very much* part of the limited API.

    @vstinner
    Member

    Eric Snow proposes that C extensions which want to be compatible with subinterpreters must use a hypothetical variant of the C API which doesn't inherit the flaws of the current C API. For example, static types like "&PyLong_Type" would be excluded.

    To be clear, the limited C API does expose (indirectly) "&PyLong_Type". We are talking about a new variant of the C API.

    The main interpreter would continue to use its static type "&PyLong_Type", whereas each subinterpreter would get its own "int" type allocated on the heap (heap type).

    Someone has to write a PoC to ensure that this idea works in practice.

    In bpo-40601, I proposed that all interpreters, including the main interpreter, only use heap types: remove "&PyLong_Type" from the C API, which is a backward-incompatible C API change.

    @ericsnowcurrently
    Member Author

    > I am confused. How can widening the usable number of functions (i.e. using
    > the whole C-API rather than the limited API) help c-extension modules be
    > usable in subinterpreters? Aren't the same singletons, exception types, and
    > other types exposed in the full C-API?

    If Py_LIMITED_API is defined then things would stay the same. Otherwise we would replace the names with macros to do the appropriate lookup. (That isn't the whole story since the Py*_Type names are PyTypeObject and not PyObject*.)

    @ericsnowcurrently
    Member Author

    > PEP-489 is *very much* part of the limited API.

    Gah, I missed that. That said, I don't think it matters; I just lose an easy point in the rationale. :)

    @ericsnowcurrently
    Member Author

    FYI, I'm going to focus discussion on the capi-sig thread.

    @ezio-melotti transferred this issue from another repository on Apr 10, 2022