Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C API] Move PyFrameObject to the internal C API #90992

Closed
vstinner opened this issue Feb 23, 2022 · 25 comments
Closed

[C API] Move PyFrameObject to the internal C API #90992

vstinner opened this issue Feb 23, 2022 · 25 comments
Labels
3.11 only security fixes topic-C-API

Comments

@vstinner
Copy link
Member

BPO 46836
Nosy @gvanrossum, @scoder, @vstinner, @encukou, @markshannon, @corona10, @brandtbucher, @erlend-aasland
PRs
  • bpo-46836: Move PyFrameObject to pycore_frame.h #31530
  • bpo-46836: Rename InterpreterFrame to _PyInterpreterFrame #31583
  • bpo-46836: Add Doc/c-api/frame.rst #32051
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-02-23.15:04:12.108>
    labels = ['expert-C-API', '3.11']
    title = '[C API] Move PyFrameObject to the internal C API'
    updated_at = <Date 2022-03-23.12:19:19.087>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-03-23.12:19:19.087>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['C API']
    creation = <Date 2022-02-23.15:04:12.108>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46836
    keywords = ['patch']
    message_count = 24.0
    messages = ['413795', '413796', '413801', '413802', '413803', '413805', '413807', '413828', '413829', '413853', '413854', '413917', '413921', '413981', '413990', '414009', '414279', '414280', '414283', '414292', '414297', '414310', '414315', '415868']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'scoder', 'vstinner', 'petr.viktorin', 'Mark.Shannon', 'corona10', 'brandtbucher', 'erlendaasland']
    pr_nums = ['31530', '31583', '32051']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46836'
    versions = ['Python 3.11']

    @vstinner
    Copy link
    Member Author

    I propose to move the PyFrameObject structure to the internal C API.

    --

    Between Python 3.10 and Python 3.11, the work on optimizing ceval.c modified deeply the PyFrameObject structure. Examples:

    Most members have been moved to a new PyFrameObject.f_frame member which has the type "struct _interpreter_frame*". Problem: this type is only part of the *internal* C API.

    Moreover, accessing the few remaining members which "didn't change" became dangerous. For example, f_back can be NULL even if the frame has a previous frame: the PyFrame_GetBack() function *must* now be called. See bpo-46356 "[C API] Enforce usage of PyFrame_GetBack()".

    Reading directly f_lineno was already dangerous since Python 2.3: the value is only valid if the value is greater than 0. It's way safer to use the clean PyFrame_GetLineNumber() API instead.

    PyFrame_GetBack() was added to Python 3.9. You can use the pythoncapi_compat project to get this function on Python 3.8 and older:

    => https://pythoncapi-compat.readthedocs.io/

    PyFrame_GetLineNumber() was added to the limited API in Python 3.10.

    => Documentation: https://docs.python.org/dev/c-api/reflection.html#c.PyFrame_GetBack

    --

    There *are* projects accessing directly PyFrameObject like the gevent project which sets the f_code member (moved to f_frame.f_code in Python 3.11). It's broken on Python 3.11:
    https://bugs.python.org/issue40421#msg413719

    Debuggers and profilers also want to read PyFrameObject directly. IMO for these *specific* use cases, using the *internal* C API is a legit use case and it's fine.

    Moving PyFrameObject to the internal C API would clarify the situation. Currently, What's New in Python 3.11 documents the change this with warning:

    "While the documentation notes that the fields of PyFrameObject are subject to change at any time, they have been stable for a long time and were used in several popular extensions. "

    --

    I'm mostly worried about Cython which still get and set many PyFrameObject members directly (ex: f_lasti, f_lineno, f_localsplus, f_trace), since there are no public functions for that.

    => https://bugs.python.org/issue40421#msg367550

    Right now, I would suggest Cython to use the internal C API, and *later* consider adding new getter and setter functions. I don't think that we can solve all problems at once: it takes take to design clean API and use them in Cython.

    Python 3.11 already broke Cython since most PyFrameObject members moved into the new "internal" PyFrameObject.f_frame API which requires using the internal C API to get "struct _interpreter_frame".

    => cython/cython#4500

    --

    Using a frame using the *public* C API was and remains supported. Short example:
    --
    PyThreadState *tstate = PyThreadState_Get();
    PyFrameObject* frame = PyThreadState_GetFrame(tstate);
    int lineno = PyFrame_GetLineNumber(frame);

    The PyFrameObject structure is opaque and members are not accessed directly: it's fine.

    @vstinner vstinner added 3.11 only security fixes topic-C-API labels Feb 23, 2022
    @vstinner
    Copy link
    Member Author

    By the way, Include/cpython/ceval.h uses the "struct _interpreter_frame*" type whereas this type is part of the internal C API:

    PyAPI_FUNC(PyObject *) _PyEval_EvalFrameDefault(PyThreadState *tstate, struct _interpreter_frame *f, int exc);

    Maybe we should move this defintion to the internal C API pycore_ceval.h.

    @vstinner
    Copy link
    Member Author

    See also bpo-40421 "[C API] Add getter functions for PyFrameObject and maybe move PyFrameObject to the internal C API". I added getter functions in recent Python versions:

    • PyFrame_GetBack(): Python 3.9
    • PyFrame_GetCode(): Python 3.9
    • PyFrame_GetLineNumber() added to the limited C API version 3.10

    @vstinner
    Copy link
    Member Author

    See also bpo-44800 "Code readability: rename InterpreterFrame to _Py_framedata".

    @vstinner
    Copy link
    Member Author

    See also bpo-45247: [C API] Add explicit support for Cython to the C API.

    @vstinner
    Copy link
    Member Author

    I marked my PR as a draft since this change is an incompatible change. Even if PyFrameObject structure is excluded from the limited C API and not documented, it's used by a few projects. I plan to check how this change impacts these projects before merging the change.

    For example, test this change on:

    @vstinner
    Copy link
    Member Author

    See also the bpo-39947: "[C API] Make the PyThreadState structure opaque (move it to the internal C API)". Recently, I also added helper functions:

    • PyThreadState_GetFrame(): Python 3.9 and limited C API version 3.10
    • PyThreadState_GetID(): Python 3.9 and limited C API version 3.10
    • PyThreadState_GetInterpreter(): Python 3.9 and limited C API version 3.10
    • PyThreadState_EnterTracing(), PyThreadState_LeaveTracing(): Python 3.11

    See also pending #73307 of bpo-39947: "Add PyThreadState_SetTrace() function".

    @encukou
    Copy link
    Member

    encukou commented Feb 23, 2022

    So, this will break Cython and gevent, but (unlike the optimization work that broke f_code/f_frame) it won't provide any value to users?

    I don't see how that's a good idea.

    @vstinner
    Copy link
    Member Author

    Petr Viktorin:

    So, this will break Cython and gevent,

    This change doesn't break Cython and gevent: they are already broken.

    but (unlike the optimization work that broke f_code/f_frame) it won't provide any value to users?

    The problem is that the C API changed silently: existing code which gets directly PyFrameObject.f_back still compiles successfully, but it will no longer work in some cases.

    See bpo-46356 "[C API] Enforce usage of PyFrame_GetBack()" for more details.

    The intent of moving the structure to the internal C API is to clarify its status: we provide no backward compatibility warranty, you are on our own if you use it.

    It's also a way to promote the usage of the new clean public C API: it is now reliable, whereas accessing directly PyFrameObject members break at each Python version.

    The internal C API cannot be used easily on purpose: you have to opt-in for this API by defining the Py_BUILD_CORE_MODULE macro and you need to use different #include. It's a way to enforce the usage of the clean public C API.

    @scoder
    Copy link
    Contributor

    scoder commented Feb 23, 2022

    I haven't looked fully into this yet, but I *think* that Cython can get rid of most of the direct usages of PyFrameObject by switching to the new InterpreterFrame struct instead. It looks like the important fields have now been moved over to that.

    That won't improve the situation regarding the usage of CPython internals, but it's probably worth keeping in mind before we start adding new API functions that work on frame objects.

    @vstinner
    Copy link
    Member Author

    Stefan Behnel:

    I haven't looked fully into this yet, but I *think* that Cython can get rid of most of the direct usages of PyFrameObject by switching to the new InterpreterFrame struct instead. It looks like the important fields have now been moved over to that.

    InterpreterFrame is part of the internal C API. As I wrote, IMO it's fine for now that Cython uses the internal C API.

    That won't improve the situation regarding the usage of CPython internals, but it's probably worth keeping in mind before we start adding new API functions that work on frame objects.

    Right.

    My hope is also that making the structure internal should help to identify which members should be exposed with getter functions, or even setter functions (Mark would prefer to no add setter functions).

    @encukou
    Copy link
    Member

    encukou commented Feb 24, 2022

    OK, looking at it more carefully, it makes sense to do the change.

    @vstinner
    Copy link
    Member Author

    PyAPI_FUNC(PyObject *) _PyEval_EvalFrameDefault(PyThreadState *tstate, struct _interpreter_frame *f, int exc);

    I created bpo-46850 "[C API] Move _PyEval_EvalFrameDefault() to the internal C API" for this issue.

    @vstinner
    Copy link
    Member Author

    New changeset 18b5dd6 by Victor Stinner in branch 'main':
    bpo-46836: Move PyFrameObject to pycore_frame.h (GH-31530)
    18b5dd6

    @vstinner
    Copy link
    Member Author

    I announced the change on the capi-sig mailing list:
    https://mail.python.org/archives/list/capi-sig@python.org/thread/RCT4SB5LY5UPRRRALEOHWEQHIXFNTHYF/

    @vstinner
    Copy link
    Member Author

    New changeset 87af12b by Victor Stinner in branch 'main':
    bpo-46836: Rename InterpreterFrame to _PyInterpreterFrame (GH-31583)
    87af12b

    @brandtbucher
    Copy link
    Member

    Victor, can we please revert these changes? They broke Greenlet, a required dependency for three of our performance benchmarks:

    python-greenlet/greenlet#288 (comment)

    I've already spent considerable effort contributing workarounds for other 3.11 breakages in Greenlet and coaxing them to put out a compatible release:

    python-greenlet/greenlet#280

    These changes, however, just seem like needless breakage that are now also impacting our performance work.

    @brandtbucher
    Copy link
    Member

    I'm also very uncomfortable with the lack of review on these PRs. The most recent one (#31583) was open for less than 30 minutes before merging, from 6:57 to 7:22 am in my local time zone.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 1, 2022

    I plan to update Cython, greenlet and gevent for this change.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 1, 2022

    Draft PR for greenlet: python-greenlet/greenlet#294

    I made these changes close to the Python 3.11 alpha 6 release to be able to test "#if PY_VERSION_HEX < 0x30B00A6" to have code compatible with Python 3.11 alpha 5 and older.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 1, 2022

    Draft PR for Cython: cython/cython#4671

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 1, 2022

    Draft PR for gevent: gevent/gevent#1872

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 1, 2022

    Draft PR for Cython: cython/cython#4671

    Notes on how Cython access PyFrameObject fields: https://bugs.python.org/issue40421#msg414314

    @vstinner
    Copy link
    Member Author

    New changeset b0f886d by Victor Stinner in branch 'main':
    bpo-46836: Add Doc/c-api/frame.rst (GH-32051)
    b0f886d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner
    Copy link
    Member Author

    vstinner commented May 2, 2022

    I close the issue. I consider that the main issue (Move PyFrameObject to the internal C API) has been implemented.

    I propose to continue the discussion about updating Cython, greenlet and gevent in the bug tracker/on pull requests on these projects.

    Moreover, @markshannon added many getter functions, so the public C API is now enough for most use cases: https://docs.python.org/dev/c-api/frame.html

    I also "backported" these functions to pythoncapi-compat: https://pythoncapi-compat.readthedocs.io/en/latest/api.html#python-3-11

    I know that these changes were painful. IMO we have to go through these changes, and I'm happy that the work was done before Python 3.11 beta 1 (feature freeze). Thanks to everybody who made this enhancement possible! The new C API is now better and should be stable.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes topic-C-API
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants