Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code readability: rename InterpreterFrame to _Py_framedata #88963

Closed
ncoghlan opened this issue Aug 1, 2021 · 16 comments
Closed

Code readability: rename InterpreterFrame to _Py_framedata #88963

ncoghlan opened this issue Aug 1, 2021 · 16 comments
Assignees
Labels
3.11 only security fixes type-feature A feature request or enhancement

Comments

@ncoghlan
Copy link
Contributor

ncoghlan commented Aug 1, 2021

BPO 44800
Nosy @ncoghlan, @vstinner, @markshannon, @pablogsal, @brandtbucher
PRs
  • WIP bpo-44800: Rename _PyInterpreterFrame to _Py_framedata #27525
  • [PEP 558 - WIP] bpo-30744: Trace hooks no longer reset closure state #3640
  • bpo-44800: rename _PyInterpreterFrame to _Py_frame #31987
  • bpo-44800: unobtrusively rename _PyInterpreterFrame to _Py_frame #32024
  • bpo-44800: Document internal frame naming conventions #32281
  • Revert "bpo-44800: Document internal frame naming conventions" #32301
  • Dependencies
  • bpo-44590: Create frame objects lazily when needed
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = None
    created_at = <Date 2021-08-01.01:48:31.010>
    labels = ['type-feature', '3.11']
    title = 'Code readability: rename InterpreterFrame to `_Py_framedata`'
    updated_at = <Date 2022-04-04.14:09:53.084>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2022-04-04.14:09:53.084>
    actor = 'Mark.Shannon'
    assignee = 'ncoghlan'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2021-08-01.01:48:31.010>
    creator = 'ncoghlan'
    dependencies = ['44590']
    files = []
    hgrepos = []
    issue_num = 44800
    keywords = ['patch']
    message_count = 10.0
    messages = ['398675', '398681', '398690', '399381', '399581', '399582', '414446', '416610', '416613', '416671']
    nosy_count = 5.0
    nosy_names = ['ncoghlan', 'vstinner', 'Mark.Shannon', 'pablogsal', 'brandtbucher']
    pr_nums = ['27525', '3640', '31987', '32024', '32281', '32301']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue44800'
    versions = ['Python 3.11']

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2021

    When merging the bpo-44590 changes into my PEP-558 implementation branch, I found it very hard to follow when the code was referring to the new interpreter frames rather than the existing Python frame objects that were historically used for both execution and introspection.

    The "interpreter frame" name was also a little confusing, since the introspection frames are still associated with a specific interpreter, they're just not required for code execution anymore, only for code introspection APIs that call for a full Python object.

    So, inspired by the "gi_xframe" (etc) attributes added in #27077, I'm proposing the following internal refactoring:

    • Rename "pycore_frame.h" to "pycore_xframe.h"
    • Rename the _interpreter_frame struct to _execution_frame
    • Rename the type from InterpreterFrame to ExecFrame
    • Use "xf_" rather than "f_" as the struct field prefix on execution frames
    • Use "xframe" and "xf" rather than "frame" and "f" for execution frame variables
    • Consistently use _PyExecFrame as the access function prefix, rather than a confusing mixture of _PyFrame and _PyInterpreterFrame
    • Rename _PyThreadState_PushFrame to _PyThreadState_PushExecFrame
    • Rename _PyThreadState_PopFrame to _PyThreadState_PopExecFrame

    @ncoghlan ncoghlan added 3.11 only security fixes type-feature A feature request or enhancement labels Aug 1, 2021
    @ncoghlan ncoghlan self-assigned this Aug 1, 2021
    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2021

    As a side effect of working on this, I noticed some changes that are needed to adapt the GDB integration hooks to the new frame state layout.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2021

    PR for this proposed refactoring is now up, with a review requested from Mark: #27525

    The PR mostly follows what I originally posted, except that I went with _Py_execution_frame and _PyExecFrame for the struct and typedef names, since these are present in a public header.

    The outdated GDB hooks I found were in the gdbinit example file, rather than the actual libpython.py file that test_gdb covers. I suspect the gdbinit example will need further adjustments to actually work with Python 3.11, so rather than fully updating the implementation dependent pieces, I just added a comment suggesting it be checked after the interpreter optimisation development for 3.11 settles down.

    @ncoghlan
    Copy link
    Contributor Author

    Mark raised some valid concerns with the proposed naming convention over on the PR:

    • the proposed names make it sound like there are genuinely two kinds of frame, when the actual relationship is between a frame's data storage and a Python object providing an interface to that storage
    • _PyExecFrame looks like an actual Python type name, so we probably want something more like "Py_buffer" (where the lowercase name is intended to indicate that the struct isn't a full Python type)

    Mark offered "activation record" as a possible name, but I'm going to see how "_Py_framedata" looks first (with "fdata" as the abbreviated form, since "fd" is so frequently used to mean "file descriptor")

    I'm also going to see how the PR looks if both the frame and frame data struct keep their existing field prefixes - it may be that changing variable names will be enough to avoid ambiguity, in which case leaving the field names alone genuinely reduces code churn.

    @ncoghlan
    Copy link
    Contributor Author

    PR has been updated with a new API proposal prompted by Mark's review comments on the original proposal.

    • Rename "pycore_frame.h" to "pycore_framedata.h"
    • Rename the _interpreter_frame struct to _Py_execution_frame
    • Rename the type from InterpreterFrame to _Py_framedata
    • Rather than 6 fields with no prefix, and 6 fields with the "f_" prefix, _Py_framedata fields now consistently have no prefix (globals, builtins, locals, code, lasti, and stack are affected by this)
    • Use "fdata" rather than a mixture of "frame" and "f" for frame data variables
    • Generators and coroutines use gi_fdata (etc) rather than gi_xframe as their field name
    • Frame objects use f_fdata rather than f_frame as their field name
    • Thread states use fdata rather than frame as their field name
    • Consistently use _Py_frameobject as the access function prefix, rather than a confusing mixture of _PyFrame and _PyInterpreterFrame
    • Leave the name of _PyThreadState_PushFrame alone
    • Leave the name of _PyThreadState_PopFrame alone

    @ncoghlan ncoghlan changed the title Code readability: rename interpreter frames to execution frames Code readability: rename InterpreterFrame to _Py_framedata Aug 14, 2021
    @ncoghlan
    Copy link
    Contributor Author

    From a naming convention perspective, the code comments and NEWS entry in the PR now refer to "full frame objects" (PyFrameObject) and "frame data storage structs" (_Py_framedata) to avoid giving the misleading impression that introspection and execution frames are different Python types.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 3, 2022

    Oh. I didn't know this issue. I recently made changes around PyFrameObject:

    • Move PyFrameObject to the internal C API (see bpo-46836 for the rationale)
    • Rename CFrame to _PyCFrame
    • Rename InterpreterFrame to _PyInterpreterFrame

    I prepared PRs for Cython, greenlet and gevent to use the internal C API pycore_frame.h to get the PyFrameObject structure:

    https://bugs.python.org/issue46836#msg414283

    For the short term, these projects should use the internal C API. But I sent a call to add getter and setter functions:

    https://mail.python.org/archives/list/capi-sig@python.org/thread/RCT4SB5LY5UPRRRALEOHWEQHIXFNTHYF/

    If possible, it would be great to have a public C API so these projects don't use the internal C API at all, that's being discussed at:

    In terms of backward compatibility, since PyFrameObject is now part of the internal C API, we can break things. In practice... it's better to not break 3rd party code too often. See for example Brandt Bucher who is directly impacted by these changes:

    https://bugs.python.org/issue46836#msg414279

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Apr 3, 2022

    Core dev forum thread: https://discuss.python.org/t/proposal-rename-pyinterpreterframe-struct-as-py-framedata/14213

    The conclusion from the forum thread and associated PRs was that any of the further renaming proposed didn't provide sufficient additional clarity to justify further code churn in a semi public API.

    As a result, the final PR just documents the status quo in the internal C API frame header (since the conventions arising from the code-churn-minimising migration to a split data structure is hard to infer just from reading the code): #32281

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Apr 3, 2022

    New changeset 124227c by Nick Coghlan in branch 'main':
    bpo-44800: Document internal frame naming conventions (GH-32281)
    124227c

    @markshannon
    Copy link
    Member

    New changeset 8a349eb by Mark Shannon in branch 'main':
    Revert "bpo-44800: Document internal frame naming conventions (GH-32281)" (bpo-32301)
    8a349eb

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @gpshead
    Copy link
    Member

    gpshead commented Apr 16, 2022

    @markshannon Why did you revert the documentation? Please provide rationale in the commit message and loop in the authors of the commit you are reverting when you propose a PR to revert something.

    @ncoghlan
    Copy link
    Contributor Author

    With #32304 merged, some of the text I had added genuinely became redundant.

    However, the key part of the commit that resolved this issue has still been lost, which was helping readers to navigate the confusing mix of naming conventions arising from the combination of:

    • minimising code churn when splitting the struct definitions
    • the difficulty of making purely cosmetic changes to this nominally internal API due to its use in 3rd party projects that are tightly coupled to the eval loop implementation

    That confusion in naming conventions is excusable given the code history, but it can't readily be inferred from the code itself.

    However, the documentation of it can be much tighter now that the separate frame stack documentation exists, since it only needs to cover the naming conventions, it doesn't need to try to explain the different object types.

    @ncoghlan ncoghlan reopened this Apr 17, 2022
    @markshannon
    Copy link
    Member

    Apparently I closed this 13 hours ago. Didn't mean to.
    I'll blame the new workflow, but it was probably just a fat finger.

    @gpshead
    Explanation and notification of Nick is here: #32281 (comment)

    @ncoghlan Could you make a new PR adding any additional documentation you want? Thanks.

    As an aside, if 3rd party code is accessing fields in either struct, then we need to further flesh out the C-API function.
    IMO, the names of fields should be largely irrelevant, as any code outside of the interpreter code should be using the API functions, not probing the internals.

    @ncoghlan
    Copy link
    Contributor Author

    My apologies, I managed to miss the GitHub notification for the comment on the previous PR, so I incorrectly thought the commit had been reverted without notice when I saw the second commit notification from bugs.python.org.

    I didn't expect the new block comment to be controversial, as I first requested feedback on that more than 6 months ago in #27525 (comment) and until I saw the reversion commit message I had no idea that you considered the new text to be in any way misleading. All of your comments on the PRs and the Discourse discussion had focused on whether the naming changes meaningfully improved code readability, so I took that as meaning you thought the block comment itself was essentially OK.

    Since it was only a comment, I also figured it would be easy enough to fix up any issues with it in a follow-up PR (e.g. duplicating the field names was a definite maintenance hassle, but also easy enough to fix without a wholesale commit reversion).

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Apr 19, 2022

    The specific items I plan to cover in a follow-up PR are aimed at CPython API and code maintainers, rather than consumers of the API headers:

    • keeping interpreter frame pointers out of public PyFrame_* APIs (and any other regular public C API functions). This may need an exception for APIs aimed specifically at folks using the PEP 523 hooks to implement alternate code evaluation loops (since those will have all the same performance concerns with full frame object allocation as the default eval loop).
    • noting that private _PyFrame_ APIs (and other internal C APIs) may relate to either interpreter frames or frame objects depending on whether any given API needs to be usable without incurring the cost of allocating a full frame object
    • noting the currently in use local variable and parameter naming conventions for the two structs
    • pointing out that "cframe" should be avoided as an abbreviation for "current frame" in new code, since "C frame" structs now exist in their own right

    The new markdown file will be helpful in that regard, as that info really is mostly clutter in the header file - its utility is in consolidating that info in one place for future maintainers during their initial exploration of the frame management code, rather than something it would be necessary to refer back to regularly.

    Even if we do manage to get public (or semi-public) APIs in place to limit potential ripple effects on 3rd party projects for cleaning up the internal APIs, the internal cost of actually changing any of this code is still relatively high in terms of introducing merge conflicts into open development PRs and bug fix backports to the 3.11 maintenance branch. Even the "unobtrusive" variant of the PRs that renamed the structs still had to touch 17 files. The first PR that also renamed the struct fields touched 24 files at the point where I cancelled it, but earlier iterations that tried to change internal API function names to reflect which struct they worked with touched even more files.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented May 1, 2022

    I belatedly realised that the scope here has been so drastically reduced (from renaming structs to simply documenting naming and API conventions), that it doesn't make sense to keep using the same ticket number.

    Instead, I'll file a new ticket and PR with the conventions as I understand them after the unstable API tier PEP has been resolved.

    @ncoghlan ncoghlan closed this as completed May 1, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants