This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: C extensions can't swap out live frames anymore
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brandtbucher Nosy List: Mark.Shannon, barry, brandtbucher, gvanrossum, jmadden, pablogsal
Priority: normal Keywords: patch

Created on 2021-12-15 20:55 by brandtbucher, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 30234 merged brandtbucher, 2021-12-23 01:14
Messages (9)
msg408646 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-15 20:55
I'm specifically concerned about Greenlet here (since it's a dependency of pyperformance), but this discussion is equally relevant to any library like it that dynamically swaps out the currently executing frame.

CPython 3.11 makes several major changes to how frame objects are used and represented internally:

- bpo-44032: Move data stack to thread from FrameObject. (https://github.com/python/cpython/pull/26076)
- bpo-44590: Lazily allocate frame objects (https://github.com/python/cpython/pull/27077)
- bpo-45637: Store the frame pointer in the cframe (https://github.com/python/cpython/pull/29267)

These changes break Greenlet's hot-swapping of frame objects (https://github.com/python-greenlet/greenlet/blob/be41e1a24925326b72a02ef5cb6d1ed9643eb062/src/greenlet/greenlet_greenlet.hpp#L768-L811) in a pretty serious way, and it's not immediately clear what the best fix is. A fairly high-level overview of the new design:

When a frame is executing, the current thread state points to the current CFrame, which itself points to a C-level InterpreterFrame. This interpreter frame is located within a "data stack" that is managed by the thread state. If a PyFrameObject for the currently executing frame is requested (using PyThreadState_GetFrame), the new PyFrameObject points into the InterpreterFrame, where all of the important data is still stored. So far so good.

The issue is what happens next. If the InterpreterFrame is replaced, or exits in any way, its memory may be reused, or even freed entirely. The PyFrameObject is smart enough to copy the data elsewhere (into itself) when this happens, but the new design means that there is no obvious way for a third-party library to "reactivate" a frame in a way analogous to assigning to the old tstate->frame member.

While I'm pretty sure that we don't officially support this "feature", it's probably used often enough that we should at least brainstorm a workaround (if not an unstable C-API function) for affected projects. I suspect that a potential solution likely involves creating an entirely new InterpreterFrame the normal way, and copying the PyFrameObject's InterpreterFrame data into it? Not sure what would happen if the old InterpreterFrame was still alive on the data stack, though...

In any case, it probably makes sense to discuss here with affected library maintainers.
msg408802 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-17 17:49
I'm going to take a stab at making this work today. The idea is to do something like this when setting a new frame:

- Clear and pop tstate->cframe->current_frame and all of its linked predecessors.
  - These should all belong to the same CFrame, right?
- Push copies of the new InterpreterFrame and all of its predecessors (bottom to top, of course), and update all of the PyFrameObjects to point to the new copies.
  - I'm just going to assert(f->f_owns_frame) before copying each of these (and set it to 0, after). I don't think it could ever be valid to stick a new copy of an active frame onto the top of the stack?
- Set tstate->cframe->current_frame to the new, top InterpreterFrame.
  - If my understanding of the new CFrame mechanics is correct, I don't think we'd need to touch the C stack at all? Maybe I'm missing something, though... if we do need to change the size of the C stack, this whole thing becomes a *lot* more painful.

Anyone see any obvious flaws here? I'm new to a lot of this code, so any guidance would be appreciated.
msg408803 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-17 17:54
Also, it appears that in earlier versions of CPython, it was okay to set tstate->frame = NULL. Some of Greenlet's tests *seem* to be doing this (although it also possible that my current scaffolding has just broken Greenlet).

Does anybody know off the top of their head what it means to set a NULL frame? I can dig around through the old code and see what the interpreter does when that happens, but it seems pretty nonsensical to me. Is it something that only happens during interpreter/thread initialization/shutdown? Or maybe it keeps that thread from becoming active?
msg408819 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-12-17 21:38
A thread without a frame looks like a natural end case when you consider the frames as a linked list. Could it be that exec()/eval() run in the current frame, and if there isn't a frame, they refuse to run? (Sorry, I don't recall anything more specific without going hunting through old code.)
msg408825 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 00:14
That is weird, but can happen for example if some pthread of some c extensión for example picks up the Gil, causing a new PyThreadState to be created and linked but has not called into Python yet.

Basically, an external native thread that calls 
PyGILState_Ensure leaves a thread state with these characteristics.
msg408826 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 00:16
Relevant section:

 https://docs.python.org/3/c-api/init.html#non-python-created-threads
msg409005 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-22 00:34
Fixing this actually ended up being a lot simpler than I anticipated (and it requires no changes on our end). All that's really required is saving and restoring:
- tstate->cframe->current_frame
- tstate->datastack_chunk
- tstate->datastack_top
- tstate->datastack_limit

You can see my proposed fix for Greenlet here (this PR also contains some other 3.11-related fixes):

https://github.com/python-greenlet/greenlet/pull/280

There is one weird edge case in Greenlet's use that I don't *entirely* understand, where it sometimes needs to update a thread with a NULL frame (as discussed above). We need to replace the thread's datastack with *something* in this case: setting it to NULL crashes the interpreter when it tries to allocate frames later, but leaving it as-is puts us in a situation where two threads are sharing the same datastack (also a recipe for crashes). My solution is to set it to a statically allocated zero-size "dummy chunk" in this case. It seems to work fine, since the zero-size forces the interpreter to allocate a "real" chunk on the next frame push.

If this solution seems too hack-y, a very simple change on our end could be to start treating NULL as a valid datastack value, and have the thread state check for it when pushing new chunks. That would make for a relatively clean upgrade path for libraries like Greenlet, and doesn't require adding any new APIs.
msg409058 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-23 01:06
After further discussion with the team this morning, we decided that the proposed "dummy chunk" workaround is unacceptable (the interpreter may end up trying free it when shutting down a thread).

Allowing the tstate->datastack_* members to be NULL is very easy on our end, and even simplifies thread creation a tad (since we can defer the allocation of the datastack until the first frame is actually needed).

I'll have a PR up in a few minutes.
msg409262 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-12-28 17:49
New changeset 77195cd44b2506cda88a3cfc98918526068b1d46 by Brandt Bucher in branch 'main':
bpo-46090: Allow PyThreadState.datastack_* members to be NULL (GH-30234)
https://github.com/python/cpython/commit/77195cd44b2506cda88a3cfc98918526068b1d46
History
Date User Action Args
2022-04-11 14:59:53adminsetgithub: 90248
2021-12-28 17:50:32brandtbuchersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-12-28 17:49:52brandtbuchersetmessages: + msg409262
2021-12-24 00:58:19barrysetnosy: + barry
2021-12-23 01:14:14brandtbuchersetkeywords: + patch
stage: patch review
pull_requests: + pull_request28455
2021-12-23 01:06:51brandtbuchersetmessages: + msg409058
2021-12-22 00:34:22brandtbuchersetmessages: + msg409005
2021-12-18 00:16:43pablogsalsetmessages: + msg408826
2021-12-18 00:14:41pablogsalsetmessages: + msg408825
2021-12-17 21:38:16gvanrossumsetmessages: + msg408819
2021-12-17 17:54:48brandtbuchersetmessages: + msg408803
2021-12-17 17:49:47brandtbuchersetmessages: + msg408802
2021-12-16 15:43:31jmaddensetnosy: + jmadden
2021-12-15 20:55:31brandtbuchercreate