classification
Title: Change layout of frames back to specials-locals-stack (from locals-specials-stack)
Type:
Stage: resolved
Components:
Versions:

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: Mark.Shannon
Nosy List: Mark.Shannon, pablogsal
Priority: normal
Keywords: patch

Created on 2021-08-24 10:18 by Mark.Shannon, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL: PR 27933
Status: merged
Linked: Mark.Shannon, 2021-08-24 17:11
Messages (2)
msg400202 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-08-24 10:18
The two plausible layouts for evaluation stack frames are described here:

https://github.com/faster-cpython/ideas/issues/31#issuecomment-844263795

We opted for layout A, although it is a bit more complex to manage and slightly more expensive in terms of pointers, because it theoretically allows zero-copy Python-to-Python calls.

I now believe this was the wrong decision and we should have chosen layout B.

B is cheaper. It needs 2 pointers, not 3, meaning that there is another register available for use in the interpreter.
Also the linkage area doesn't need the nlocalsplus field.
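
To make the pointer count concrete, here is a minimal Python sketch that models a frame as a range of indices in a flat array. The names, the value `SPECIALS_SIZE = 4`, and the helper functions are illustrative assumptions, not CPython's actual structures. It assumes that layout A puts the variable-size locals below the specials, while layout B puts the locals right after a fixed-size specials block.

```python
# Illustrative model only: SPECIALS_SIZE and these helpers are assumptions,
# not CPython's real frame definitions.
SPECIALS_SIZE = 4  # hypothetical size of the linkage ("specials") area


def layout_a(base, nlocalsplus):
    """locals-specials-stack: the specials follow a variable-size block."""
    return {
        "locals": base,
        "specials": base + nlocalsplus,
        "stack": base + nlocalsplus + SPECIALS_SIZE,
    }


def layout_b(base, nlocalsplus):
    """specials-locals-stack: locals start at a fixed offset from the frame."""
    return {
        "specials": base,
        "locals": base + SPECIALS_SIZE,
        "stack": base + SPECIALS_SIZE + nlocalsplus,
    }


def locals_from_frame_a(frame, nlocalsplus):
    # Layout A: finding locals from the frame pointer needs nlocalsplus,
    # so either it is stored in the frame or locals gets its own pointer.
    return frame - nlocalsplus


def locals_from_frame_b(frame):
    # Layout B: a constant offset from the frame pointer is enough.
    return frame + SPECIALS_SIZE


a = layout_a(base=100, nlocalsplus=3)
b = layout_b(base=100, nlocalsplus=3)
assert locals_from_frame_a(a["specials"], 3) == a["locals"]
assert locals_from_frame_b(b["specials"]) == b["locals"]
```

In this model, layout B recovers the locals from the frame pointer alone, which is one way to read the claim that it needs two pointers (frame and stack) rather than three and no stored nlocalsplus.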

The benefit of zero-copy calls is much smaller than I thought:
* Any calls from generator functions do not benefit
* An additional check is needed to make sure that both frames are in the same stack chunk
* Any jitted code will keep stack values in registers, so stores will still be needed in either case.
* The average number of arguments copied is low (typically 2 or 3).

Even in the ideal case (interpreter, no generator, same stack chunk), changing to layout B
will cost 2 or 3 memory moves (independent of each other), but will save the extra code for checking chunks and one move (storing nlocalsplus). So at best, layout A saves only 1 or 2 moves.
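
The arithmetic above can be written out as a small sketch. The cost model is an assumption made for illustration: one move per argument copied under layout B, one move for storing nlocalsplus under layout A, and the chunk check left out because it is a branch rather than a move.

```python
def moves_layout_b(nargs):
    # Layout B copies each argument into the callee's locals.
    return nargs


def moves_layout_a():
    # Layout A copies no arguments in the ideal case, but still
    # stores nlocalsplus in the linkage area.
    return 1


def net_moves_saved_by_a(nargs):
    # Positive result: how many moves layout A avoids compared with layout B.
    return moves_layout_b(nargs) - moves_layout_a()


for nargs in (2, 3):
    print(nargs, "args: layout A saves", net_moves_saved_by_a(nargs), "move(s)")
```

With the typical 2 or 3 arguments, this gives a saving of 1 or 2 moves for layout A, before accounting for the chunk check it also needs.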

In other cases layout B is better.

One final improvement to layout B: saving the stackdepth as an offset from locals[0] not from stack[0] further speeds frame handling.
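
As a rough illustration of that last point, the sketch below compares the two ways of recording the stack depth under layout B. The function names, the `stacktop` variable, and the flat-index model are assumptions; the only claim is that an offset measured from locals[0] avoids adding nlocalsplus when the stack pointer is restored.

```python
def stack_pointer_from_stack_base(locals_base, nlocalsplus, stackdepth):
    # Depth measured from stack[0]: first locate stack[0], then add the depth.
    stack_base = locals_base + nlocalsplus
    return stack_base + stackdepth


def stack_pointer_from_locals_base(locals_base, stacktop):
    # Depth measured from locals[0]: a single addition, no nlocalsplus needed.
    return locals_base + stacktop


locals_base, nlocalsplus, stackdepth = 104, 3, 2
stacktop = nlocalsplus + stackdepth  # the value that would be saved instead
assert (stack_pointer_from_stack_base(locals_base, nlocalsplus, stackdepth)
        == stack_pointer_from_locals_base(locals_base, stacktop))
```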
msg400258 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-08-25 12:44
New changeset f9242d50b18572ef0d584a1c815ed08d1a38e4f4 by Mark Shannon in branch 'main':
bpo-44990: Change layout of evaluation frames. "Layout B" (GH-27933)
https://github.com/python/cpython/commit/f9242d50b18572ef0d584a1c815ed08d1a38e4f4
History
Date User Action Args
2022-04-11 14:59:49 admin set github: 89153
2021-08-26 15:08:57 Mark.Shannon set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2021-08-25 12:44:28 Mark.Shannon set messages: + msg400258
2021-08-24 17:11:20 Mark.Shannon set keywords: + patch; stage: patch review; pull_requests: + pull_request26382
2021-08-24 10:18:35 Mark.Shannon create