Title: Block stack size for frame objects should be dynamically sizable
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.10
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, gvanrossum, python-dev, serhiy.storchaka, tomkpz
Priority: normal Keywords: patch

Created on 2021-01-13 01:54 by tomkpz, last changed 2021-01-21 11:31 by Mark.Shannon.

Pull Requests
URL Status Linked Edit
PR 24204 open python-dev, 2021-01-13 02:07
Messages (5)
msg384991 - (view) Author: Thomas Anderson (tomkpz) * Date: 2021-01-13 01:54
Currently the block stack size is hardcoded to 20. Similar to how the value stack is dynamically sizable, we should make the block stack dynamically sizable. This will reduce the frame size on average (since the typical number of blocks for a function is well below 20) and allow code generators to generate more deeply nested code. Note: the motivation is not necessarily to reduce memory usage, but to make L1 cache misses less likely for stack objects.
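To make the limit concrete, here is a minimal sketch of how a code generator might probe it. The helper names `nested_try_source` and `compiles` are hypothetical and only for illustration; the sketch assumes that the compiler rejects over-deep nesting with a `SyntaxError`, which is how CPython has reported "too many statically nested blocks" in the versions I have seen.

```python
def nested_try_source(depth):
    """Build source text with `depth` try/finally statements nested in each other."""
    lines = []
    # Open `depth` try blocks, each indented one level deeper than the last.
    for level in range(depth):
        lines.append("    " * level + "try:")
    lines.append("    " * depth + "pass")
    # Close them from the innermost outwards with a matching finally clause.
    for level in reversed(range(depth)):
        lines.append("    " * level + "finally:")
        lines.append("    " * (level + 1) + "pass")
    return "\n".join(lines) + "\n"


def compiles(depth):
    """Return True if the generated source compiles, False on SyntaxError."""
    try:
        compile(nested_try_source(depth), "<generated>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # Print which nesting depths the running interpreter accepts.
    for depth in (1, 10, 20, 21):
        print(depth, compiles(depth))
```

Running this shows where the running interpreter starts refusing the generated code; the exact cut-off may differ between versions, so the script prints results rather than assuming them.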
msg385111 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-15 13:37
Reducing the size of the frame object seems like a worthwhile goal, but what's the point in increasing the maximum block stack?
msg385120 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-01-15 17:06
Getting rid of hardcoded limits is good. And at first glance the proposed PR looks good (some minor details can be discussed).

There is, however, a different approach to solving the same problem. The stack of blocks is used to set handlers for exceptions. For example, on entering a try block, the interpreter pushes a handler that points to the except and/or finally clauses; on entering a with block, it pushes a handler that calls __exit__, and so on. The stack of handlers changes at run time, and this used to be the only solution to the problem. But after the reorganization of the bytecode in recent Python versions (mainly by Mark), it is now possible to determine the handler for every instruction statically, at compile time. Instead of a stack of blocks we would have a table of handler addresses. This is a more efficient approach, and it is used in modern C++ compilers. The only technical issue is finding a compact and platform-independent representation of the table (the uncompressed table would be larger than the code itself, but most entries are repeated and are usually small integers).

It would make PR 24204 unneeded, so I suggest waiting some time before reviewing it.
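As a rough illustration of the table-based idea (not the representation CPython would actually use), the lookup could be sketched as follows. The names `HandlerEntry`, `find_handler` and `TABLE`, and all the offsets, are made up for the example; the sketch assumes each entry covers a half-open range of instruction offsets and that ranges are properly nested, so the narrowest matching range is the innermost handler.

```python
from typing import List, NamedTuple, Optional


class HandlerEntry(NamedTuple):
    start: int    # first instruction offset covered (inclusive)
    end: int      # one past the last offset covered (exclusive)
    handler: int  # offset to jump to when an exception is raised


def find_handler(table: List[HandlerEntry], offset: int) -> Optional[int]:
    """Return the innermost handler covering `offset`, or None if there is none."""
    best = None
    for entry in table:
        if entry.start <= offset < entry.end:
            # With properly nested ranges, the narrowest match is the innermost.
            if best is None or (entry.end - entry.start) < (best.end - best.start):
                best = entry
    return None if best is None else best.handler


# Hypothetical layout: an outer try covering offsets 0..19 that contains
# an inner try covering offsets 4..9.
TABLE = [
    HandlerEntry(start=0, end=20, handler=40),
    HandlerEntry(start=4, end=10, handler=30),
]

if __name__ == "__main__":
    for offset in (2, 6, 12, 25):
        print(offset, find_handler(TABLE, offset))
```

Nothing is pushed or popped while the code runs: the table is consulted only when an exception is actually raised, which is why no fixed-size per-frame block stack would be needed. A real implementation would also need the compact encoding mentioned above, which this sketch ignores.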
msg385125 - (view) Author: Thomas Anderson (tomkpz) * Date: 2021-01-15 19:08
> Reducing the size of the frame object seems like a worthwhile goal, but what's the point in increasing the maximum block stack?

No point for humans, but it may be useful for code generators.
msg385409 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-21 11:31
I see no advantage of getting rid of the limit of 20.

No one ever gets near 20 deep in practice.
Given that the limit has been there for so long, it is likely that some tooling expects the depth of try-statements to be limited.

Why would a code generator need to nest try statements so deeply? I'm curious.
Date                 User              Action  Args
2021-01-21 11:31:27  Mark.Shannon      set     messages: + msg385409
2021-01-15 19:08:40  tomkpz            set     messages: + msg385125
2021-01-15 17:06:16  serhiy.storchaka  set     nosy: + gvanrossum, serhiy.storchaka; messages: + msg385120
2021-01-15 13:37:26  Mark.Shannon      set     nosy: + Mark.Shannon; messages: + msg385111
2021-01-13 02:07:40  python-dev        set     keywords: + patch; nosy: + python-dev; pull_requests: + pull_request23029; stage: patch review
2021-01-13 01:54:09  tomkpz            create