This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Disable automatic update of frame locals during tracing
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, fabioz, ncoghlan, nedbat, python-dev, vstinner
Priority: normal Keywords: patch

Created on 2020-10-29 14:16 by fabioz, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 23028 closed python-dev, 2020-10-29 15:41
PR 32055 closed Mark.Shannon, 2022-03-22 17:54
Messages (19)
msg379874 - (view) Author: Fabio Zadrozny (fabioz) * Date: 2020-10-29 14:16
Right now, when a debugger is active, the number of local variables can affect the tracing speed quite a lot.

For instance, having tracing setup in a program such as the one below takes 4.64 seconds to run, yet, changing all the variables to have the same name -- i.e.: change all assignments to `a = 1` (such that there's only a single variable in the namespace), it takes 1.47 seconds (in my machine)... the higher the number of variables, the slower the tracing becomes.

```
import time
t = time.time()

def call():
    a = 1
    b = 1
    c = 1
    d = 1
    e = 1
    f = 1

def noop(frame, event, arg):
    return noop

import sys
sys.settrace(noop)

for i in range(1_000_000):
    call()

print('%.2fs' % (time.time() - t,))
```

This happens because `PyFrame_FastToLocalsWithError` and `PyFrame_LocalsToFast` are called inside the `call_trampoline` (https://github.com/python/cpython/blob/master/Python/sysmodule.c#L946).

So, I'd like to simply remove those calls.

Debuggers can call  `PyFrame_LocalsToFast` when needed -- otherwise mutating non-current frames doesn't work anyways. As a note, pydevd already has such a call: https://github.com/fabioz/PyDev.Debugger/blob/0d4d210f01a1c0a8647178b2e665b53ab113509d/_pydevd_bundle/pydevd_save_locals.py#L57 and PyPy also has a counterpart.

As for `PyFrame_FastToLocalsWithError`, I don't really see any reason to call it at all.

i.e.: something as the code below prints the `a` variable from the `main()` frame regardless of that and I checked all pydevd tests and nothing seems to be affected (it seems that accessing f_locals already does this: https://github.com/python/cpython/blob/cb9879b948a19c9434316f8ab6aba9c4601a8173/Objects/frameobject.c#L35, so, I don't see much reason to call it at all).

```
def call():
    import sys
    frame = sys._getframe()
    print(frame.f_back.f_locals)
   
def main():
    a = 1
    call()
   
if __name__ == '__main__':
    main()
```

So, the idea is removing those lines (I expect that I'll be able to provide a pull request for this).
msg385320 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-20 09:03
Fabio created a thread on python-dev:
https://mail.python.org/archives/list/python-dev@python.org/thread/62WK3THUDNWZCDOMXXDZFG3O4LIOIP4W/

In 2017, Nick Coghlan wrote the PEP 558 "Defined semantics for locals()".
msg385323 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-20 09:23
Mark: do you think that it's an acceptable trade-off to ask debuggers and profilers to call explicitly PyFrame_FastToLocalsWithError() and PyFrame_LocalsToFast(), to optimize the common case, when locals are not needed?
msg385329 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-20 11:01
I don't think so.
Tracing is already somewhat fragile, see https://bugs.python.org/issue30744.
Making it more complex is likely to add more bugs.

PEP 558 should help, as f_locals becomes a proxy. 
There are some minor issues with PEP 558 (I'm not convinced that it is completely robust), but it should fix this issue.

Ultimately, what we need is a decent debugger API, something along the lines of the JVMTI.
msg385336 - (view) Author: Fabio Zadrozny (fabioz) * Date: 2021-01-20 11:30
I agree that it can be made better, but I think most of the issues right now comes from CPython trying to automatically do something that's not bound to work (calling PyFrame_FastToLocals/PyFrame_LocalsToFast under the hood during the tracing call).

https://bugs.python.org/issue30744 is a great example of why that can never work properly (debuggers need to manage that much more carefully, just doing it on all calls is really bound to not work).

So, the current implementation besides being broken also makes things pretty slow.

I agree that PEP 558 may fix things here (but it's much more work).

As a disclaimer, pydevd actually uses a different tracer which doesn't do those calls and when possible uses the frame eval to modify the bytecode directly to minimize the overhead, so, in practice the current support given by CPython for debugger is pretty reasonable for doing a fast debugger (but there are a few caveats that it must work around -- such as the auto PyFrame_FastToLocals/PyFrame_LocalsToFast calls).
msg396563 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2021-06-27 07:40
I've updated PEP 558 with a reference back to this ticket as additional motivation (the update also addresses several of the fragility concerns that Mark raised): https://github.com/python/peps/pull/1787

There's still work to be done to bring the reference implementation into line with the updated proposal, but I'm making it an explicit goal to get the proposal submitted for pronouncement in time for inclusion in 3.11.
msg396572 - (view) Author: Fabio Zadrozny (fabioz) * Date: 2021-06-27 10:57
@ncoghlan I took a quick look at the PEP...

I'm a bit worried about:

> On optimised frames, the Python level f_locals API will become a direct read/write proxy for the frame's local and closure variable storage, and hence no longer support storing additional data that doesn't correspond to a local or closure variable on the underyling frame object.

In particular, it's common for a debugger to evaluate expressions in a context that would change the locals to add new variables.

i.e.: stopping at a breakpoint and doing 2 `exec` calls with frame.f_locals with:
import some_module
v = some_module.compute(xxx)

So, it's expected that `some_module` and `v` would be in the locals at this point.

Right now after each line of the evaluation, a `PyFrame_FastToLocals` must be called so things work as they should, but given that the PEP explicitly says that it should be deprecated, and this being a common feature for a debugger, what'd be the proper way to support that?

p.s.: should I ask such questions regarding the PEP somewhere else instead of this issue or is this an appropriate place?
msg396574 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-06-27 11:59
> So, it's expected that `some_module` and `v` would be in the locals at this point.

If a function does not have the local variables `some_module` and `v`, then the change wouldn't be visible to the debugee.
So what difference does it make?
msg396576 - (view) Author: Fabio Zadrozny (fabioz) * Date: 2021-06-27 12:13
> > So, it's expected that `some_module` and `v` would be in the locals at this point.

> If a function does not have the local variables `some_module` and `v`, then the change wouldn't be visible to the debugee.
So what difference does it make?

Right now such changes are visible to the debugee in the locals frames if a user does the `exec` and calls `PyFrame_FastToLocals` right afterwards (even if they weren't initially there).

So, it's the difference between being able to import a module and creating/manipulating new variables in an `exec` in any frame (as it works right now) or not being able to make it at all (if that feature is deprecated as is being implied in the PEP).
msg411108 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-21 11:01
While the various frame and debugger PEPs that are open offer a better solution to this, they might not be accepted for 3.11.

So I'd like to revisit this.


Removing the calls to `PyFrame_FastToLocals` and friends cuts the overhead of tracing down a lot. A *lot*.

Using Fabio's example I get the following slowdowns:
No calls to   PyFrame_FastToLocals`: x10
Main branch with one local variable: x16
Main branch with six locals variables: x50

This is quite a compelling reason to remove the calls to `PyFrame_FastToLocals`.

The questions is "what will we break doing this"?
All the tests pass, so that seems promising.

Ned, how would this impact coverage.py?
msg411111 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2022-01-21 11:45
Off the top of my head, I'm not sure this affects coverage.py at all, but I could be missing something.  Does PR 23028 have all the changes, or is there some other python/cpython branch I can test with?
msg411112 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-21 11:55
Yes the PR has all the changes.

It is just the changes sysmodule.c that you need:
https://github.com/python/cpython/pull/23028/files#diff-a3a5c73931235f7f344c072dc755d6508e13923db3f5d581c5e88652075871cb
msg411114 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-21 11:57
Or you can use this branch:
https://github.com/faster-cpython/cpython/tree/dont-fast-to-locals-in-trampoline
msg411145 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-21 16:40
As I already wrote previously, I'm a supporter of requiring debuggers and profilers to explicitly call PyFrame_FastToLocalsWithError() and PyFrame_LocalsToFast(). It should only impact a minority of users, whereas the majority will benefit of way better performance, no?
msg411177 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2022-01-21 19:55
I ran the coverage.py test suite using https://github.com/faster-cpython/cpython/tree/dont-fast-to-locals-in-trampoline, and there were no failures.
msg411210 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2022-01-21 23:51
And btw, the tests with that branch ran at least twice as fast as with 3.10....!
msg415801 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-22 18:09
See also bpo-47092: [C API] Add PyFrame_GetVar(frame, name) function.
msg415998 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-03-25 12:57
New changeset d7163bb35d1ed46bde9affcd4eb267dfd0b703dd by Mark Shannon in branch 'main':
bpo-42197: Don't create `f_locals` dictionary unless we actually need it. (GH-32055)
https://github.com/python/cpython/commit/d7163bb35d1ed46bde9affcd4eb267dfd0b703dd
msg416145 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-28 06:57
Mark: Please add the new PyFrame_GetLocals() function to the C API > New Features doc:
https://docs.python.org/dev/whatsnew/3.11.html#id1
History
Date User Action Args
2022-04-11 14:59:37adminsetgithub: 86363
2022-03-28 06:57:42vstinnersetmessages: + msg416145
2022-03-25 12:57:58Mark.Shannonsetmessages: + msg415998
2022-03-22 18:09:13vstinnersetmessages: + msg415801
2022-03-22 17:54:49Mark.Shannonsetpull_requests: + pull_request30147
2022-01-21 23:51:32nedbatsetmessages: + msg411210
2022-01-21 19:55:07nedbatsetmessages: + msg411177
2022-01-21 16:40:54vstinnersetmessages: + msg411145
2022-01-21 11:57:31Mark.Shannonsetmessages: + msg411114
2022-01-21 11:55:52Mark.Shannonsetmessages: + msg411112
2022-01-21 11:45:21nedbatsetmessages: + msg411111
2022-01-21 11:01:39Mark.Shannonsetnosy: + nedbat
messages: + msg411108
2021-06-27 12:13:55fabiozsetmessages: + msg396576
2021-06-27 11:59:27Mark.Shannonsetmessages: + msg396574
2021-06-27 10:57:34fabiozsetmessages: + msg396572
2021-06-27 07:40:13ncoghlansetmessages: + msg396563
versions: + Python 3.11, - Python 3.10
2021-01-20 11:30:22fabiozsetmessages: + msg385336
2021-01-20 11:01:17Mark.Shannonsetmessages: + msg385329
2021-01-20 09:23:26vstinnersetnosy: + ncoghlan, Mark.Shannon
messages: + msg385323
2021-01-20 09:03:58vstinnersetmessages: + msg385320
2020-10-29 21:01:29vstinnersetnosy: + vstinner
2020-10-29 15:41:05python-devsetkeywords: + patch
nosy: + python-dev

pull_requests: + pull_request21947
stage: patch review
2020-10-29 14:16:41fabiozcreate