Author njs
Recipients Mark.Shannon, arigo, belopolsky, benjamin.peterson, ncoghlan, njs, xgdomingo, yselivanov
Date 2017-06-25.06:08:26
Message-id <1498370907.85.0.978265557857.issue30744@psf.upfronthosting.co.za>
Content
It isn't obvious to me whether the write-through proxy idea is a good one on net, but here's the rationale for why it might be.

Currently, the user-visible semantics of locals() and f_locals are a bit complicated. AFAIK they aren't documented anywhere outside the CPython and PyPy source (PyPy is careful to match all these details), but let me take a stab at it:

The mapping object you get from locals()/f_locals represents the relevant frame's local namespace. For many frames (e.g. module level, class level, anything at the REPL), it literally *is* the local namespace: changes made by executing bytecode are immediately represented in it, and changes made to it are immediately visible to executing bytecode.

Except, for function frames, it acts more like a snapshot copy of the local namespace: it shows you the namespace at the moment you call locals(), but then future changes to either the code's namespace or the object don't affect each other.

Except except, the snapshot might be automatically updated later to incorporate namespace changes, e.g. if I do 'ns = locals()' and then later on someone accesses the frame's f_locals attribute, then reading that attribute will cause my 'ns' object to be silently updated. But it's still a snapshot; modifications to the mapping aren't visible to the executing frame.

Except**3, if you happen to modify the mapping object while you're inside a trace function callback, then *those* modifications are visible to the executing frame. (And also if a function is being traced then as a side-effect this means that now our 'ns' object above does stay constantly up to date.)

Except**4, you don't actually have to be inside a trace function callback for your modifications to be visible to the executing frame – all that's necessary is that *some* thread somewhere is currently inside a trace callback (even if it doesn't modify or even look at the locals itself, as e.g. coverage.py doesn't).
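To make the first two cases concrete, here is a small sketch. It assumes CPython with no trace function installed (no debugger, no coverage tool), since tracing changes the picture as described above. The names function_frame, module_ns and ClassFrame are made up for the illustration.

```python
def function_frame():
    x = 1
    ns = locals()      # for a function frame this behaves like a snapshot
    ns["x"] = 2        # changes the mapping, not the running variable
    return x, ns["x"]


# Module-style frame: exec() with a single dict uses it as both globals
# and locals, so locals() inside the executed code is that same dict.
module_ns = {}
exec("ns = locals()\nns['y'] = 2\nz = y + 1", module_ns)


class ClassFrame:
    # In a class body, locals() is the live class namespace.
    locals()["attr"] = 42


function_result = function_frame()
print(function_result, module_ns["z"], ClassFrame.attr)
```

In the function, the write to the mapping never reaches x, while in the module-style and class-level frames the write is immediately visible to the code that follows it.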

This causes a lot of confusion [1].

On top of that, we have this bug here. The writeback-only-if-changed idea would make it so that we at least correctly implement the semantics I described in the long paragraph above. But I wonder if maybe we should consider this an opportunity to fix the underlying problem, which is that allowing skew between locals() and the actual execution namespace is this ongoing factory for bugs and despair. Specifically, I'm wondering if we could make the semantics be:

"locals() and f_locals return a dict-like object representing the local namespace of the given frame. Modifying this object and modifying the corresponding local variables are equivalent operations in all cases."

(So I guess this would mean a proxy object that on reads checks the fast array first and then falls back to the dict, and on writes updates the fast array as well as the dict.)

> you can still have race conditions between "read-update-writeback" operations that affect the cells directly, as well as with those that use the new write-through proxy.

Sure, but that's just a standard concurrent access problem, no different from any other case where you have two different threads trying to mutate the same local variable or dictionary key at the same time. Everyone who uses threads knows that if you want to do that then you need a mutex, and if you don't use proper locking then it's widely understood how to recognize and debug the resulting failure modes. OTOH, the current situation where modifications to the locals object sometimes affect the namespace, and sometimes not, and sometimes they get overwritten, and sometimes they don't, and it sometimes depends on spooky unrelated things like "is some other thread currently being traced"? That's *way* more confusing than figuring out that there might be a race condition between 'x = 1' and 'locals()["x"] = 2'.
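For comparison, the ordinary version of that problem and its ordinary fix look something like the sketch below. A plain dict stands in for the shared namespace, and the names namespace, lock and bump are made up for the example.

```python
import threading

namespace = {"x": 0}      # stands in for a namespace shared between threads
lock = threading.Lock()


def bump(times):
    for _ in range(times):
        # Hold the lock across the whole read-update-writeback so that
        # no other thread can interleave between the read and the write.
        with lock:
            namespace["x"] = namespace["x"] + 1


threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(namespace["x"])
```

Without the lock, updates can be lost, but that is a failure mode people already know how to look for.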

Plus, pdb depends on write-through working, and there are lots of frames that don't even use the fast array and already have my proposed semantics. So realistically our choices are either "consistently write-through" or "inconsistently write-through".
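The kind of write-through that pdb relies on can be seen with a trace function that assigns into f_locals. This is a minimal sketch assuming CPython's sys.settrace behaviour; target, tracer and run_traced are names invented for the example, and it is not a description of how pdb itself is written.

```python
import sys


def target():
    x = 1
    y = x + 1      # the tracer rewrites x just before this line runs
    return x, y


def tracer(frame, event, arg):
    if frame.f_code is not target.__code__:
        return None                  # do not trace any other frame
    if event == "line":
        local_vars = frame.f_locals
        if "x" in local_vars:
            local_vars["x"] = 99     # write made from inside the trace callback
    return tracer


def run_traced():
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return target()
    finally:
        sys.settrace(previous)       # always restore the earlier trace function


traced_result = run_traced()
untraced_result = target()
print(traced_result, untraced_result)
```

Here the assignment made in the callback is seen by the running function, which is exactly the behaviour a debugger needs when you change a variable at its prompt.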

[1] https://www.google.com/search?q=python+modify+locals&ie=utf-8&oe=utf-8