Message 304099 - Python tracker

Author	ncoghlan
Recipients	Mark.Shannon, arigo, belopolsky, benjamin.peterson, ncoghlan, njs, vstinner, xdegaye, xgdomingo, yselivanov
Date	2017-10-11.02:13:09
Message-id	<1507687991.44.0.213398074469.issue30744@psf.upfronthosting.co.za>
In-reply-to

Content

I've been thinking further about the write-through proxy idea, and I think I've come up with a design that shouldn't be too hard to implement, while still avoiding all of the problems we're trying to avoid.

The core of the idea is that the proxy type would just be a wrapper around two dictionaries:

- the existing f_locals dictionary
- a new dictionary mapping cell and free variable names to their respective cells (note: this may not actually need to be a dict, as a direct reference from the proxy back to the frame may also suffice. However, I find it easier to think about the design by assuming this will be a lazily initialised dict in its own right.)

Most operations on the proxy would just be passed through to f_locals, but for keys in both dictionaries, set and delete operations would *also* affect the cell in the cell dictionary. (Fortunately dict views don't expose any mutation methods, or intercepting all changes to the mapping would be a lot trickier)
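A minimal pure-Python sketch of that wrapper may make the idea easier to follow. This is an illustration only, not the proposed C implementation: the class name `WriteThroughProxy` and the helper `make_counter` are hypothetical, and the cell mapping is built from an ordinary closure rather than from a real frame. It relies on `cell_contents` being writable, which is the case in Python 3.7 and later.

```python
from collections.abc import MutableMapping


class WriteThroughProxy(MutableMapping):
    """Hypothetical sketch: forwards to a locals dict, mirroring cell names into cells."""

    def __init__(self, f_locals, cells):
        self._locals = f_locals  # stands in for the frame's existing f_locals dict
        self._cells = cells      # maps cell/free variable names to cell objects

    # Read operations are passed straight through to the underlying dict.
    def __getitem__(self, key):
        return self._locals[key]

    def __iter__(self):
        return iter(self._locals)

    def __len__(self):
        return len(self._locals)

    # Set and delete also affect the cell when the key names a cell variable.
    def __setitem__(self, key, value):
        self._locals[key] = value
        if key in self._cells:
            self._cells[key].cell_contents = value

    def __delitem__(self, key):
        del self._locals[key]
        if key in self._cells:
            del self._cells[key].cell_contents  # leaves the cell empty


def make_counter():
    count = 0

    def get():
        return count  # reads the shared cell

    return get


get = make_counter()
# The "second dictionary": free variable names mapped to their cells.
cells = dict(zip(get.__code__.co_freevars, get.__closure__))
# A snapshot playing the role of f_locals.
snapshot = {name: cell.cell_contents for name, cell in cells.items()}

proxy = WriteThroughProxy(snapshot, cells)
proxy["count"] = 42                    # written to the dict and to the cell
proxy["scratch"] = "only in the dict"  # not a cell name, so dict only
```

Because `MutableMapping` derives `update()`, `pop()`, `setdefault()` and friends from the five methods above, every mutation path ends up going through `__setitem__` or `__delitem__` in this sketch.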

Frames would gain a new lazily initialised "f_traceproxy" field that defaults to NULL/None.

For code objects that don't define or reference any cells, nothing would change relative to today.

For code objects that *do* define or reference cells though, tracing would change as follows:

* before calling the trace function:
  - f_locals would be updated from the fast locals array and current cell values as usual
  - f_locals on the frame would be swapped out for f_traceproxy (creating the latter if needed)
* after returning from the trace function:
  - f_locals on the frame would be reset back to bypassing the proxy (so writes to f_locals stop being written through to cells when the trace hook isn't running)
  - only the actual locals would be written from f_locals back to the fast locals array (cell updates are assumed to have already been handled via the proxy)
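The before/after sequence above can be modelled in plain Python. Everything in this sketch is a stand-in chosen for illustration: `ModelFrame`, `Cell`, `TraceProxy`, `call_trace_function` and the `namespace` attribute are hypothetical names, and a dict takes the place of the fast locals array. It only aims to show the order of operations, not how the interpreter would implement them.

```python
from collections.abc import MutableMapping

_EMPTY = object()  # marks a cell with no value bound


class Cell:
    """Stand-in for an interpreter cell object."""

    def __init__(self, contents=_EMPTY):
        self.contents = contents


class TraceProxy(MutableMapping):
    """Passes everything to the namespace dict; cell names are also written to cells."""

    def __init__(self, namespace, cells):
        self._namespace = namespace
        self._cells = cells

    def __getitem__(self, key):
        return self._namespace[key]

    def __iter__(self):
        return iter(self._namespace)

    def __len__(self):
        return len(self._namespace)

    def __setitem__(self, key, value):
        self._namespace[key] = value
        if key in self._cells:
            self._cells[key].contents = value

    def __delitem__(self, key):
        del self._namespace[key]
        if key in self._cells:
            self._cells[key].contents = _EMPTY


class ModelFrame:
    def __init__(self, fast_locals, cells):
        self.fast_locals = fast_locals  # models the fast locals array
        self.cells = cells              # name -> Cell
        self.namespace = {}             # the single underlying f_locals dict
        self.f_locals = self.namespace
        self.f_traceproxy = None        # lazily initialised


def call_trace_function(frame, trace_func):
    ns = frame.namespace
    # Before: refresh the snapshot from fast locals and current cell values.
    ns.update(frame.fast_locals)
    ns.update({n: c.contents for n, c in frame.cells.items()
               if c.contents is not _EMPTY})
    if frame.cells:
        if frame.f_traceproxy is None:
            frame.f_traceproxy = TraceProxy(ns, frame.cells)
        frame.f_locals = frame.f_traceproxy  # swap the proxy in
    try:
        trace_func(frame)
    finally:
        # After: bypass the proxy again and write back only the actual locals.
        # Cells are left alone, since the proxy already updated them.
        # (Deleted locals are ignored here to keep the sketch short.)
        frame.f_locals = ns
        for name in frame.fast_locals:
            if name in ns:
                frame.fast_locals[name] = ns[name]


def hook(frame):
    frame.f_locals["x"] = 2            # plain local: written back after the hook
    frame.f_locals["total"] = 99       # cell: written through immediately
    frame.cells["other"].contents = 5  # simulates closure code run during the hook


frame = ModelFrame({"x": 1}, {"total": Cell(10), "other": Cell(0)})
call_trace_function(frame, hook)
```

In this model the `other` cell keeps the value it was given while the hook was running, because the stale snapshot entry is never copied back into it; that is the one behavioural change the design is after.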

This preserves all current behaviour *except* the unwanted one of resetting cells back to their pre-tracehook value after returning from a trace hook:

* code outside trace hooks can't mutate the function level fast locals or cells via locals() or frame.f_locals (any such modifications are overwritten immediately before the trace function runs), but *can* otherwise treat the result as a regular namespace
* code inside trace hooks can mutate function level fast locals and cells just by modifying frame.f_locals
* all code can display the current value of function level fast locals and cells just by displaying locals() or frame.f_locals
* there's still only one f_locals dictionary per frame; it may just have a proxy object intercepting writes to cell variables while a trace hook is running

That way, we can avoid the problem with overwriting cells back to old values, *without* enabling arbitrary writes to function locals from outside trace functions, and without introducing any tricky new state synchronisation problems.
