Message 414808 - Python tracker


Author	carljm
Recipients	Dennis Sweeney, Mark.Shannon, barry, brandtbucher, carljm, dino.viehland, gregory.p.smith, gvanrossum, itamaro
Date	2022-03-09.17:57:21
Message-id	<1646848641.84.0.0837989959219.issue46896@roundup.psfhosted.org>
In-reply-to

Content

Hi Dennis, thanks for the questions!

> A curiosity: have you considered watching dict keys rather than whole dicts?

There's a bit of discussion of this above. A core requirement is to avoid any memory overhead and minimize CPU overhead on unwatched dicts. Additional memory overhead seems like a nonstarter, given the sheer number of dict objects that can exist in a large Python system. The CPU overhead for unwatched dicts in the current PR consists of a single added `testb` and `jne` (for checking if the dict is watched), in the write path only; I think that's effectively the minimum possible.
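As a rough illustration of the shape described above (a toy Python model, not the PR's C code), the sketch below adds one flag check to the write path and leaves reads untouched. The class name `WatchableDict`, its `watch` method, and the callback signature are all hypothetical, and unlike the real proposal this model does use extra memory per dict.

```python
class WatchableDict(dict):
    """Toy model of dict-level watching (hypothetical; not the PR's C code)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.watched = False   # stands in for the single "is watched" flag bit
        self.callback = None

    def watch(self, callback):
        self.watched = True
        self.callback = callback

    def __setitem__(self, key, value):
        # The only work added for an unwatched dict: one flag test on writes.
        if self.watched:
            self.callback("set", key, value)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        if self.watched:
            self.callback("del", key, None)
        super().__delitem__(key)

    # __getitem__ is deliberately not overridden: reads pay nothing extra.


events = []
module_globals = WatchableDict(x=1)
module_globals["x"] = 2                      # unwatched: no notification
module_globals.watch(lambda event, key, value: events.append((event, key, value)))
module_globals["x"] = 3                      # watched write: notifies
current = module_globals["x"]                # read: never notifies
del module_globals["x"]                      # watched delete: notifies
print(events)  # [('set', 'x', 3), ('del', 'x', None)]
```

Only `__setitem__` and `__delitem__` are covered here; other mutating methods such as `update` or `pop` would need the same check in a fuller model.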

It's not clear to me how to implement per-key watching under this constraint. One option Brandt mentioned above is to steal the low bit of a `PyObject` pointer; in theory we could do this on `me_key` to implement per-key watching with no memory overhead. But then we are adding bit-masking overhead on every dict read and write. I think we really want the implementation here to be zero-overhead in the dict read path.

Open to suggestions if I've missed a good option here!
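To make the low-bit idea and its read-path cost concrete, here is a small model that uses plain integers as stand-ins for aligned `PyObject` pointers. The addresses, `WATCH_BIT`, and the helper names are assumptions made for illustration; a real hash-table lookup would not be a linear scan.

```python
WATCH_BIT = 0x1  # aligned pointers always have a zero low bit, so it is free to steal


def tag(ptr):
    """Mark a key pointer as watched by setting its low bit."""
    assert ptr & WATCH_BIT == 0, "pointer must be aligned"
    return ptr | WATCH_BIT


def untag(ptr):
    """Recover the real pointer by masking the low bit off."""
    return ptr & ~WATCH_BIT


def is_watched(ptr):
    return bool(ptr & WATCH_BIT)


def find(entries, key_ptr):
    """Toy lookup: every comparison must mask first, even for plain reads."""
    for index, stored in enumerate(entries):
        if untag(stored) == key_ptr:
            return index
    return -1


# Three stored key slots (in the role of me_key); only the middle one is watched.
entries = [0x1000, tag(0x1010), 0x1020]

print(find(entries, 0x1010))                   # 1
print([is_watched(slot) for slot in entries])  # [False, True, False]
```

The per-key flag costs no extra memory, but `untag` now sits on every lookup, which is the read-path overhead the message wants to avoid.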

> That way, changing global values would not have to de-optimize, only adding new global keys would.

> Indexing into dict values array wouldn't be as efficient as embedding direct jump targets in JIT-generated machine code, but as long as we're not doing that, maybe watching the keys is a happy medium?

But we are doing that, in the Cinder JIT. Dict watching here is intentionally exposed for use by extensions, including, hopefully, the Cinder JIT as an installable extension in the future. We burn exact pointer values for module globals into generated JIT code and deopt if they change (we are close to landing a change to code-patch instead of deopting). This is quite a bit more efficient in the hot path than having to go through a layer of indirection.
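The burn-in-and-deopt pattern can be modelled loosely in pure Python. This is only an analogy for what a JIT does with machine code: `NotifyingDict` and `SpecializedCall` are hypothetical names, and "deopt" here just means swapping the entry point back to an ordinary dict lookup.

```python
class NotifyingDict(dict):
    """Minimal dict that reports every item assignment to its watchers."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.watchers = []

    def __setitem__(self, key, value):
        for watcher in self.watchers:
            watcher(key)          # dict-level: fired for every key
        super().__setitem__(key, value)


class SpecializedCall:
    """Stand-in for compiled code with one global's value burned in."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.deoptimized = False
        burned_in = namespace[name]               # exact object captured once
        self.run = lambda *args: burned_in(*args)  # fast path: no dict lookup
        namespace.watchers.append(self.on_change)

    def on_change(self, key):
        if key == self.name and not self.deoptimized:
            self.deoptimized = True
            # Fall back to looking the global up on every call.
            self.run = lambda *args: self.namespace[self.name](*args)


ns = NotifyingDict(f=lambda x: x + 1)
call = SpecializedCall(ns, "f")
before = call.run(1)          # uses the burned-in function
ns["unrelated"] = 0           # notifies, but does not invalidate "f"
still_fast = not call.deoptimized
ns["f"] = lambda x: x * 10    # rebinding "f" triggers the deopt
after = call.run(1)           # now goes through the dict
print(before, still_fast, after)  # 2 True 10
```

The write to `unrelated` shows the cost of dict-level granularity: the watcher is still called and has to filter by key itself.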

I don't want to assume too much about how dict watching will be used in future, or go for an implementation that limits its future usefulness. The current PR is quite flexible and can be used to implement a variety of caching strategies. The main downside of dict-level watching is that a lot of notifications will be fired if code does a lot of globals-rebinding in modules where globals are watched, but this doesn't appear to be a problem in practice, either in our workloads or in pyperformance. If this were ever observed to be a problem, a workable strategy would likely be to notice at runtime that globals are being re-bound frequently in a particular module and just stop watching that module's globals.
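One possible shape for that fallback strategy is a per-module notification counter with a cutoff. The sketch below is an assumption about how such a policy might look, not something from the PR: `RebindThrottle`, its threshold, and the module name are made up, and a real policy would probably measure frequency over time rather than a lifetime count.

```python
class RebindThrottle:
    """Stop watching a module once it has produced too many notifications."""

    def __init__(self, max_notifications):
        self.max_notifications = max_notifications
        self.counts = {}
        self.unwatched = set()

    def on_notification(self, module_name):
        """Record one notification; return True to keep watching the module."""
        if module_name in self.unwatched:
            return False
        count = self.counts.get(module_name, 0) + 1
        self.counts[module_name] = count
        if count > self.max_notifications:
            self.unwatched.add(module_name)   # give up on this module only
            return False
        return True


throttle = RebindThrottle(max_notifications=3)
noisy = [throttle.on_notification("noisy_module") for _ in range(5)]
quiet = throttle.on_notification("quiet_module")
print(noisy, quiet)  # [True, True, True, False, False] True
```

A caller would unwatch the module's globals dict the first time `on_notification` returns `False`, and compile any later code for that module without burned-in globals.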

History
Date	User	Action	Args
2022-03-09 17:57:21	carljm	set	recipients: + carljm, gvanrossum, barry, gregory.p.smith, dino.viehland, Mark.Shannon, brandtbucher, Dennis Sweeney, itamaro
2022-03-09 17:57:21	carljm	set	messageid: <1646848641.84.0.0837989959219.issue46896@roundup.psfhosted.org>
2022-03-09 17:57:21	carljm	link	issue46896 messages
2022-03-09 17:57:21	carljm	create