Issue 4688: GC optimization: don't track simple tuples and dicts

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/48938

classification

Title:	GC optimization: don't track simple tuples and dicts
Type:	performance	Stage:	patch review
Components:	Interpreter Core	Versions:	Python 3.1, Python 2.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	pitrou	Nosy List:	collinwinter, esam, gregory.p.smith, jcea, loewis, pitrou, rhettinger, stutzbach
Priority:	normal	Keywords:	patch

Created on 2008-12-17 23:41 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
containeropts.patch	pitrou, 2009-03-14 16:12
binary-trees.py	pitrou, 2009-03-14 16:31

Messages (19)
msg77995 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-17 23:41
Split out from #4074, here is a standalone patch which disables GC tracking for simple tuples (tuples made of atomic objects or immutable containers, possibly nested). The performance improvement can be evaluated using tuple_gc_hell.py from #4074. The patch also adds a function named is_tracked() to the gc module.
msg77996 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-17 23:48
This additional patch also optimizes simple dicts, it must be applied over the tuples patch. Together these two patches make it so that e.g. a dict of tuples or strings won't be tracked at all, saving CPU cycles when collecting.
msg77999 - (view)	Author: Daniel Stutzbach (stutzbach)	Date: 2008-12-17 23:56
I've looked very quickly over your patches to try to figure what rule qualifies a tuple or dict as simple enough to be untracked, out of curiosity. For tuples, I think the rule is: If an object is immutable, and it points only to untracked objects, the object can be untracked. Is that right? If so, you could perform the same optimization on the frozenset() type. Why do empty tuples have to be tracked? The dict patch adds a boolean flag to the dict data structure to indicate whether the dict is being tracked or not. I think. Couldn't you use _PyObject_GC_IS_TRACKED() instead? What's the rule for when a dict can be tracked or untracked? Do you need to recheck the rule every time the dict is modified?
msg78001 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 00:04
Le mercredi 17 décembre 2008 à 23:56 +0000, Daniel Stutzbach a écrit : > For tuples, I think the rule is: > If an object is immutable, and it points only to untracked objects, > the object can be untracked. Roughly, yes. Exactly, it is "if it points only to untrackable objects". That is, a tuple containing an untracked dict will still be tracked, because the dict may mutate and become tracked later. But a tuple containing an untracked tuple will itself be untracked, because the enclosed tuple can't mutate. > Is that right? If so, you could perform the same optimization on the > frozenset() type. Yes, but I prefer to concentrate on those two core types now, which are the most frequent. Once the principle is accepted we can extend the implementation to other types. > Why do empty tuples have to be tracked? Actually, they are never tracked, which means calling _PyObject_GC_UNTRACK on them would segfault. > The dict patch adds a boolean flag to the dict data structure to > indicate whether the dict is being tracked or not. I think. Couldn't > you use _PyObject_GC_IS_TRACKED() instead? Yes, probably. I thought a dedicated flag may be faster but I will try without. > Do you > need to recheck the rule every time the dict is modified? Only if it is not tracked. Once a dict is tracked, it cannot become untracked again (there would be too much overhead, and the assumption is that you usually create homogeneous containers, you don't create a dict of mutable containers things and then replace the mutable containers with simple atomic objects).
msg78002 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 00:15
So, I've tried without the dedicated flag in the dict object and it's as fast as previously! Here is a new patch.
msg78011 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-12-18 06:12
IIUC, you try to find various places where tuples are created, and check whether they create tuples that don't need tracking. I think this approach is difficult to maintain, and also may miss some objects. I'd rather see that integrated into collection: e.g. when iterating over survivors of a collection, check whether they can be untracked.
msg78012 - (view)	Author: Daniel Stutzbach (stutzbach)	Date: 2008-12-18 06:25
Unfortunately, the check is O(n), so it can get a little expensive. I suppose that's no worse than traversing the tuple to look for a cycle, though. Perhaps it could be done at the same time? Can _PyObject_GC_UNTRACK() be safely called from tp_traverse?
msg78013 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-12-18 06:43
> Unfortunately, the check is O(n), so it can get a little expensive. Not in the typical case, I guess. If you have to go through all objects, then likely you end up untracking the object. If you can't untrack, you typically find a tracked object in the content quickly. > I suppose that's no worse than traversing the tuple to look for a cycle, > though. Correct. Collection is O(n) per object, anyway. > Perhaps it could be done at the same time? > Can _PyObject_GC_UNTRACK() be safely called from tp_traverse? No. However, I doubt that iterating over the object twice is significantly slower than iterating over it once and performing both checks. Plus, the check for tracked elements can return as soon as one such object is found.
msg78017 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 09:28
Le jeudi 18 décembre 2008 à 06:12 +0000, Martin v. Löwis a écrit : > I think this approach is difficult to maintain, and also may miss some > objects. But what counts is where tuples can be created in massive numbers or sizes: the eval loop, the tuple type's tp_new, and a couple of other places. We don't need to optimize every single tuple created by the interpreter or extension modules (and even the, one can simply call _PyTuple_Optimize()). > I'd rather see that integrated into collection: e.g. when iterating over > survivors of a collection, check whether they can be untracked. That's what I had tried at first, but it crashes. I haven't investigated, but I suspect removing an object from the GC list while that list is being walked isn't very well supported by the GC. Also, this approach is more expensive since it involves checking tuples again and again each time they are walked by the GC. The patch only does it once, when the tuple is being built.
msg78018 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 09:36
> > I'd rather see that integrated into collection: e.g. when iterating over > > survivors of a collection, check whether they can be untracked. > > That's what I had tried at first, but it crashes. To be clearer, I tried adding it to the tp_traverse method of tuples. I could try to add it to the collection machinery in gcmodule.c instead.
msg78021 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 10:37
Here is an alternate patch which does the tuple optimization in the GC. I don't like it a lot, first because it puts some type-specific code in gcmodule, second because it invalidates the dict optimization (since tuples are untracked only when walked by the GC, they are still tracked when added to a dict and therefore the dict will be tracked too).
msg78024 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 12:04
Here is a patch which combines the alternate approach (untrack simple objects when doing a GC collection) for tuples and dicts. I'm surprised by the results: after a full regression test suite, there are roughly 55000 tracked objects (measured with len(gc.get_objects())) while there are 150000 of them with the original interpreter. Some eyes are welcome.
msg78048 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-12-18 23:17
> But what counts is where tuples can be created in massive numbers or > sizes: the eval loop, the tuple type's tp_new, and a couple of other > places. We don't need to optimize every single tuple created by the > interpreter or extension modules (and even the, one can simply call > _PyTuple_Optimize()). Still, I think this patch does too much code duplication. There should be only a single function that does the optional untracking; this then gets called from multiple places. > Also, this approach is more expensive I'm skeptical. It could well be less expensive, namely if many tuples get deleted before gc even happens. Then you currently check whether you can untrack them, which is pointless if the tuple gets deallocated quickly, anyway.
msg78049 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-12-18 23:43
Hello again, > Still, I think this patch does too much code duplication. There should > be only a single function that does the optional untracking; this then > gets called from multiple places. The point was to avoid slowing down the critical path of tuple creation in the most common cases. If it is considered too hackish, then I agree the other patch (tuple+dictopts-alt.patch) should be considered instead. By the way, perhaps pybench should grow a GC test (e.g. derived from Greg's script). What do you think?
msg83592 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2009-03-14 16:12
Here is a new patch against trunk.
msg83594 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2009-03-14 16:31
Here is a benchmark ripped from the Computer Language Shootout (http://shootout.alioth.debian.org). Running "binary_trees.py 18" takes the following time: -> without patch: 330s. -> with patch: 201s. (40% speedup) -> with GC disabled : 165s. Running pybench --with-gc doesn't show any performance variation, which seems to imply the overhead of the heuristic is negligible.
msg83965 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2009-03-22 10:20
The patch looks fine to me. Some documentation is missing, still: is_tracked needs to be documented in the regular documentation, and SHOW_TRACK_COUNT should be documented in SpecialBuilds.txt. I'm not sure whether you actually want to integrate SHOW_TRACK_COUNT into the code base. If yes, it should be cleaned up a little. There should be no default #undef for it, and, IMO, it should count tuples, too.
msg84024 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2009-03-23 18:45
I've added tracking statistics for tuples, documentation for gc.is_tracked(), and changed _Py_Optimize to _Py_MaybeUntrack on a suggestion by Benjamin. Committed to trunk in r70546.
msg84026 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2009-03-23 18:59
Thanks for the review, by the way!

History
Date	User	Action	Args
2022-04-11 14:56:42	admin	set	github: 48938
2009-10-17 04:29:53	esam	set	nosy: + esam
2009-03-23 18:59:54	pitrou	set	messages: + msg84026
2009-03-23 18:45:10	pitrou	set	status: open -> closed resolution: accepted -> fixed messages: + msg84024
2009-03-22 13:46:48	pitrou	set	assignee: pitrou resolution: accepted
2009-03-22 10:20:31	loewis	set	messages: + msg83965
2009-03-14 16:31:13	pitrou	set	files: + binary-trees.py messages: + msg83594
2009-03-14 16:12:57	pitrou	set	files: + containeropts.patch messages: + msg83592
2009-03-14 16:12:30	pitrou	set	files: - tuple+dictopts-alt.patch
2009-03-14 16:12:26	pitrou	set	files: - tupleopts-alt.patch
2009-03-14 16:12:23	pitrou	set	files: - dictopts2.patch
2009-03-14 16:12:19	pitrou	set	files: - tupleopts.patch
2009-01-08 01:03:53	collinwinter	set	nosy: + collinwinter
2008-12-30 22:25:44	jcea	set	nosy: + jcea
2008-12-18 23:43:57	pitrou	set	messages: + msg78049
2008-12-18 23:17:19	loewis	set	messages: + msg78048
2008-12-18 12:04:15	pitrou	set	files: + tuple+dictopts-alt.patch messages: + msg78024
2008-12-18 10:38:23	pitrou	set	files: + tupleopts-alt.patch
2008-12-18 10:38:15	pitrou	set	files: - tupleopts-alt.patch
2008-12-18 10:37:41	pitrou	set	files: + tupleopts-alt.patch
2008-12-18 10:37:32	pitrou	set	files: - dictopts2.patch
2008-12-18 10:37:06	pitrou	set	files: + dictopts2.patch messages: + msg78021
2008-12-18 09:36:31	pitrou	set	messages: + msg78018
2008-12-18 09:28:48	pitrou	set	messages: + msg78017
2008-12-18 07:25:39	gregory.p.smith	set	nosy: + gregory.p.smith
2008-12-18 06:43:47	loewis	set	messages: + msg78013
2008-12-18 06:25:21	stutzbach	set	messages: + msg78012
2008-12-18 06:12:57	loewis	set	messages: + msg78011
2008-12-18 00:19:23	pitrou	set	files: - dictopts.patch
2008-12-18 00:15:45	pitrou	set	files: + dictopts2.patch messages: + msg78002
2008-12-18 00:12:03	rhettinger	set	nosy: + rhettinger
2008-12-18 00:04:55	pitrou	set	messages: + msg78001
2008-12-17 23:56:56	stutzbach	set	nosy: + stutzbach messages: + msg77999
2008-12-17 23:48:47	pitrou	set	files: + dictopts.patch nosy: + loewis messages: + msg77996
2008-12-17 23:41:08	pitrou	create