classification
Title: GC optimization: don't track simple tuples and dicts
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.1, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: pitrou Nosy List: collinwinter, esam, gregory.p.smith, jcea, loewis, pitrou, rhettinger, stutzbach
Priority: normal Keywords: patch

Created on 2008-12-17 23:41 by pitrou, last changed 2009-10-17 04:29 by esam. This issue is now closed.

Files
File name Uploaded Description Edit
containeropts.patch pitrou, 2009-03-14 16:12
binary-trees.py pitrou, 2009-03-14 16:31
Messages (19)
msg77995 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-17 23:41
Split out from #4074, here is a standalone patch which disables GC
tracking for simple tuples (tuples made of atomic objects or immutable
containers, possibly nested). The performance improvement can be
evaluated using tuple_gc_hell.py from #4074.

The patch also adds a function named is_tracked() to the gc module.
msg77996 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-17 23:48
This additional patch also optimizes simple dicts, it must be applied
over the tuples patch.

Together these two patches make it so that e.g. a dict of tuples or
strings won't be tracked at all, saving CPU cycles when collecting.
msg77999 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2008-12-17 23:56
I've looked very quickly over your patches to try to figure what rule
qualifies a tuple or dict as simple enough to be untracked, out of
curiosity.

For tuples, I think the rule is:
    If an object is immutable, and it points only to untracked objects,
the object can be untracked.

Is that right?  If so, you could perform the same optimization on the
frozenset() type.

Why do empty tuples have to be tracked?

The dict patch adds a boolean flag to the dict data structure to
indicate whether the dict is being tracked or not.  I think.  Couldn't
you use _PyObject_GC_IS_TRACKED() instead?

What's the rule for when a dict can be tracked or untracked?  Do you
need to recheck the rule every time the dict is modified?
msg78001 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 00:04
Le mercredi 17 décembre 2008 à 23:56 +0000, Daniel Stutzbach a écrit :
> For tuples, I think the rule is:
>     If an object is immutable, and it points only to untracked objects,
> the object can be untracked.

Roughly, yes. Exactly, it is "if it points only to untrackable objects".
That is, a tuple containing an untracked dict will still be tracked,
because the dict may mutate and become tracked later. But a tuple
containing an untracked tuple will itself be untracked, because the
enclosed tuple can't mutate.

> Is that right?  If so, you could perform the same optimization on the
> frozenset() type.

Yes, but I prefer to concentrate on those two core types now, which are
the most frequent. Once the principle is accepted we can extend the
implementation to other types.

> Why do empty tuples have to be tracked?

Actually, they are never tracked, which means calling
_PyObject_GC_UNTRACK on them would segfault.

> The dict patch adds a boolean flag to the dict data structure to
> indicate whether the dict is being tracked or not.  I think.  Couldn't
> you use _PyObject_GC_IS_TRACKED() instead?

Yes, probably. I thought a dedicated flag may be faster but I will try
without.

> Do you
> need to recheck the rule every time the dict is modified?

Only if it is not tracked. Once a dict is tracked, it cannot become
untracked again (there would be too much overhead, and the assumption is
that you usually create homogeneous containers, you don't create a dict
of mutable containers things and then replace the mutable containers
with simple atomic objects).
msg78002 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 00:15
So, I've tried without the dedicated flag in the dict object and it's as
fast as previously! Here is a new patch.
msg78011 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-18 06:12
IIUC, you try to find various places where tuples are created, and check
whether they create tuples that don't need tracking.

I think this approach is difficult to maintain, and also may miss some
objects.

I'd rather see that integrated into collection: e.g. when iterating over
survivors of a collection, check whether they can be untracked.
msg78012 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2008-12-18 06:25
Unfortunately, the check is O(n), so it can get a little expensive.  I
suppose that's no worse than traversing the tuple to look for a cycle,
though.  Perhaps it could be done at the same time?

Can _PyObject_GC_UNTRACK() be safely called from tp_traverse?
msg78013 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-18 06:43
> Unfortunately, the check is O(n), so it can get a little expensive.

Not in the typical case, I guess. *If* you have to go through all
objects, then likely you end up untracking the object. If you can't
untrack, you typically find a tracked object in the content quickly.

> I suppose that's no worse than traversing the tuple to look for a cycle,
> though.  

Correct. Collection is O(n) per object, anyway.

> Perhaps it could be done at the same time?
> Can _PyObject_GC_UNTRACK() be safely called from tp_traverse?

No. However, I doubt that iterating over the object twice is
significantly slower than iterating over it once and performing
both checks. Plus, the check for tracked elements can return
as soon as one such object is found.
msg78017 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 09:28
Le jeudi 18 décembre 2008 à 06:12 +0000, Martin v. Löwis a écrit :
> I think this approach is difficult to maintain, and also may miss some
> objects.

But what counts is where tuples can be created in massive numbers or
sizes: the eval loop, the tuple type's tp_new, and a couple of other
places. We don't need to optimize every single tuple created by the
interpreter or extension modules (and even the, one can simply call
_PyTuple_Optimize()).

> I'd rather see that integrated into collection: e.g. when iterating over
> survivors of a collection, check whether they can be untracked.

That's what I had tried at first, but it crashes. I haven't
investigated, but I suspect removing an object from the GC list while
that list is being walked isn't very well supported by the GC.

Also, this approach is more expensive since it involves checking tuples
again and again each time they are walked by the GC. The patch only does
it once, when the tuple is being built.
msg78018 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 09:36
> > I'd rather see that integrated into collection: e.g. when iterating over
> > survivors of a collection, check whether they can be untracked.
> 
> That's what I had tried at first, but it crashes.

To be clearer, I tried adding it to the tp_traverse method of tuples. I
could try to add it to the collection machinery in gcmodule.c instead.
msg78021 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 10:37
Here is an alternate patch which does the tuple optimization in the GC.
I don't like it a lot, first because it puts some type-specific code in
gcmodule, second because it invalidates the dict optimization (since
tuples are untracked only when walked by the GC, they are still tracked
when added to a dict and therefore the dict will be tracked too).
msg78024 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 12:04
Here is a patch which combines the alternate approach (untrack simple
objects when doing a GC collection) for tuples and dicts.
I'm surprised by the results: after a full regression test suite, there
are roughly 55000 tracked objects (measured with len(gc.get_objects()))
while there are 150000 of them with the original interpreter.

Some eyes are welcome.
msg78048 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-18 23:17
> But what counts is where tuples can be created in massive numbers or
> sizes: the eval loop, the tuple type's tp_new, and a couple of other
> places. We don't need to optimize every single tuple created by the
> interpreter or extension modules (and even the, one can simply call
> _PyTuple_Optimize()).

Still, I think this patch does too much code duplication. There should
be only a single function that does the optional untracking; this then
gets called from multiple places.

> Also, this approach is more expensive

I'm skeptical. It could well be *less* expensive, namely if many tuples
get deleted before gc even happens. Then you currently check whether you
can untrack them, which is pointless if the tuple gets deallocated
quickly, anyway.
msg78049 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-18 23:43
Hello again,

> Still, I think this patch does too much code duplication. There should
> be only a single function that does the optional untracking; this then
> gets called from multiple places.

The point was to avoid slowing down the critical path of tuple creation
in the most common cases. If it is considered too hackish, then I agree
the other patch (tuple+dictopts-alt.patch) should be considered instead.

By the way, perhaps pybench should grow a GC test (e.g. derived from
Greg's script). What do you think?
msg83592 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-14 16:12
Here is a new patch against trunk.
msg83594 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-14 16:31
Here is a benchmark ripped from the Computer Language Shootout
(http://shootout.alioth.debian.org).
Running "binary_trees.py 18" takes the following time:
-> without patch:     330s.
-> with patch:        201s. (40% speedup)
-> with GC disabled : 165s.

Running pybench --with-gc doesn't show any performance variation, which
seems to imply the overhead of the heuristic is negligible.
msg83965 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-03-22 10:20
The patch looks fine to me. Some documentation is missing, still:
is_tracked needs to be documented in the regular documentation, and
SHOW_TRACK_COUNT should be documented in SpecialBuilds.txt.

I'm not sure whether you actually want to integrate SHOW_TRACK_COUNT
into the code base. If yes, it should be cleaned up a little. There
should be no default #undef for it, and, IMO, it should count tuples, too.
msg84024 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-23 18:45
I've added tracking statistics for tuples, documentation for
gc.is_tracked(), and changed _Py*_Optimize to _Py*_MaybeUntrack on a
suggestion by Benjamin. Committed to trunk in r70546.
msg84026 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-23 18:59
Thanks for the review, by the way!
History
Date User Action Args
2009-10-17 04:29:53esamsetnosy: + esam
2009-03-23 18:59:54pitrousetmessages: + msg84026
2009-03-23 18:45:10pitrousetstatus: open -> closed
resolution: accepted -> fixed
messages: + msg84024
2009-03-22 13:46:48pitrousetassignee: pitrou
resolution: accepted
2009-03-22 10:20:31loewissetmessages: + msg83965
2009-03-14 16:31:13pitrousetfiles: + binary-trees.py

messages: + msg83594
2009-03-14 16:12:57pitrousetfiles: + containeropts.patch

messages: + msg83592
2009-03-14 16:12:30pitrousetfiles: - tuple+dictopts-alt.patch
2009-03-14 16:12:26pitrousetfiles: - tupleopts-alt.patch
2009-03-14 16:12:23pitrousetfiles: - dictopts2.patch
2009-03-14 16:12:19pitrousetfiles: - tupleopts.patch
2009-01-08 01:03:53collinwintersetnosy: + collinwinter
2008-12-30 22:25:44jceasetnosy: + jcea
2008-12-18 23:43:57pitrousetmessages: + msg78049
2008-12-18 23:17:19loewissetmessages: + msg78048
2008-12-18 12:04:15pitrousetfiles: + tuple+dictopts-alt.patch
messages: + msg78024
2008-12-18 10:38:23pitrousetfiles: + tupleopts-alt.patch
2008-12-18 10:38:15pitrousetfiles: - tupleopts-alt.patch
2008-12-18 10:37:41pitrousetfiles: + tupleopts-alt.patch
2008-12-18 10:37:32pitrousetfiles: - dictopts2.patch
2008-12-18 10:37:06pitrousetfiles: + dictopts2.patch
messages: + msg78021
2008-12-18 09:36:31pitrousetmessages: + msg78018
2008-12-18 09:28:48pitrousetmessages: + msg78017
2008-12-18 07:25:39gregory.p.smithsetnosy: + gregory.p.smith
2008-12-18 06:43:47loewissetmessages: + msg78013
2008-12-18 06:25:21stutzbachsetmessages: + msg78012
2008-12-18 06:12:57loewissetmessages: + msg78011
2008-12-18 00:19:23pitrousetfiles: - dictopts.patch
2008-12-18 00:15:45pitrousetfiles: + dictopts2.patch
messages: + msg78002
2008-12-18 00:12:03rhettingersetnosy: + rhettinger
2008-12-18 00:04:55pitrousetmessages: + msg78001
2008-12-17 23:56:56stutzbachsetnosy: + stutzbach
messages: + msg77999
2008-12-17 23:48:47pitrousetfiles: + dictopts.patch
nosy: + loewis
messages: + msg77996
2008-12-17 23:41:08pitroucreate