GC optimization: don't track simple tuples and dicts #48938

Closed
pitrou opened this issue Dec 17, 2008 · 19 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@pitrou
Member

pitrou commented Dec 17, 2008

BPO 4688
Nosy @loewis, @rhettinger, @gpshead, @jcea, @pitrou
Files
  • containeropts.patch
  • binary-trees.py
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/pitrou'
    closed_at = <Date 2009-03-23.18:45:10.587>
    created_at = <Date 2008-12-17.23:41:08.921>
    labels = ['interpreter-core', 'performance']
    title = "GC optimization: don't track simple tuples and dicts"
    updated_at = <Date 2009-10-17.04:29:53.042>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2009-10-17.04:29:53.042>
    actor = 'esam'
    assignee = 'pitrou'
    closed = True
    closed_date = <Date 2009-03-23.18:45:10.587>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2008-12-17.23:41:08.921>
    creator = 'pitrou'
    dependencies = []
    files = ['13331', '13332']
    hgrepos = []
    issue_num = 4688
    keywords = ['patch']
    message_count = 19.0
    messages = ['77995', '77996', '77999', '78001', '78002', '78011', '78012', '78013', '78017', '78018', '78021', '78024', '78048', '78049', '83592', '83594', '83965', '84024', '84026']
    nosy_count = 8.0
    nosy_names = ['loewis', 'collinwinter', 'rhettinger', 'gregory.p.smith', 'jcea', 'pitrou', 'stutzbach', 'esam']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue4688'
    versions = ['Python 3.1', 'Python 2.7']

    @pitrou
    Member Author

    pitrou commented Dec 17, 2008

    Split out from bpo-4074, here is a standalone patch which disables GC
    tracking for simple tuples (tuples made of atomic objects or immutable
    containers, possibly nested). The performance improvement can be
    evaluated using tuple_gc_hell.py from bpo-4074.

    The patch also adds a function named is_tracked() to the gc module.
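
    For illustration, here is the kind of thing the new gc.is_tracked() function
    lets you observe interactively (a sketch; the exact point at which a simple
    tuple becomes untracked depends on which variant of the patch is applied, but
    in every variant a surviving simple tuple ends up untracked after a collection):

    ```python
    import gc

    t = (1, 2.0, "three")    # only atomic, untrackable contents
    gc.collect()             # give the collector a chance to untrack survivors
    print(gc.is_tracked(t))  # expected: False -- nothing in t can be part of a cycle

    u = (1, [2, 3])          # the list may participate in a reference cycle
    gc.collect()
    print(gc.is_tracked(u))  # expected: True -- u must stay tracked
    ```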

    @pitrou pitrou added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Dec 17, 2008
    @pitrou
    Member Author

    pitrou commented Dec 17, 2008

    This additional patch also optimizes simple dicts; it must be applied
    on top of the tuples patch.

    Together these two patches make it so that e.g. a dict of tuples or
    strings won't be tracked at all, saving CPU cycles when collecting.
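
    A minimal sketch of the intended effect, as observable through gc.is_tracked()
    (using only atomic values here so the example does not depend on when tuples
    themselves get untracked):

    ```python
    import gc

    d = {"a": 1, "b": "two"}  # only atomic (untrackable) keys and values
    print(gc.is_tracked(d))   # expected: False -- the dict need not be tracked

    d["c"] = []               # a list can take part in a reference cycle
    print(gc.is_tracked(d))   # expected: True -- the dict is now tracked
    ```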

    @stutzbach
    Mannequin

    stutzbach mannequin commented Dec 17, 2008

    Out of curiosity, I've looked very quickly over your patches to try to
    figure out what rule qualifies a tuple or dict as simple enough to be
    untracked.

    For tuples, I think the rule is:
    If an object is immutable, and it points only to untracked objects,
    the object can be untracked.

    Is that right? If so, you could perform the same optimization on the
    frozenset() type.

    Why do empty tuples have to be tracked?

    The dict patch adds a boolean flag to the dict data structure to
    indicate whether the dict is being tracked or not. I think. Couldn't
    you use _PyObject_GC_IS_TRACKED() instead?

    What's the rule for when a dict can be tracked or untracked? Do you
    need to recheck the rule every time the dict is modified?

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    On Wednesday, 17 December 2008 at 23:56 +0000, Daniel Stutzbach wrote:

    > For tuples, I think the rule is:
    > If an object is immutable, and it points only to untracked objects,
    > the object can be untracked.

    Roughly, yes. Exactly, it is "if it points only to untrackable objects".
    That is, a tuple containing an untracked dict will still be tracked,
    because the dict may mutate and become tracked later. But a tuple
    containing an untracked tuple will itself be untracked, because the
    enclosed tuple can't mutate.

    > Is that right? If so, you could perform the same optimization on the
    > frozenset() type.

    Yes, but I prefer to concentrate on those two core types now, which are
    the most frequent. Once the principle is accepted we can extend the
    implementation to other types.

    > Why do empty tuples have to be tracked?

    Actually, they are never tracked, which means calling
    _PyObject_GC_UNTRACK on them would segfault.

    > The dict patch adds a boolean flag to the dict data structure to
    > indicate whether the dict is being tracked or not. I think. Couldn't
    > you use _PyObject_GC_IS_TRACKED() instead?

    Yes, probably. I thought a dedicated flag may be faster but I will try
    without.

    > Do you need to recheck the rule every time the dict is modified?

    Only if it is not tracked. Once a dict is tracked, it cannot become
    untracked again (there would be too much overhead, and the assumption is
    that you usually create homogeneous containers: you don't create a dict
    of mutable containers and then replace them with simple atomic objects).
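
    A small sketch of that untracked-vs-untrackable distinction, as seen through
    gc.is_tracked() (with the creation-time patch the gc.collect() calls are not
    needed; with the collection-time variant that was eventually committed they are):

    ```python
    import gc

    inner = (1, 2)
    outer = (inner, "x")
    gc.collect()                 # let the GC untrack simple tuples
    gc.collect()                 # outer may need a second pass, once inner is untracked
    print(gc.is_tracked(inner))  # expected: False
    print(gc.is_tracked(outer))  # expected: False -- inner can never become trackable

    d = {}                       # currently untracked, but it could mutate
    t = (d, 1)
    gc.collect()
    print(gc.is_tracked(t))      # expected: True -- the dict may become tracked later
    ```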

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    So, I've tried without the dedicated flag in the dict object and it's as
    fast as previously! Here is a new patch.

    @loewis
    Mannequin

    loewis mannequin commented Dec 18, 2008

    IIUC, you try to find various places where tuples are created, and check
    whether they create tuples that don't need tracking.

    I think this approach is difficult to maintain, and also may miss some
    objects.

    I'd rather see that integrated into collection: e.g. when iterating over
    survivors of a collection, check whether they can be untracked.

    @stutzbach
    Mannequin

    stutzbach mannequin commented Dec 18, 2008

    Unfortunately, the check is O(n), so it can get a little expensive. I
    suppose that's no worse than traversing the tuple to look for a cycle,
    though. Perhaps it could be done at the same time?

    Can _PyObject_GC_UNTRACK() be safely called from tp_traverse?

    @loewis
    Mannequin

    loewis mannequin commented Dec 18, 2008

    > Unfortunately, the check is O(n), so it can get a little expensive.

    Not in the typical case, I guess. *If* you have to go through all
    objects, then likely you end up untracking the object. If you can't
    untrack, you typically find a tracked object in the content quickly.

    > I suppose that's no worse than traversing the tuple to look for a cycle,
    > though.

    Correct. Collection is O(n) per object, anyway.

    > Perhaps it could be done at the same time?
    > Can _PyObject_GC_UNTRACK() be safely called from tp_traverse?

    No. However, I doubt that iterating over the object twice is
    significantly slower than iterating over it once and performing
    both checks. Plus, the check for tracked elements can return
    as soon as one such object is found.
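
    For illustration only, a rough pure-Python model of the early-exit check being
    discussed (the real check lives in C inside the collector; the element
    classification below is an approximation, not the patch's code):

    ```python
    import gc

    # Objects of these types can never participate in a reference cycle.
    ATOMIC_TYPES = (type(None), bool, int, float, complex, str, bytes)

    def element_may_need_tracking(obj):
        """Approximate 'may this tuple element ever need GC tracking?'."""
        if isinstance(obj, ATOMIC_TYPES):
            return False
        if isinstance(obj, tuple):
            # A tuple the GC has already untracked can never become trackable again.
            return gc.is_tracked(obj)
        # Anything else (dict, list, instances, ...) might need tracking later.
        return True

    def tuple_can_be_untracked(t):
        # Bails out as soon as one element that may need tracking is found --
        # the early return mentioned above.
        return not any(element_may_need_tracking(item) for item in t)

    print(tuple_can_be_untracked((1, "a", 2.5)))  # True
    print(tuple_can_be_untracked((1, [])))        # False
    ```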

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    On Thursday, 18 December 2008 at 06:12 +0000, Martin v. Löwis wrote:

    > I think this approach is difficult to maintain, and also may miss some
    > objects.

    But what counts is where tuples can be created in massive numbers or
    sizes: the eval loop, the tuple type's tp_new, and a couple of other
    places. We don't need to optimize every single tuple created by the
    interpreter or extension modules (and even then, one can simply call
    _PyTuple_Optimize()).

    > I'd rather see that integrated into collection: e.g. when iterating over
    > survivors of a collection, check whether they can be untracked.

    That's what I had tried at first, but it crashes. I haven't
    investigated, but I suspect removing an object from the GC list while
    that list is being walked isn't very well supported by the GC.

    Also, this approach is more expensive since it involves checking tuples
    again and again each time they are walked by the GC. The patch only does
    it once, when the tuple is being built.

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    > I'd rather see that integrated into collection: e.g. when iterating over
    > survivors of a collection, check whether they can be untracked.

    > That's what I had tried at first, but it crashes.

    To be clearer, I tried adding it to the tp_traverse method of tuples. I
    could try to add it to the collection machinery in gcmodule.c instead.

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    Here is an alternate patch which does the tuple optimization in the GC.
    I don't like it a lot, first because it puts some type-specific code in
    gcmodule, second because it invalidates the dict optimization (since
    tuples are untracked only when walked by the GC, they are still tracked
    when added to a dict and therefore the dict will be tracked too).

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    Here is a patch which combines the alternate approach (untrack simple
    objects when doing a GC collection) for tuples and dicts.
    I'm surprised by the results: after a full regression test suite, there
    are roughly 55000 tracked objects (measured with len(gc.get_objects()))
    while there are 150000 of them with the original interpreter.

    Some eyes are welcome.
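
    The figures quoted above can be reproduced along these lines (a sketch; the
    exact count naturally depends on what has been imported and executed):

    ```python
    import gc

    gc.collect()                  # let the collector untrack whatever it can
    print(len(gc.get_objects()))  # number of objects currently tracked by the GC
    ```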

    @loewis
    Mannequin

    loewis mannequin commented Dec 18, 2008

    > But what counts is where tuples can be created in massive numbers or
    > sizes: the eval loop, the tuple type's tp_new, and a couple of other
    > places. We don't need to optimize every single tuple created by the
    > interpreter or extension modules (and even then, one can simply call
    > _PyTuple_Optimize()).

    Still, I think this patch does too much code duplication. There should
    be only a single function that does the optional untracking; this then
    gets called from multiple places.

    > Also, this approach is more expensive

    I'm skeptical. It could well be *less* expensive, namely if many tuples
    get deleted before gc even happens. Then you currently check whether you
    can untrack them, which is pointless if the tuple gets deallocated
    quickly, anyway.

    @pitrou
    Member Author

    pitrou commented Dec 18, 2008

    Hello again,

    > Still, I think this patch does too much code duplication. There should
    > be only a single function that does the optional untracking; this then
    > gets called from multiple places.

    The point was to avoid slowing down the critical path of tuple creation
    in the most common cases. If it is considered too hackish, then I agree
    the other patch (tuple+dictopts-alt.patch) should be considered instead.

    By the way, perhaps pybench should grow a GC test (e.g. derived from
    Greg's script). What do you think?

    @pitrou
    Member Author

    pitrou commented Mar 14, 2009

    Here is a new patch against trunk.

    @pitrou
    Member Author

    pitrou commented Mar 14, 2009

    Here is a benchmark ripped from the Computer Language Shootout
    (http://shootout.alioth.debian.org).
    Running "binary_trees.py 18" takes the following time:
    -> without patch: 330s.
    -> with patch: 201s. (40% speedup)
    -> with GC disabled : 165s.

    Running pybench --with-gc doesn't show any performance variation, which
    seems to imply the overhead of the heuristic is negligible.
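
    For reference, a self-contained sketch of this kind of comparison; it uses a
    stand-in tuple-tree workload rather than the attached binary-trees.py, so the
    absolute numbers are not comparable to those above:

    ```python
    import gc
    import time

    def make_tree(depth):
        # Build a binary tree out of nested tuples -- a GC-heavy workload in the
        # spirit of the shootout benchmark (a stand-in, not the attached script).
        if depth == 0:
            return (None, None)
        return (make_tree(depth - 1), make_tree(depth - 1))

    def timed(depth, with_gc):
        if not with_gc:
            gc.disable()
        try:
            start = time.time()
            make_tree(depth)
            return time.time() - start
        finally:
            gc.enable()

    print("gc enabled :", timed(18, True))
    print("gc disabled:", timed(18, False))
    ```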

    @loewis
    Mannequin

    loewis mannequin commented Mar 22, 2009

    The patch looks fine to me. Some documentation is missing, still:
    is_tracked needs to be documented in the regular documentation, and
    SHOW_TRACK_COUNT should be documented in SpecialBuilds.txt.

    I'm not sure whether you actually want to integrate SHOW_TRACK_COUNT
    into the code base. If yes, it should be cleaned up a little. There
    should be no default #undef for it, and, IMO, it should count tuples, too.

    @pitrou pitrou self-assigned this Mar 22, 2009
    @pitrou
    Member Author

    pitrou commented Mar 23, 2009

    I've added tracking statistics for tuples, documentation for
    gc.is_tracked(), and changed _Py*_Optimize to _Py*_MaybeUntrack on a
    suggestion by Benjamin. Committed to trunk in r70546.

    @pitrou pitrou closed this as completed Mar 23, 2009
    @pitrou
    Member Author

    pitrou commented Mar 23, 2009

    Thanks for the review, by the way!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022