
PATCH: Attribute lookup caching #45901

Closed
ntoronto mannequin opened this issue Dec 5, 2007 · 6 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

ntoronto mannequin commented Dec 5, 2007

BPO 1560
Nosy @gvanrossum, @rhettinger
Files
  • fastattr-0.patch.txt

    Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2007-12-10.19:50:25.833>
    created_at = <Date 2007-12-05.21:00:24.961>
    labels = ['interpreter-core']
    title = 'PATCH: Attribute lookup caching'
    updated_at = <Date 2007-12-10.21:22:25.064>
    user = 'https://bugs.python.org/ntoronto'

    bugs.python.org fields:

    activity = <Date 2007-12-10.21:22:25.064>
    actor = 'rhettinger'
    assignee = 'none'
    closed = True
    closed_date = <Date 2007-12-10.19:50:25.833>
    closer = 'gvanrossum'
    components = ['Interpreter Core']
    creation = <Date 2007-12-05.21:00:24.961>
    creator = 'ntoronto'
    dependencies = []
    files = ['8883']
    hgrepos = []
    issue_num = 1560
    keywords = ['patch']
    message_count = 6.0
    messages = ['58229', '58355', '58360', '58365', '58367', '58371']
    nosy_count = 3.0
    nosy_names = ['gvanrossum', 'rhettinger', 'ntoronto']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1560'
    versions = ['Python 2.6']

    ntoronto mannequin commented Dec 5, 2007

    I've attached a patch to accelerate type and instance lookups using a
    specialized cache. It includes a benchmarking script (fastattr_test.py)
    that tests both successful and failing lookups on list, dict and tuple,
    and a simple, deep hierarchy. Benchmark results are here:

    http://spreadsheets.google.com/ccc?key=pHIJrYc_pnIUpTm6QSG2gZg&hl=en_US

    Everything tested in fastattr_test.py is faster except list.__init__ and
    list().__init__ (and I haven't got a clue why, so some pointers would be
    nice). Pybench is faster overall. TryRaiseExcept is faster for some
    non-obvious reason. CreateNewInstances is a little slower, which I'll
    discuss in a bit. Setting type attributes is slower, but I haven't
    benchmarked that yet. It may not happen often enough that we care as
    long as it's not noticeably slow in general usage.

    Where the patch is a little slower in benchmarks, it may be in part
    because I removed a manually inlined _PyType_Lookup from
    PyObject_GenericGetAttr. Something like it can be put back if it needs
    to be.

    It works in a fairly obvious way. Every type has a tp_cache, which is a
    custom dict type that caches the first value in a type dict in the MRO
    for a given name. Lazy cached lookups are done via
    _PyType_CachingLookup, which is a drop-in replacement for
    _PyType_Lookup. The only way to set an attribute on a type is via its
    setattr, so type's setattr notifies subclasses to invalidate specific
    cache entries.
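
    The scheme described above can be sketched in Python. This is an
    illustrative model, not the patch's actual C implementation in
    typeobject.c: the `TypeCache` class, `set_type_attr`, and
    `all_subclasses` helpers are hypothetical names chosen for the sketch.

    ```python
    # Sketch of the caching scheme: each type keeps a cache mapping
    # attribute names to the first value found along the MRO, and setting
    # a type attribute invalidates that name in the caches of the type
    # and all of its subclasses.

    class TypeCache:
        def __init__(self, cls):
            self.cls = cls
            self.cache = {}

        def lookup(self, name):
            # Lazily fill the cache on first lookup of each name.
            if name not in self.cache:
                value = None
                for base in self.cls.__mro__:
                    if name in base.__dict__:
                        value = base.__dict__[name]
                        break
                self.cache[name] = value  # None caches a miss, too
            return self.cache[name]

        def invalidate(self, name):
            self.cache.pop(name, None)


    def all_subclasses(cls):
        out = []
        for sub in cls.__subclasses__():
            out.append(sub)
            out.extend(all_subclasses(sub))
        return out


    def set_type_attr(cls, caches, name, value):
        # The only way to set an attribute on a type is via its setattr,
        # so it can notify subclasses to drop the stale cache entry.
        setattr(cls, name, value)
        for sub in [cls] + all_subclasses(cls):
            caches[sub].invalidate(name)
    ```

    For example, after `set_type_attr(A, caches, "x", 2)`, a subsequent
    `caches[B].lookup("x")` on a subclass B of A re-walks the MRO and
    returns the new value instead of a stale cached one.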

    The cache dict is custom for two reasons:

    1. The regular dict is a little slower because it's more general. The
      custom dict is restricted to string-exact keys (types fall back to no
      caching if this constraint is violated). Because it's internal to
      typeobject.c, it's safe for lookups to return entry pointers rather than
      values, so lookups only have to be done once, even on cache misses.

    2. Caching missing attributes is crucial for speed on instance attribute
      lookups. Both type and metatype instances check all the way through the
      MRO for a descriptor before even trying to find an attribute. It's
      usually missing. Having to go through the cache machinery to find a
      missing attribute for every attribute lookup is expensive. However,
      storing all missing attribute queries in the cache is bad - it can grow
      unboundedly through hasattr(inst, <random name>) calls.

    What's needed is a dict that knows that some of its entries are
    transient and doesn't copy them over on resize. It wasn't clear how to
    implement that efficiently with a regular dict (I suspect it's not
    possible), so I created a new dict type that considers entries transient
    (meaning the attribute is missing) when me_value == NULL.

    This is also very good for failing hasattr(...) calls and
    try...inst.method()...except style duck typing.
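
    The resize behavior can be sketched as follows. This is an
    illustrative Python model of the idea, not the patch's custom C dict;
    the `MissAwareCache` name and its methods are hypothetical, and the
    real implementation marks transient entries with me_value == NULL
    rather than a Python None.

    ```python
    # Sketch of a cache whose "transient" entries (cached misses, stored
    # as None) are discarded on resize instead of being copied over, so
    # hasattr(inst, <random name>) probes cannot grow it unboundedly.

    class MissAwareCache:
        def __init__(self, capacity=4):
            self.capacity = capacity
            self.entries = {}  # name -> value; None means "missing"

        def store(self, name, value):
            if len(self.entries) >= self.capacity:
                self._resize()
            self.entries[name] = value

        def _resize(self):
            # Double the capacity, keeping only non-transient entries:
            # cached misses are cheap to recompute and may be unbounded.
            self.capacity *= 2
            self.entries = {k: v for k, v in self.entries.items()
                            if v is not None}

        def lookup(self, name):
            # Returns (cached, value); a cached miss is (True, None).
            if name in self.entries:
                return True, self.entries[name]
            return False, None
    ```

    Dropping misses on resize is what makes the cache-size interaction
    described next possible: whether a given miss survives depends on
    when the cache happens to resize.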

    Now, about the CreateNewInstances benchmark. It creates three classes
    with __init__ methods that assign a few attributes. The missing
    descriptors are cached, and then the cache is resized, which clears the
    cached missing descriptors. Increasing the minimum cache size from 4 to
    8 clears up the problem. However, for any class, SOME minimum cache size
    will properly cache missing descriptors and some other one won't.

    I have some solutions for this floating around in my head, which I'll
    try shortly. (One idea is for missing attributes, if they're not missing
    on the *instance*, to be permanent in the cache.) But I'd like to get
    this patch off my hard drive and into other people's hands for testing,
    poking, prodding, and getting feedback on what's going right and what's not.

    ntoronto mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Dec 5, 2007
    gvanrossum (Member) commented:

    Are you withdrawing this in favor of bpo-1568?


    ntoronto mannequin commented Dec 10, 2007

    Yes, as well as bpo-1700288 (Armin's attribute lookup caching patch
    ported to 2.6) or bpo-1685986 (Armin's original for 2.4), or whatever
    Raymond finds most convenient.

    gvanrossum (Member) commented:

    That's still ambiguous -- do you want any of those to be closed too?
    Clearly we're not going to patch 2.4.


    ntoronto mannequin commented Dec 10, 2007

    Sorry - I'll be more clear. I'd love to see 2.6 get attribute lookup
    caching, and Kevin's port (bpo-1700288) of Armin's original 2.4 patch
    (bpo-1685986) does that for 2.6. bpo-1685986 (2.4) should be closed and
    bpo-1700288 (2.6) should remain open. But Raymond has indicated that he's
    currently working on bpo-1685986 - I think for 2.6, but it's not clear.

    rhettinger (Contributor) commented:

    I'm trying to look at all of them. Having it split into several patches
    and getting frequent updates and posts is making it difficult.

    Please communicate with me directly (python at rcn dot com) so I can
    find out which versions are the latest and the reason behind each variation.

    ezio-melotti transferred this issue from another repository Apr 10, 2022