Author ntoronto
Recipients ntoronto
Date 2007-12-05.21:00:19
I've attached a patch to accelerate type and instance attribute lookups
using a specialized cache. It includes a benchmarking script that tests
both successful and failing lookups on list, dict, and tuple, and on a
simple, deep class hierarchy. Benchmark results are here:

Everything tested is faster except list.__init__ and
list().__init__ (and I haven't got a clue why, so some pointers would be
nice). Pybench is faster overall. TryRaiseExcept is faster for some
non-obvious reason. CreateNewInstances is a little slower, which I'll
discuss in a bit. Setting type attributes should be slower (it now has
to invalidate subclass cache entries), but I haven't benchmarked that
yet. It may not happen often enough to matter, as long as it's not
noticeably slow in general usage.

Where the patch is a little slower in benchmarks, it may be partly
because I removed a manually inlined _PyType_Lookup from
PyObject_GenericGetAttr. Something like it can be put back if it needs
to be.

It works in a fairly obvious way. Every type has a tp_cache, a custom
dict type that caches, for a given name, the first value found in the
type dicts along the MRO. Lazy cached lookups are done via
_PyType_CachingLookup, a drop-in replacement for _PyType_Lookup. Since
the only way to set an attribute on a type is through its setattr,
type's setattr notifies subclasses to invalidate the affected cache
entries.
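A rough Python model of that scheme might look like the sketch below.
The class and method names (CachingLookup, lookup, invalidate) are
hypothetical; the patch itself does this in C via tp_cache in
typeobject.c.

```python
# Simplified model of per-type MRO lookup caching with invalidation.
# All names here are illustrative, not from the patch.

class CachingLookup:
    """Cache the first hit for a name along a class's MRO."""

    def __init__(self, cls):
        self.cls = cls
        self.cache = {}

    def lookup(self, name):
        # Return the cached answer (including a cached "missing") if we
        # have one; otherwise walk the MRO once and remember the result.
        if name in self.cache:
            return self.cache[name]
        for base in self.cls.__mro__:
            if name in vars(base):
                value = vars(base)[name]
                break
        else:
            value = None  # name is missing along the whole MRO
        self.cache[name] = value
        return value

    def invalidate(self, name):
        # Mirrors type's setattr notifying subclasses to drop the
        # affected entry.
        self.cache.pop(name, None)


class A:
    x = 1

class B(A):
    pass

cache = CachingLookup(B)
first = cache.lookup("x")   # found on A via B's MRO, now cached
A.x = 2                     # in the patch, type's setattr triggers:
cache.invalidate("x")
second = cache.lookup("x")  # re-walks the MRO, sees the new value
```

Note that this toy version conflates "missing" with a value of None;
the patch distinguishes them properly (see the me_value == NULL
convention below).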

The cache dict is custom for two reasons:

1. The regular dict is a little slower because it's more general. The
custom dict is restricted to string-exact keys (types fall back to no
caching if this constraint is violated). Because it's internal to
typeobject.c, it's safe for lookups to return entry pointers rather than
values, so lookups only have to be done once, even on cache misses.
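To illustrate the string-exact constraint, something like the following
check (the function name is hypothetical) decides whether a key is safe
for the specialized dict:

```python
# Only keys whose type is exactly str are safe for the fast-path dict;
# a str subclass may override __hash__/__eq__ and break its assumptions.
# cacheable_key is an illustrative name, not from the patch.

def cacheable_key(name):
    # Deliberately type(...) is str, not isinstance(): subclasses fail.
    return type(name) is str

class TrickyStr(str):
    def __hash__(self):
        return 0  # a key like this would defeat a specialized dict

ok = cacheable_key("append")
bad = cacheable_key(TrickyStr("append"))
```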

2. Caching missing attributes is crucial for speed on instance attribute
lookups. Both type and metatype instances check all the way through the
MRO for a descriptor before even trying to find an attribute. It's
usually missing. Having to go through the cache machinery to find a
missing attribute for every attribute lookup is expensive. However,
storing every missing-attribute query in the cache is bad: the cache
can grow without bound through hasattr(inst, <random name>) calls.

What's needed is a dict that knows that some of its entries are
transient and doesn't copy them over on resize. It wasn't clear how to
implement that efficiently with a regular dict (I suspect it's not
possible), so I created a new dict type that considers entries transient
(meaning the attribute is missing) when me_value == NULL.
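A toy Python model of that behavior (the class name and sizes are
illustrative, not the patch's C implementation) might look like:

```python
# Cache whose "missing" entries (value None, standing in for
# me_value == NULL) are transient: they are dropped on resize rather
# than copied into the new table, so random hasattr() probes cannot
# grow the cache without bound.

class TransientCache:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.table = {}

    def store(self, key, value):
        # value None means "attribute known to be missing".
        if len(self.table) >= self.capacity:
            self._resize()
        self.table[key] = value

    def _resize(self):
        # Only real (non-transient) entries survive the resize.
        self.table = {k: v for k, v in self.table.items()
                      if v is not None}
        self.capacity *= 2


cache = TransientCache(capacity=2)
cache.store("a", 1)            # a real entry
cache.store("missing1", None)  # a cached negative result
cache.store("missing2", None)  # triggers a resize first
```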

This is also very good for failing hasattr(...) calls and
try...inst.method()...except style duck typing.
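The two duck-typing patterns that benefit are the usual ones, e.g.:

```python
# Repeated failing hasattr() probes and try/except-style calls both
# hit the cached negative lookup instead of re-walking the MRO.

class Reader:
    def read(self):
        return "data"

results = []
for obj in [Reader(), object()]:
    if hasattr(obj, "read"):        # failing probe is a cached miss
        results.append(obj.read())
    else:
        try:
            results.append(obj.fetch())  # AttributeError path, also cheap
        except AttributeError:
            results.append(None)
```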

Now, about the CreateNewInstances benchmark. It creates three classes
with __init__ methods that assign a few attributes. The missing
descriptors are cached, and then the cache is resized, which clears the
cached missing descriptors. Increasing the minimum cache size from 4 to
8 clears up the problem. However, for any class, SOME minimum cache size
will properly cache missing descriptors and some other one won't.

I have some solutions for this floating around in my head, which I'll
try shortly. (One idea is for missing attributes, if they're not missing
on the *instance*, to be permanent in the cache.) But I'd like to get
this patch off my hard drive and into other people's hands for testing,
poking, prodding, and getting feedback on what's going right and what's not.
File name Uploaded
fastattr-0.patch.txt ntoronto, 2007-12-05.21:00:19