Issue 1560: PATCH: Attribute lookup caching

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/45901

classification

Title:	PATCH: Attribute lookup caching
Type:		Stage:
Components:	Interpreter Core	Versions:	Python 2.6

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:		Nosy List:	gvanrossum, ntoronto, rhettinger
Priority:	normal	Keywords:	patch

Created on 2007-12-05 21:00 by ntoronto, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
fastattr-0.patch.txt	ntoronto, 2007-12-05 21:00

Messages (6)
msg58229 - (view)	Author: Neil Toronto (ntoronto)	Date: 2007-12-05 21:00
I've attached a patch to accelerate type and instance lookups using a specialized cache. It includes a benchmarking script (fastattr_test.py) that tests both successful and failing lookups on list, dict and tuple, and a simple, deep hierarchy. Benchmark results are here: http://spreadsheets.google.com/ccc?key=pHIJrYc_pnIUpTm6QSG2gZg&hl=en_US Everything tested in fastattr_test.py is faster except list.__init__ and list().__init__ (and I haven't got a clue why, so some pointers would be nice). Pybench is faster overall. TryRaiseExcept is faster for some non-obvious reason. CreateNewInstances is a little slower, which I'll discuss in a bit. Setting type attributes is slower, but I haven't benchmarked that yet. It may not happen often enough that we care as long as it's not noticeably slow in general usage. In benchmarks the patch does a little slower, it may be in part because I removed a manually inlined _PyType_Lookup from PyObject_GenericGetAttr. Something like it can be put back if it needs to be. It works in a fairly obvious way. Every type has a tp_cache, which is a custom dict type that caches the first value in a type dict in the MRO for a given name. Lazy cached lookups are done via _PyType_CachingLookup, which is a drop-in replacement for _PyType_Lookup. The only way to set an attribute on a type is via its setattr, so type's setattr notifies subclasses to invalidate specific cache entries. The cache dict is custom for two reasons: 1. The regular dict is a little slower because it's more general. The custom dict is restricted to string-exact keys (types fall back to no caching if this constraint is violated). Because it's internal to typeobject.c, it's safe for lookups to return entry pointers rather than values, so lookups only have to be done once, even on cache misses. 2. Caching missing attributes is crucial for speed on instance attribute lookups. Both type and metatype instances check all the way through the MRO for a descriptor before even trying to find an attribute. It's usually missing. Having to go through the cache machinery to find a missing attribute for every attribute lookup is expensive. However, storing all missing attribute queries in the cache is bad - it can grow unboundedly through hasattr(inst, <random name>) calls. What's needed is a dict that knows that some of its entries are transient and doesn't copy them over on resize. It wasn't clear how to implement that efficiently with a regular dict (I suspect it's not possible), so I created a new dict type that considers entries transient (meaning the attribute is missing) when me_value == NULL. This is also very good for failing hasattr(...) calls and try...inst.method()...except style duck typing. Now, about the CreateNewInstances benchmark. It creates three classes with __init__ methods that assign a few attributes. The missing descriptors are cached, and then the cache is resized, which clears the cached missing descriptors. Increasing the minimum cache size from 4 to 8 clears up the problem. However, for any class, SOME minimum cache size will properly cache missing descriptors and some other one won't. I have some solutions for this floating around in my head, which I'll try shortly. (One idea is for missing attributes, if they're not missing on the instance, to be permanent in the cache.) But I'd like to get this patch off my hard drive and into other people's hands for testing, poking, prodding, and getting feedback on what's going right and what's not.
msg58355 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2007-12-10 18:05
Are you withdrawing this in favor of #1568?
msg58360 - (view)	Author: Neil Toronto (ntoronto)	Date: 2007-12-10 19:16
Yes, as well #1700288 (Armin's attribute lookup caching patch ported to 2.6) or #1685986 (Armin's original for 2.4), or whatever Raymond finds most convenient.
msg58365 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2007-12-10 19:51
That's still ambiguous -- do you want any of those to be closed too? Clearly we're not going to patch 2.4.
msg58367 - (view)	Author: Neil Toronto (ntoronto)	Date: 2007-12-10 20:03
Sorry - I'll be more clear. I'd love to see 2.6 get attribute lookup caching, and Kevin's port (#1700288) of Armin's original 2.4 patch (#1685986) does that for 2.6. #1685986 (2.4) should be closed and #1700288 (2.6) should remain open. But Raymond has indicated that he's currently working on #1685986 - I think for 2.6, but it's not clear.
msg58371 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2007-12-10 21:22
I'm trying to look at all of them. Having it split into several patches and getting frequent updates and posts is making it difficult. Please communicate with me directly (python at rcn dot com) so I can find out which versions are the latest and the reason behind each variation.

History
Date	User	Action	Args
2022-04-11 14:56:28	admin	set	github: 45901
2007-12-10 21:22:25	rhettinger	set	nosy: + rhettinger messages: + msg58371
2007-12-10 20:03:12	ntoronto	set	messages: + msg58367
2007-12-10 19:51:17	gvanrossum	set	messages: + msg58365
2007-12-10 19:50:25	gvanrossum	set	status: open -> closed resolution: out of date
2007-12-10 19:16:26	ntoronto	set	messages: + msg58360
2007-12-10 18:05:16	gvanrossum	set	nosy: + gvanrossum messages: + msg58355
2007-12-05 21:45:14	loewis	set	keywords: + patch
2007-12-05 21:00:24	ntoronto	create