Issue 9800: Fast path for small int-indexing of lists and tuples

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/54009

classification

Title:	Fast path for small int-indexing of lists and tuples
Type:	performance	Stage:
Components:	Interpreter Core	Versions:	Python 3.2

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	Aahz, aahz, georg.brandl, mark.dickinson, pitrou, rhettinger, stutzbach
Priority:	low	Keywords:	needs review

Created on 2010-09-08 18:07 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
evalsubscr.patch	pitrou, 2010-09-08 18:07		review

Messages (11)
msg115888 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-09-08 18:07
This is an experiment which turns out to yield little benefits, except on micro-benchmarks such as: $ ./python -m timeit -s "a=[1,2,3]" "a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];" -> before: 1000000 loops, best of 3: 1.14 usec per loop -> after: 1000000 loops, best of 3: 0.759 usec per loop On IRC, Raymond pointed out that the long representation is not very friendly to direct manipulation for small ints, but even then it's not obvious it would benefit a lot: a large amount of time is certainly spent accessing the Python stack, manipulating reference counts, decoding opcodes and checking array bounds. The conclusion could be that such special-casing is futile when the body of code it avoids executing isn't big enough. Also, adding runtime checks has its own performance cost (more CPU instructions to execute, possible pipeline flushes due to branch misprediction, and pollution of branch prediction caches).
msg117902 - (view)	Author: Daniel Stutzbach (stutzbach)	Date: 2010-10-03 02:12
For what it's worth, a similar fast path existed in Python 2 for lists (but not tuples). It was removed for Python 3. I'm not sure why it was removed, but it may have been part of removing the PyInt type.
msg117941 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-10-04 07:26
A disadvantage of the patch is that ceval.c needs to know about the internal implementation of a PyLong. At the moment, this information is encapsulated only in Objects/longobject.c and a couple of other places (I think marshal.c).
msg117950 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-10-04 14:25
In the past, we've allow ceval.c to peer through encapsulation in order to have fast paths.
msg117952 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-10-04 14:39
As I mentioned, the speedup is invisible anyway, so it's not really a "fast path" ;)
msg117980 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-10-04 20:10
At any rate, I believe this used to be a fast-path. IIRC, Aahz put it in after demonstrating a considerable speed boost for common cases. Aahz, do you have any institutional memory around this one?
msg118010 - (view)	Author: Aahz (Aahz) *	Date: 2010-10-05 14:32
Wasn't me! And I've spent too little time on python-dev lately to remember stuff like this. :-(
msg118014 - (view)	Author: Daniel Stutzbach (stutzbach)	Date: 2010-10-05 14:51
I did some spelunking. Guido committed the similar optimization in r8306. The diff is at: http://svn.python.org/view/python/trunk/Python/ceval.c?r1=8087&r2=8306 His commit message was: Huge speedup by inlining some common integer operations: int+int, int-int, int <compareop> int, and list[int]. (Unfortunately, int*int is way too much code to inline.) Also corrected a NULL that should have been a zero. It's possible that these kinds of optimizations were worthwhile with PyInt but aren't worthwhile with PyLong. They were taken out by MvL in r59334, with a commit message of: Remove special-casing of integer operations, to stop using PyInt_CheckExact.
msg118015 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-10-05 15:03
> I did some spelunking. Guido committed the similar optimization in r8306. The diff is at: > http://svn.python.org/view/python/trunk/Python/ceval.c?r1=8087&r2=8306 > > His commit message was: > > Huge speedup by inlining some common integer operations: > int+int, int-int, int <compareop> int, and list[int]. > (Unfortunately, int*int is way too much code to inline.) > > Also corrected a NULL that should have been a zero. > > It's possible that these kinds of optimizations were worthwhile with > PyInt but aren't worthwhile with PyLong. It also doesn't say the individual contribution of each optimization, and it also doesn't say on which kind of workloads the "huge speedup" was witnessed (I would say that pystone is a possibility, or perhaps even some timeit micro-benchmark).
msg118104 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2010-10-07 11:58
I think the approach in issue10044 is better.
msg185412 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2013-03-28 09:58
Closing this old pending issue due to lack of activity.

History
Date	User	Action	Args
2022-04-11 14:57:06	admin	set	github: 54009
2013-03-28 09:58:51	georg.brandl	set	status: pending -> closed nosy: + georg.brandl messages: + msg185412 keywords: + needs review, - patch resolution: rejected
2010-10-07 11:58:16	pitrou	set	status: open -> pending messages: + msg118104
2010-10-05 15:03:09	pitrou	set	messages: + msg118015
2010-10-05 14:51:08	stutzbach	set	messages: + msg118014
2010-10-05 14:32:00	Aahz	set	nosy: + Aahz messages: + msg118010
2010-10-04 20:10:46	rhettinger	set	nosy: + aahz messages: + msg117980
2010-10-04 14:39:55	pitrou	set	messages: + msg117952
2010-10-04 14:25:58	rhettinger	set	messages: + msg117950
2010-10-04 07:26:44	mark.dickinson	set	messages: + msg117941
2010-10-03 02:12:51	stutzbach	set	nosy: + stutzbach messages: + msg117902
2010-09-08 18:07:08	pitrou	create