This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Fast path for small int-indexing of lists and tuples
Type: performance Stage:
Components: Interpreter Core Versions: Python 3.2
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Aahz, aahz, georg.brandl, mark.dickinson, pitrou, rhettinger, stutzbach
Priority: low Keywords: needs review

Created on 2010-09-08 18:07 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
evalsubscr.patch pitrou, 2010-09-08 18:07 review
Messages (11)
msg115888 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-08 18:07
This is an experiment which turns out to yield little benefits, except on micro-benchmarks such as:

$ ./python -m timeit -s "a=[1,2,3]" "a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];a[0];a[1];a[-1];"

-> before: 1000000 loops, best of 3: 1.14 usec per loop
-> after: 1000000 loops, best of 3: 0.759 usec per loop

On IRC, Raymond pointed out that the long representation is not very friendly to direct manipulation for small ints, but even then it's not obvious it would benefit a lot: a large amount of time is certainly spent accessing the Python stack, manipulating reference counts, decoding opcodes and checking array bounds.

The conclusion could be that such special-casing is futile when the body of code it avoids executing isn't big enough. Also, adding runtime checks has its own performance cost (more CPU instructions to execute, possible pipeline flushes due to branch misprediction, and pollution of branch prediction caches).
msg117902 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2010-10-03 02:12
For what it's worth, a similar fast path existed in Python 2 for lists (but not tuples).  It was removed for Python 3.  I'm not sure why it was removed, but it may have been part of removing the PyInt type.
msg117941 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-10-04 07:26
A disadvantage of the patch is that ceval.c needs to know about the internal implementation of a PyLong.  At the moment, this information is encapsulated only in Objects/longobject.c and a couple of other places (I think marshal.c).
msg117950 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-10-04 14:25
In the past, we've allow ceval.c to peer through encapsulation in order to have fast paths.
msg117952 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-04 14:39
As I mentioned, the speedup is invisible anyway, so it's not really a "fast path" ;)
msg117980 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-10-04 20:10
At any rate, I believe this used to be a fast-path.  IIRC, Aahz put it in after demonstrating a considerable speed boost for common cases.  Aahz, do you have any institutional memory around this one?
msg118010 - (view) Author: Aahz (Aahz) * Date: 2010-10-05 14:32
Wasn't me!  And I've spent too little time on python-dev lately to
remember stuff like this.  :-(
msg118014 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2010-10-05 14:51
I did some spelunking.  Guido committed the similar optimization in r8306.    The diff is at:
http://svn.python.org/view/python/trunk/Python/ceval.c?r1=8087&r2=8306

His commit message was:

    Huge speedup by inlining some common integer operations:
    int+int, int-int, int <compareop> int, and list[int].
    (Unfortunately, int*int is way too much code to inline.)

    Also corrected a NULL that should have been a zero.

It's possible that these kinds of optimizations were worthwhile with PyInt but aren't worthwhile with PyLong.

They were taken out by MvL in r59334, with a commit message of:

    Remove special-casing of integer operations, to stop
    using PyInt_CheckExact.
msg118015 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-05 15:03
> I did some spelunking.  Guido committed the similar optimization in r8306.    The diff is at:
> http://svn.python.org/view/python/trunk/Python/ceval.c?r1=8087&r2=8306
> 
> His commit message was:
> 
>     Huge speedup by inlining some common integer operations:
>     int+int, int-int, int <compareop> int, and list[int].
>     (Unfortunately, int*int is way too much code to inline.)
> 
>     Also corrected a NULL that should have been a zero.
> 
> It's possible that these kinds of optimizations were worthwhile with
> PyInt but aren't worthwhile with PyLong.

It also doesn't say the individual contribution of each optimization,
and it also doesn't say on which kind of workloads the "huge speedup"
was witnessed (I would say that pystone is a possibility, or perhaps
even some timeit micro-benchmark).
msg118104 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-07 11:58
I think the approach in issue10044 is better.
msg185412 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2013-03-28 09:58
Closing this old pending issue due to lack of activity.
History
Date User Action Args
2022-04-11 14:57:06adminsetgithub: 54009
2013-03-28 09:58:51georg.brandlsetstatus: pending -> closed

nosy: + georg.brandl
messages: + msg185412

keywords: + needs review, - patch
resolution: rejected
2010-10-07 11:58:16pitrousetstatus: open -> pending

messages: + msg118104
2010-10-05 15:03:09pitrousetmessages: + msg118015
2010-10-05 14:51:08stutzbachsetmessages: + msg118014
2010-10-05 14:32:00Aahzsetnosy: + Aahz
messages: + msg118010
2010-10-04 20:10:46rhettingersetnosy: + aahz
messages: + msg117980
2010-10-04 14:39:55pitrousetmessages: + msg117952
2010-10-04 14:25:58rhettingersetmessages: + msg117950
2010-10-04 07:26:44mark.dickinsonsetmessages: + msg117941
2010-10-03 02:12:51stutzbachsetnosy: + stutzbach
messages: + msg117902
2010-09-08 18:07:08pitroucreate