Issue 26280: ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/70468

classification

Title:	ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.11

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	Jim Fasarakis-Hilliard, Mark.Shannon, gvanrossum, iritkatriel, pablogsal, pitrou, serhiy.storchaka, vstinner, yselivanov, zbyrne
Priority:	normal	Keywords:	patch

Created on 2016-02-03 17:02 by yselivanov, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
26280_stats.diff	zbyrne, 2016-02-05 03:46	quick and dirty counters for a few types	review
subscr_stats.txt	zbyrne, 2016-02-05 17:04
subscr1.patch	zbyrne, 2016-02-17 02:56		review
subscr2.patch	zbyrne, 2016-02-29 23:53	Removed tuple block, inlined a few things	review

Pull Requests
URL	Status	Linked	Edit
PR 27043	merged	iritkatriel, 2021-07-05 22:53

Messages (25)
msg259492 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2016-02-03 17:02
See also issue #21955
msg259512 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-03 20:18
Yury, Are you going to tackle this one, or would you like me to?
msg259514 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-02-03 20:27
Would be nice to collect statistics about arguments types during running the testsuite. May be list is not the most popular type of collection.
msg259516 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2016-02-03 20:42
Zach, first I was going to collect some stats on this (as Serhiy also suggests). It would be interesting to collect some stats on how many times BINARY_SUBSCR receives lists, tuples, dicts, and other types. I'd instrument the code to collect those stats and run Python tests suite and benchmarks. I'm pretty sure that optimizing lists (and tuples?) is a great idea. It would also be nice to optimize [-1] lookup. I'd also measure if we should add a fast path for dicts (PyDict_CheckExact + PyDict_GetItem). If you have time to work on this issue, then by all means go ahead. We'll be glad to assist you and review the patch.
msg259517 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2016-02-03 20:58
Zach, BTW, you can see how I instrumented ceval for stats collection in the patch for issue #26219. That's only for the research purposes, we won't commit that... or maybe we will, but in a separate issue :).
msg259602 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2016-02-04 23:00
> I'm pretty sure that optimizing lists (and tuples?) is a great idea. I think it's a good idea indeed. > It would also be nice to optimize [-1] lookup How is that different from the above? :)
msg259611 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-05 01:28
Ok, I've started on the instrumenting, thanks for that head start, that would have taken me a while to figure out where to call the stats dump function from. Fun fact: BINARY_SUBSCR is called 717 starting python.
msg259625 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-05 03:46
I'll put together something comprehensive in a bit, but here's a quick preview: $ ./python Python 3.6.0a0 (default, Feb 4 2016, 20:08:03) [GCC 4.6.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> exit() Total BINARY_SUBSCR calls: 726 List BINARY_SUBSCR calls: 36 Tuple BINARY_SUBSCR calls: 103 Dict BINARY_SUBSCR calls: 227 Unicode BINARY_SUBSCR calls: 288 Bytes BINARY_SUBSCR calls: 68 [-1] BINARY_SUBSCR calls: 0 $ python bm_elementtree.py -n 100 --timer perf_counter ...[snip]... Total BINARY_SUBSCR calls: 1078533 List BINARY_SUBSCR calls: 513 Tuple BINARY_SUBSCR calls: 1322 Dict BINARY_SUBSCR calls: 1063075 Unicode BINARY_SUBSCR calls: 13150 Bytes BINARY_SUBSCR calls: 248 [-1] BINARY_SUBSCR calls: 0 Lib/test$ ../../python -m unittest discover ...[snip]...^C <== I got bored waiting KeyboardInterrupt Total BINARY_SUBSCR calls: 4732885 List BINARY_SUBSCR calls: 1418730 Tuple BINARY_SUBSCR calls: 1300717 Dict BINARY_SUBSCR calls: 1151766 Unicode BINARY_SUBSCR calls: 409924 Bytes BINARY_SUBSCR calls: 363029 [-1] BINARY_SUBSCR calls: 26623 So dict seems to be the winner here
msg259627 - (view)	Author: Yury Selivanov (yselivanov) *	Date: 2016-02-05 04:10
Looks like we want to specialize it for lists, tuples, and dicts; as expected. Not so sure about [-1, but I suggest to benchmark it anyways ;)
msg259628 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-05 04:13
One thing I forgot to do was count slices.
msg259632 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-02-05 05:53
Looks as statistics varies from test to test too much. Could you please collect the statistics for running all tests?
msg259651 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2016-02-05 09:52
The test suite is not an appropriate workload to run benchmarks or statistics. Can you run with the benchmarks suite instead?
msg259652 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2016-02-05 09:54
I also suggest counting the number of BINARY_SUBSCR calls that are not one of the builtin types under consideration. Also, it would be good to distinguish slicing from integer indexing, for lists and tuples.
msg259677 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-05 17:04
I'm attaching output from a selection of the benchmarks, I'm counting non-builtins and slices, but for everything, not just lists and tuples. Quick observation: math workloads seem list heavy, text workloads seem dict heavy, and tuples are usually somewhere in the middle.
msg259711 - (view)	Author: STINNER Victor (vstinner) *	Date: 2016-02-06 01:51
Oh, I just created a duplicate with a patch for list[int]: issue #26301.
msg260376 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-17 02:56
Here's a patch that looks likes Victor's from the duplicate, but with tuples covered as well. I ran some straight forward micro benchmarks but haven't bothered running the benchmark suite yet. Unsurprisingly, optimized paths are faster, and the others take a penalty. [0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = [1,2,3,4,5,6]" "l[3]" 10000000 loops, best of 3: 0.0306 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = [1,2,3,4,5,6]" "l[3]" 10000000 loops, best of 3: 0.0243 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = (1,2,3,4,5,6)" "l[3]" 10000000 loops, best of 3: 0.0291 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = (1,2,3,4,5,6)" "l[3]" 10000000 loops, best of 3: 0.0241 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = 'asdfasdf'" "l[3]" 10000000 loops, best of 3: 0.034 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = 'asdfasdf'" "l[3]" 10000000 loops, best of 3: 0.0366 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python.orig -m timeit -s "l = [1,2,3,4,5,6]" "l[:3]" 10000000 loops, best of 3: 0.124 usec per loop [0]byrnez@byrnez-laptop:~/git/python$ ./python -m timeit -s "l = [1,2,3,4,5,6]" "l[:3]" 10000000 loops, best of 3: 0.125 usec per loop
msg260377 - (view)	Author: STINNER Victor (vstinner) *	Date: 2016-02-17 07:07
I suggest to try to inline PyList_GetItem: use PyList_GET_ITEM and raise the exception manually if needed. I'm not sure that it's ok to add PyLong_AsSize_t() to the slow path. Copy the code in each if? A macro can help.
msg260497 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-19 02:19
Is it worth handling the exception, or just let it take the slow path and get caught by PyObject_GetItem()? We're still making sure the index is in bounds. Also, where would be an appropriate place to put a macro for adjusting negative indices?
msg261034 - (view)	Author: Zach Byrne (zbyrne) *	Date: 2016-02-29 23:53
The new patch "subscr2" removes the tuple block, and addresses Victor's comments. This one looks a little faster, down to 0.0215 usec for the same test.
msg396986 - (view)	Author: Irit Katriel (iritkatriel) *	Date: 2021-07-05 10:36
This looks like a case of specialization.
msg396998 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2021-07-05 15:45
Very much so. Irit, do you want to give it a try? (Note there are helpful instructions now in Python/adaptive.md.)
msg396999 - (view)	Author: Irit Katriel (iritkatriel) *	Date: 2021-07-05 15:47
Sure, I'll have a look.
msg397031 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2021-07-05 23:42
Here is my previous attempt at this, for reference: https://github.com/python/cpython/pull/9853
msg397543 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2021-07-15 12:13
New changeset 641345d636320a6fca04a5271fa4c4c5ba3e5437 by Irit Katriel in branch 'main': bpo-26280: Port BINARY_SUBSCR to PEP 659 adaptive interpreter (GH-27043) https://github.com/python/cpython/commit/641345d636320a6fca04a5271fa4c4c5ba3e5437
msg397557 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2021-07-15 15:11
Way to go Irit!-- --Guido (mobile)

History
Date	User	Action	Args
2022-04-11 14:58:27	admin	set	github: 70468
2021-07-15 15:11:26	gvanrossum	set	messages: + msg397557
2021-07-15 12:23:09	iritkatriel	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2021-07-15 12:13:15	Mark.Shannon	set	messages: + msg397543
2021-07-05 23:42:20	pablogsal	set	nosy: + pablogsal messages: + msg397031
2021-07-05 22:56:36	iritkatriel	set	versions: + Python 3.11, - Python 3.6
2021-07-05 22:53:36	iritkatriel	set	stage: needs patch -> patch review pull_requests: + pull_request25602
2021-07-05 15:47:54	iritkatriel	set	messages: + msg396999
2021-07-05 15:45:51	gvanrossum	set	messages: + msg396998
2021-07-05 10:36:18	iritkatriel	set	nosy: + gvanrossum, Mark.Shannon, iritkatriel messages: + msg396986
2017-05-26 16:25:22	Jim Fasarakis-Hilliard	set	nosy: + Jim Fasarakis-Hilliard
2016-03-01 01:36:27	vstinner	set	title: ceval: Optimize list -> ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7
2016-02-29 23:53:30	zbyrne	set	files: + subscr2.patch messages: + msg261034
2016-02-19 02:19:51	zbyrne	set	messages: + msg260497
2016-02-17 07:07:21	vstinner	set	messages: + msg260377 title: ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7 -> ceval: Optimize list
2016-02-17 02:56:22	zbyrne	set	files: + subscr1.patch messages: + msg260376
2016-02-06 02:26:07	vstinner	link	issue26301 superseder
2016-02-06 01:51:49	vstinner	set	messages: + msg259711
2016-02-06 01:51:03	vstinner	set	title: ceval: Optimize [] operation similarly to CPython 2.7 -> ceval: Optimize list[int] (subscript) operation similarly to CPython 2.7
2016-02-05 17:04:45	zbyrne	set	files: + subscr_stats.txt messages: + msg259677
2016-02-05 09:54:07	pitrou	set	messages: + msg259652
2016-02-05 09:52:53	pitrou	set	messages: + msg259651
2016-02-05 05:53:03	serhiy.storchaka	set	messages: + msg259632
2016-02-05 04:13:56	zbyrne	set	messages: + msg259628
2016-02-05 04:10:03	yselivanov	set	messages: + msg259627
2016-02-05 03:46:19	zbyrne	set	files: + 26280_stats.diff keywords: + patch messages: + msg259625
2016-02-05 01:28:24	zbyrne	set	messages: + msg259611
2016-02-04 23:00:40	pitrou	set	nosy: + pitrou messages: + msg259602
2016-02-03 20:58:10	yselivanov	set	messages: + msg259517
2016-02-03 20:42:36	yselivanov	set	messages: + msg259516
2016-02-03 20:27:35	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg259514
2016-02-03 20:18:16	zbyrne	set	messages: + msg259512
2016-02-03 17:02:18	yselivanov	create