Issue 42147: Micro optimization for longrange iteration if step is 1

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/86313

classification

Title:	Micro optimization for longrange iteration if step is 1
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.10

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:	corona10	Nosy List:	corona10, mark.dickinson, methane, pablogsal, rhettinger, serhiy.storchaka, vstinner
Priority:	normal	Keywords:	patch

Created on 2020-10-25 14:25 by corona10, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
bench_longrange.py	corona10, 2020-10-25 14:25

Pull Requests
URL	Status	Linked	Edit
PR 22971	closed	corona10, 2020-10-25 14:28

Messages (19)
msg379578 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-25 14:25
This is a similar case with https://bugs.python.org/issue41902 A quite possible usecase when the user use range with big number ./python -m pyperf compare_to longrange_master.json longrange_opt.json Mean +- std dev: [longrange_master] 6.45 us +- 0.09 us -> [longrange_opt] 5.72 us +- 0.07 us: 1.13x faster (-11%)
msg379586 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2020-10-25 16:03
Sorry, but I do not think it is worth. If there are any issues with multiplying by 1, it may be worth to optimize multiplication of integers in general. But iterating range(1 << 1000, (1 << 1000) + 100) looks a pretty artificial example.
msg379587 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-25 16:18
At bpo-41092, we decide not to optimize mulitply operation itself. I approach as same case that step=1 is quite a lot usecase until user designate step value. If range with longobject is not that much use case, I can agree with your opinion
msg379588 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-25 16:27
s/ I approach as same case that step=1 is quite a lot usecase until user designate step value. /My assumtion is that step=1 is quite often usecase until user designate step value.
msg379590 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-25 16:57
if somebody try to sum very large value with range. sum(range(1 << 1000, (1 << 1000) + 100)) it can be affected by this patch.
msg379605 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2020-10-25 20:34
I concur with Serhiy
msg379627 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-26 01:30
I agree that 1<<1000 is artificial example. But isn't this patch affects more regular patterns like [[] for _ in range(10000)]? If we reject this, shouldn't we revert GH-22479 too? I believe iterating range object is much more common use case than range.index.
msg379628 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2020-10-26 01:50
My particular opinion is that I would advise against having more branching, especially for code that is rare: it augments the possibilities of branch mispredictions, the changes of untested code paths, and the possibilities of unknown interactions. In this case, I agree that the code is simple enough, but I still feel that the cost may not be worth the extra maintenance needs of those branches.
msg379630 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-26 02:38
I had not noticed that there are rangeobject and longrangeobject. It doesn't affect to range(10000) definitely.
msg379631 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 02:41
@methane To be more precise, [[] for _ in range(10000)] will not effort this patch because range(10000) does not create longrangeiter object. This patch will affect to if the number is veeery large. On the other hand, GH-22479 is affect to all index API() whether the number is large or small. p.s by the way, Naoki is the last name or first name? ;)
msg379632 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 02:42
s / will not effort / will not be affected
msg379633 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 02:55
@pablogsal, @serhiy.storchaka > I would advise against having more branching, especially for code that is rare: About bpo-41902, I believe those patches affect all range objects which is created with step is one . I believe the usage of creation of range object with range(start), range(start, stop) is more than range(start, stop, step) unless there is significant proof that users prefer to create with range(start, stop, step) So I think that those patches are worth to do it. but PR 22971 is only affecting to longrange iteration. it means if there no worth to optimize range(start), range(start, stop) when the start and stop is big number, this issue could be rejected. And if the patch is rejected, I would like to suggest write a comment on the codeline for the future contributors why we decided to not optimizing this.
msg379634 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-26 02:55
> On the other hand, GH-22479 is affect to all index API() whether the number is large or small. Then, no need to revert GH-22479 for consistency. Thanks. > p.s by the way, Naoki is the last name or first name? ;) It is difficult to say what is first name. Inada is surname and Naoki is given name. Surname comes first in Japanese. Traditionally, we used reversed name order when write name in English to follow English-world manner. But China and Korea don't reverse name. And Japan is stopping reversed name order too. Japanese junior high schools teach surname-given name since about 2000.
msg379635 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 03:15
Sorry for the offtopic @methane Looks like Inada san is the right expression. ;) > But China and Korea don't reverse name. This is a very intereting fact, the national system of Korea are using surname-given name order or printing with a distinc section. A good example is the passport, surname: Na givenname: Dong-hee But the education of Korea teaches to use given-surname order when going abroad ;) So I always use the given-surname order when I have to write my name in English.
msg379636 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-26 03:20
Oh, I didn't know that. Thank you. I thought Chinese and Korean use surname-given name order because of "Xí Jìnpíng", "Moon Jae-in", and "Kim Jong-un". Japanese previous P.M. used "Shinzo Abe" English notation, but changed it to "Abe Shinzo" recently.
msg379651 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2020-10-26 08:12
I think that In sum(range(1 << 1000, (1 << 1000) + 100)) it is dwarfed by repeated addition of long integers in sum(). I expect the the effect of this optimization is even less that 13% in this case. In any case, for sum(range(a, b)) it is better to use formula (b-a)*(a+b-1)//2 if it is performance critical to you. Iterating range object is pretty rare in Python (in most cases you iterate a sequence itself or enumerate object), and iterating range object for large integer is especially rare. Also, in real code you do not just iterate for the sake of iterating, you execute some code in every iteration. And it will make the effect of the optimization of this case much smaller, closer to 1% or 0.1% that to 13%. Last week Inada-san played with my old optimization patch for using free lists for integers and decided to close the issue. He, and I, and other core developers agreed that it is not worth. Besides that optimization would help in much more cases and had good effect in microbenchmarks. And I closed also my other optimization patch few days ago. And did it many times in past. It is fanny to microoptimize the code, add special cases here and there, but it makes code larger and less maintenable, and a lot of such microoptimizations can have negative cumulative effect on the performance of general code. Sometimes the work of the core developer is to say "No" to his own code.
msg379654 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 08:48
> Sometimes the work of the core developer is to say "No" to his own code. Thank you for the careful sentence. For clear the air, I never think that my patch should be accepted. This is why I always get reviews from other core devs and I always accept the review result. (You may know my a lot of rejected PR ;)) but for the future record, we always record why someone tried this and why rejected(e.g why not worthed to do it) so this is the reason why I talked about why I tried this optimization. In the future, if someone submits a similar patch we can give this issue to them.
msg379655 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 08:50
I will close this issue by tomorrow with rejected ;)
msg379663 - (view)	Author: Dong-hee Na (corona10) *	Date: 2020-10-26 13:35
I close this issue with rejected status. Thank you serhiy, pablogsal and Inada-san for discussion ;)

History
Date	User	Action	Args
2022-04-11 14:59:37	admin	set	github: 86313
2020-10-26 13:35:50	corona10	set	status: open -> closed resolution: rejected messages: + msg379663 stage: patch review -> resolved
2020-10-26 08:50:41	corona10	set	messages: + msg379655
2020-10-26 08:48:42	corona10	set	messages: + msg379654
2020-10-26 08:12:26	serhiy.storchaka	set	messages: + msg379651
2020-10-26 03:20:37	methane	set	messages: + msg379636
2020-10-26 03:15:25	corona10	set	messages: + msg379635
2020-10-26 02:55:54	methane	set	messages: + msg379634
2020-10-26 02:55:43	corona10	set	messages: + msg379633
2020-10-26 02:42:03	corona10	set	messages: + msg379632
2020-10-26 02:41:23	corona10	set	messages: + msg379631
2020-10-26 02:38:38	methane	set	messages: + msg379630
2020-10-26 01:50:19	pablogsal	set	messages: + msg379628
2020-10-26 01:30:20	methane	set	messages: + msg379627
2020-10-25 20:34:59	pablogsal	set	nosy: + pablogsal messages: + msg379605
2020-10-25 16:57:23	corona10	set	messages: + msg379590
2020-10-25 16:27:51	corona10	set	messages: + msg379588
2020-10-25 16:18:41	corona10	set	messages: + msg379587
2020-10-25 16:03:17	serhiy.storchaka	set	nosy: + serhiy.storchaka, rhettinger, mark.dickinson messages: + msg379586
2020-10-25 14:28:19	corona10	set	keywords: + patch stage: patch review pull_requests: + pull_request21886
2020-10-25 14:25:05	corona10	create