classification
Title: Don't cache tp_iternext
Type: behavior Stage: resolved
Components: Extension Modules, Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: josh.r, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-04-12 08:39 by serhiy.storchaka, last changed 2017-04-19 00:41 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1088 closed serhiy.storchaka, 2017-04-12 08:47
Messages (3)
msg291530 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-12 08:39
Some operations cache the value of the tp_iternext slot and call it in a loop. But calling tp_iternext can run arbitrary code, release the GIL, and change the tp_iternext slot in the same or another thread. This can lead to visible behavior differences, such as different iterated values, different raised exceptions, or an infinite loop. In the past this could even cause a crash (see issue3720), but that seems impossible now.

Examples (list constructor caches tp_iternext, tuple constructor doesn't):

>>> def make_iter():
...     class Iterator:
...         def __iter__(self):
...             return self
...         def __next__(self):
...             del Iterator.__next__
...             return 1
...     return Iterator()
... 
>>> tuple(make_iter())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Iterator' object is not iterable
>>> list(make_iter())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __next__
>>> 
>>> def make_iter2():
...     it2 = iter((2,))
...     def subiter():
...         Iterator.__next__ = Iterator.next2
...         yield 1
...     class Iterator(filter):
...         def next2(self):
...             return next(it2)
...     return Iterator(lambda x: True, subiter())
... 
>>> tuple(make_iter2())
(1, 2)
>>> list(make_iter2())
[1]
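The divergence above can be modelled in pure Python. This is a rough sketch, not CPython's actual C code: slot_next stands in for slot_tp_iternext (which re-looks up __next__ on the type at every call), the cached consumer keeps calling the slot function it fetched before the loop (list()-style), and the fresh consumer re-reads the slot each pass (tuple()-style), where deleting __next__ from the class empties the slot and the object stops looking iterable:

```python
def make_iter():
    class Iterator:
        def __iter__(self):
            return self
        def __next__(self):
            del Iterator.__next__
            return 1
    return Iterator()

def slot_next(obj):
    # Rough model of slot_tp_iternext for classes defined in Python:
    # __next__ is looked up on the type at every call.
    try:
        method = type(obj).__next__
    except AttributeError:
        raise AttributeError('__next__') from None
    return method(obj)

def consume_cached(it):
    # list()-style: the tp_iternext slot is read once before the loop.
    nxt = slot_next
    result = []
    while True:
        try:
            result.append(nxt(it))  # AttributeError once __next__ is gone
        except StopIteration:
            return result

def consume_fresh(it):
    # tuple()-style: the slot is re-read every pass; deleting __next__
    # clears the slot, so the object no longer looks iterable.
    result = []
    while True:
        if getattr(type(it), '__next__', None) is None:
            raise TypeError(f"{type(it).__name__!r} object is not iterable")
        try:
            result.append(slot_next(it))
        except StopIteration:
            return result
```

With this model, consume_cached(make_iter()) raises AttributeError: __next__ and consume_fresh(make_iter()) raises TypeError, matching the list() and tuple() behavior shown above.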

The tp_iternext slot is cached for performance, and removing the caching can cause a performance regression. But in practice the difference is very small. I found a measurable difference (up to 5%) only in the following artificial examples, in which tp_iternext is called in very tight and long loops:

$ ./python -m perf timeit --compare-to ./python-orig -s 'a = [0]*1000000+[1]' -- 'next(filter(None, a))'
python-orig: ..................... 47.2 ms +- 0.2 ms
python: ..................... 49.7 ms +- 0.3 ms

Mean +- std dev: [python-orig] 47.2 ms +- 0.2 ms -> [python] 49.7 ms +- 0.3 ms: 1.05x slower (+5%)

$ ./python -m perf timeit --compare-to ./python-orig -s 'from itertools import repeat, islice' -- 'next(islice(repeat(1), 1000000, None))'
python-orig: ..................... 15.4 ms +- 0.1 ms
python: ..................... 16.0 ms +- 0.2 ms

Mean +- std dev: [python-orig] 15.4 ms +- 0.1 ms -> [python] 16.0 ms +- 0.2 ms: 1.04x slower (+4%)

$ ./python -m perf timeit --compare-to ./python-orig -s 'from itertools import repeat; from collections import deque' -- 'deque(repeat(1, 1000000), 0)'
python-orig: ..................... 14.2 ms +- 0.1 ms
python: ..................... 14.8 ms +- 0.2 ms

Mean +- std dev: [python-orig] 14.2 ms +- 0.1 ms -> [python] 14.8 ms +- 0.2 ms: 1.05x slower (+5%)

In all other cases, when creating a collection is involved (the list, bytearray, and deque (with maxlen != 0) constructors, ''.join), when other code is called (the builtins all and max, map, itertools functions), or for shorter loops, the difference is hardly distinguishable from random noise.

$ ./python -m perf timeit --compare-to ./python-orig -s 'a = [0]*1000' -- 'list(iter(a))'
python-orig: ..................... 31.8 us +- 0.3 us
python: ..................... 31.8 us +- 0.4 us

Mean +- std dev: [python-orig] 31.8 us +- 0.3 us -> [python] 31.8 us +- 0.4 us: 1.00x faster (-0%)
Not significant!

$ ./python -m perf timeit --compare-to ./python-orig -s 'a = [1]*1000' -- 'all(a)'
python-orig: ..................... 47.4 us +- 0.2 us
python: ..................... 48.0 us +- 0.3 us

Mean +- std dev: [python-orig] 47.4 us +- 0.2 us -> [python] 48.0 us +- 0.3 us: 1.01x slower (+1%)

$ ./python -m perf timeit --compare-to ./python-orig -s 'a = [1]*1000' -- 'max(a)'
python-orig: ..................... 108 us +- 1 us
python: ..................... 108 us +- 1 us

Mean +- std dev: [python-orig] 108 us +- 1 us -> [python] 108 us +- 1 us: 1.00x faster (-0%)
Not significant!

$ ./python -m perf timeit --compare-to ./python-orig -s 'a = [0]*1000000+[1]' -- 'next(filter(lambda x: x, a))'
python-orig: ..................... 527 ms +- 8 ms
python: ..................... 528 ms +- 2 ms

Mean +- std dev: [python-orig] 527 ms +- 8 ms -> [python] 528 ms +- 2 ms: 1.00x slower (+0%)
Not significant!

$ ./python -m perf timeit --compare-to ./python-orig -s 'from itertools import repeat, islice' -- 'next(islice(repeat(1), 100, None))'
python-orig: ..................... 4.72 us +- 0.05 us
python: ..................... 4.72 us +- 0.04 us

Mean +- std dev: [python-orig] 4.72 us +- 0.05 us -> [python] 4.72 us +- 0.04 us: 1.00x faster (-0%)
Not significant!

$ ./python -m perf timeit --compare-to ./python-orig -s 'from itertools import repeat; from collections import deque' -- 'deque(repeat(1, 100), 0)'
python-orig: ..................... 4.16 us +- 0.11 us
python: ..................... 4.11 us +- 0.05 us

Mean +- std dev: [python-orig] 4.16 us +- 0.11 us -> [python] 4.11 us +- 0.05 us: 1.01x faster (-1%)
msg291818 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2017-04-18 02:13
I'm having a hard time thinking of legitimate cases where replacing __next__ mid-iteration is a thing. Doing so retroactively on the class (so it changes all outstanding iterators, no matter what state they're in) seems dubious at best, and it wouldn't be thread-safe in many different ways if iterators of that type are being used in multiple threads.

Is the dynamic type lookup on every iteration necessary too? Would failing to do it cause crashes if an evil iterator reassigns self.__class__ or something?

I feel like if segfaults aren't a possibility (I haven't been able to cause one on Py 3.5 by reassigning the class's __next__ or the instance's __class__), it's not our job to fix intentionally broken code at the expense of all other code. Raymond okayed the previous issue because it was a segfaulting bug, but it seems like this is just a "broken code is broken" bug.
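The __class__-reassignment probe described above can be sketched like this (a hypothetical stress test, not code from the issue): retyping a live iterator mid-iteration only redirects which __next__ runs on the next pass, rather than crashing, because the slot lookup goes through the instance's current type.

```python
# Hypothetical probe: retype a live iterator mid-iteration and confirm
# the result is merely odd behaviour, not a crash.
class Evil:
    def __iter__(self):
        return self
    def __next__(self):
        self.__class__ = Done  # swap the instance's class mid-loop
        return 'a'

class Done:
    def __iter__(self):
        return self
    def __next__(self):
        raise StopIteration

# The second lookup finds Done.__next__, which ends the iteration.
print(list(Evil()))
```

On CPython this prints ['a']: the first call returns 'a' and reclasses the instance, and the second call dispatches to Done.__next__, raising StopIteration cleanly.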
msg291854 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-04-19 00:41
> it seems like this is just a "broken code is broken" bug

I agree.  Unless there were a crasher of some sort, I prefer to keep the existing code, some of which has been deployed successfully for well over a decade.
History
Date User Action Args
2017-04-19 00:41:26 rhettinger set status: open -> closed
resolution: rejected
messages: + msg291854

stage: patch review -> resolved
2017-04-18 02:13:33 josh.r set nosy: + josh.r
messages: + msg291818
2017-04-12 11:07:10 rhettinger set assignee: rhettinger
2017-04-12 08:47:13 serhiy.storchaka set pull_requests: + pull_request1231
2017-04-12 08:39:23 serhiy.storchaka create