Issue 37131: all(range()...)) is needlessley slow

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/81312

classification

Title:	all(range()...)) is needlessley slow
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.7

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	Petersen, serhiy.storchaka, steven.daprano
Priority:	normal	Keywords:

Created on 2019-06-02 09:39 by Petersen, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg344263 - (view)	Author: Terji (Petersen) *	Date: 2019-06-02 09:39
Checking if a range's items are ll truthy can be done y checking if 0 the range contains 0, however currently Python iterates over the range, making the operation slower than needed. >>> rng = range(1, 1_000_000) >>> timeit all(rng) 19.9 ms ± 599 µs per loop If the all function could special case range, this could be like made fast like this: if isinstance(obj, range): if 0 in obj: return False return True
msg344264 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2019-06-02 09:58
Why should ``all()`` special case ``range``? Apart from showing off benchmarks, what's the point? Should ``any()`` also special case it? How about other lazy sequences and computed iterables, such as itertools.count objects, itertools.cycle objects, itertools.repeat objects, etc? How many special cases do we want? Personally, I don't necessarily oppose special-casing types, but it breaks the independence of types, and couples the all() function and the range() function. To be worth doing, we should get some practical benefit out of it. It isn't clear what that benefit is. Don't get me wrong, its not like I want all(range(1, 10**10)) to be slow. But nor do I want to couple the all() function to the range function and complicate the implementation if there is no corresponding benefit. After all, testing with all() is fundamentally an O(N) worst case operation so anything better than that is a nice surprise. If there were special dunders __all__ and __any__ it would be easy to encapsulate this behaviour inside the range objects themselves, and neither any() nor all() would need to know anything about range objects.
msg344265 - (view)	Author: Terji (Petersen) *	Date: 2019-06-02 10:25
>> Why should ``all()`` special case ``range``? Apart from showing off benchmarks, what's the point? Mostly to avoid silly mistakes, and the overhead of doing it would be very small, so a very small trade-off. >> Should ``any()`` also special case it? No, ``any()``takes at most two iterations to check truthiness in a ``range()``, so that's wouldn't be needed. >> How about other lazy sequences and computed iterables, such as itertools.count objects, itertools.cycle objects, itertools.repeat objects, etc? How many special cases do we want? No, I wouldn't propose that. The argument would be that those are iterators/generators while ``range()`` is special-cased built-in sequence with known properties.
msg344267 - (view)	Author: Terji (Petersen) *	Date: 2019-06-02 10:30
>> If there were special dunders __all__ and __any__ it would be easy to encapsulate this behaviour inside the range objects themselves, and neither any() nor all() would need to know anything about range objects. This sounds very interesting, and more general. It would be useful for e.g. numpy arrays, where ``all(arr)`` currently is slower than ``arr.all()``. Don't know it there's a reason it wasn't implemented this way originally?
msg344268 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2019-06-02 10:37
> >> Why should ``all()`` special case ``range``? Apart from showing off benchmarks, what's the point? > > Mostly to avoid silly mistakes What sort of silly mistakes? > and the overhead of doing it would be very small, so a very small trade-off. Its not just the overhead, its the coupling between two chunks of code which should be independent.
msg344282 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2019-06-02 11:59
You can implement it yourself. _all = all def all(obj): if isinstance(obj, range): return 0 not in obj return _all(obj) I do not see reasons to do this in the stdlib.

History
Date	User	Action	Args
2022-04-11 14:59:16	admin	set	github: 81312
2019-06-02 11:59:57	serhiy.storchaka	set	status: open -> closed nosy: + serhiy.storchaka messages: + msg344282 resolution: not a bug stage: resolved
2019-06-02 10:37:37	steven.daprano	set	messages: + msg344268
2019-06-02 10:30:44	Petersen	set	messages: + msg344267
2019-06-02 10:25:22	Petersen	set	messages: + msg344265
2019-06-02 09:58:04	steven.daprano	set	nosy: + steven.daprano messages: + msg344264
2019-06-02 09:39:49	Petersen	create