Message248492
On Wed, Aug 12, 2015 at 09:23:26PM +0000, flying sheep wrote:
> Python has iterators and iterables. iterators are non-reentrant
> iterables: once they are exhausted, they are useless.
Correct.
> But there are also iterables that create new iterators whenever
> iter(iterable) is called (e.g. implicitly in a for loop). They are
> reentrant. This is why you can loop sequences such as lists more than
> once.
The *iterable* itself may be reentrant, but the iterator returned by
iter(iterable) is not. So by your previous comment, giving the iterator
a length is not appropriate.
Do you know of any non-iterator iterables which do not have a length
when they could? With the exception of tee, all the functions in
itertools return iterators.
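To make the distinction concrete, here is a minimal sketch (the variable
names are only for illustration). It shows a list being traversed twice,
while a single iterator obtained from it yields nothing on a second pass:

```python
data = [1, 2, 3]           # a list is an iterable, not an iterator

# Each call to iter() on the list builds a fresh, independent iterator,
# so the list can be traversed as many times as we like.
first_pass = list(iter(data))
second_pass = list(iter(data))

it = iter(data)            # one single iterator over the list
consumed = list(it)        # this exhausts it
leftover = list(it)        # nothing remains for a second pass

print(first_pass, second_pass, consumed, leftover)
```

Note that iter(it) returns the iterator itself rather than a new one,
which is why it cannot be restarted.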
> One of those reentrant iterables is range(), whose __iter__ functions
> creates new lazy iterables, which has a __len__, and so on. It even
> has random access just like a sequence.
You are misinterpreting what you are seeing. range objects already
are sequences with a length, and nothing needs to be done with them. But
the objects returned by iter(range(...)) are not sequences. They are
iterators, and so they are not sized and have no __len__ method:
py> it = iter(range(10))
py> len(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'range_iterator' has no len()
If range_iterator objects were given a length, what would it be? Should
it be the length of the underlying range object, which is easy to
calculate but wrong? That's what you suggest below (your comments about
chain). Or should it be the number of items yet to be seen, which is
surprising in other ways?
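As a rough illustration of how the two candidate answers diverge, the
sketch below consumes a few items and then computes both values by hand.
The names underlying and remaining are mine, and counting the remainder
this way consumes the iterator:

```python
r = range(10)
it = iter(r)

# Consume three items, as a caller might before asking for a "length".
for _ in range(3):
    next(it)

# Candidate 1: the length of the source range, unaffected by consumption.
underlying = len(r)

# Candidate 2: the number of items not yet seen (counting them uses them up).
remaining = sum(1 for _ in it)

print(underlying, remaining)
```

Once any item has been consumed the two numbers disagree, so whichever
one len() reported would mislead somebody.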
> Now it’s always entirely possible to *lazily* determine
> len(chain(range(200), [1,2,5])),
Sure. But chain doesn't just accept range objects and lists as
arguments, it accepts *arbitrary iterables*, some of which, as you
accept, cannot be sized. So len(chain_obj) *may or may not* raise
TypeError. Since you can't rely on it having a length, you have to
program as if it doesn't. So in practice, I believe this will just add
complication.
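A minimal sketch of the problem, assuming the length were computed by
summing len() over the arguments. naive_chain_len and squares are
hypothetical helpers written for this example, not part of itertools:

```python
from itertools import chain


def naive_chain_len(*iterables):
    """Hypothetical helper: sum len() over every input iterable."""
    return sum(len(obj) for obj in iterables)


def squares():
    """A generator: a perfectly good chain argument that has no len()."""
    for n in range(3):
        yield n * n


# Works when every argument happens to be sized.
sized_total = naive_chain_len(range(200), [1, 2, 5])

# Raises TypeError as soon as one argument is an unsized iterable.
try:
    naive_chain_len(range(200), squares())
    generator_was_sized = True
except TypeError:
    generator_was_sized = False

# chain itself accepts both sets of arguments without complaint.
item_count = len(list(chain(range(200), squares())))

print(sized_total, generator_was_sized, item_count)
```

The caller cannot tell in advance which case applies, so any code using
the length would need the try/except (or a fallback) anyway.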
> which is of course len(range(200)) +
> len([1,2,5]) = 200 + 3 = 203. No reentrant iterables are necessary
> here, only iterables with a __len__. (Simply calling len() on them all
> is sufficient, as it could only create a TypeError which would
> propagate upwards)
That would be wrong. Consider:
from itertools import chain
it = chain("ab", "cd")
throw_away = next(it)
# Hypothetical: suppose len(it) called len() on the underlying sequences.
assert len(it) == 2 + 2
assert len(list(it)) == len(it)  # fails since 3 != 4
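Since chain objects have no len() today, the example above cannot be run
as written. Here is a runnable sketch of the same failure, using a
hypothetical SizedChain wrapper (my own illustration, not a proposed
implementation) whose len() sums the lengths of its inputs:

```python
from itertools import chain


class SizedChain:
    """Hypothetical chain wrapper whose len() sums the lengths of its inputs."""

    def __init__(self, *iterables):
        # Computed once, up front, from the sized arguments.
        self._length = sum(len(obj) for obj in iterables)
        self._it = chain(*iterables)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __len__(self):
        # Never updated, so it goes stale once items are consumed.
        return self._length


it = SizedChain("ab", "cd")
throw_away = next(it)        # consume one item

reported = len(it)           # what the wrapper claims
actual = len(list(it))       # how many items are really left

print(reported, actual)
```

The reported length and the number of items actually produced no longer
agree after a single next() call.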
Date | User | Action | Args
2015-08-13 00:47:10 | steven.daprano | set | recipients: + steven.daprano, rhettinger, r.david.murray, flying sheep
2015-08-13 00:47:09 | steven.daprano | link | issue24849 messages
2015-08-13 00:47:06 | steven.daprano | create |