Title: Generator expression bug?
Type: behavior Stage:
Components: Versions: Python 3.0, Python 2.4, Python 3.1, Python 2.7, Python 2.6, Python 2.5
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: r.david.murray, svenrahmann, terry.reedy
Priority: normal Keywords:

Created on 2009-05-08 13:24 by svenrahmann, last changed 2009-05-08 19:11 by terry.reedy. This issue is now closed.

Files:
  uploaded by svenrahmann, 2009-05-08 13:24: computing a 5x5 multiplication table in 3 different ways - one doesn't work as expected
  uploaded by svenrahmann, 2009-05-08 17:18
Messages (4)
msg87440 Author: Sven Rahmann (svenrahmann) Date: 2009-05-08 13:24
Lists from list comprehensions and generator objects from generator
expressions behave differently when we repeatedly want to iterate over them.

This may or may not be a bug, but it is certainly not clear from the
documentation (see documentation of "for" statement in all recent Python
versions).

The reason seems to be that generator expressions, once exhausted, are
not reset by using them again in a for loop.
This is different for lists and range objects.

The attached example illustrates the phenomenon.
It is written for Python 3, but the same phenomenon occurs in the 2.x
versions.
msg87443 Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-05-08 14:10
This is not a bug.  It's not even a doc bug, IMO.

When you do

  num1 = [x for x in range(0, 6)]

you are not assigning a list comprehension to num1. You are running a
list comprehension to create an actual list, and that list is what gets
assigned to num1.  The docs are pretty clear about that, I think.  So
yes, you can iterate over a list multiple times, because of how it
implements the iteration protocol.

On the other hand, when you do

  num3 = (x for x in range(0, 6))

you create a generator object, which is what gets assigned to num3. 
Generators created by generator expressions can only be iterated over
until they are exhausted.  That is a major point of their existence:
producing one item at a time on demand and not saving them.

A range object is its own special case, and is neither a list nor a
generator.  It is reusable, as you found.
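The three cases can be compared side by side. The following is only a minimal sketch for Python 3; the names as_list, as_gen and as_range are made up for illustration and do not come from the attached file.

```python
# Illustrative names only; none of these come from the attached file.
as_list = [x for x in range(0, 6)]   # the comprehension runs now and builds a list
as_gen = (x for x in range(0, 6))    # a generator object; items are produced on demand
as_range = range(0, 6)               # a range object, neither a list nor a generator

# Iterate over each object twice and record what each pass produced.
first_pass = [list(as_list), list(as_gen), list(as_range)]
second_pass = [list(as_list), list(as_gen), list(as_range)]

print(first_pass)    # every object yields 0..5 on the first pass
print(second_pass)   # the generator is exhausted, so its second pass is empty
```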

None of this should be documented in the 'for' statement.  The for
statement explains the protocol it follows.  What happens when you use
it to iterate over any given object depends on how that object
implements the iteration protocol.  So you have to look to the
documentation of those objects for further enlightenment, I'm afraid.
msg87451 Author: Sven Rahmann (svenrahmann) Date: 2009-05-08 17:18
I completely agree that by
x = (z for z in y)
I create and assign a generator object to x.

I'm afraid I disagree about "not a doc bug".
The documentation for "for" reads:

for_stmt ::=  "for" target_list "in" expression_list ":" suite
              ["else" ":" suite]

The expression list is evaluated once; it should yield an iterable
object. An iterator is created for the result of the expression_list.

This ("an iterator is created") suggests that a new iterator is created
for the generator object (the iterable).

I was actually surprised to find that the __iter__() function of a
generator object returns the generator object itself. 
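This is easy to check interactively. A small sketch, assuming Python 3, with variable names chosen only for illustration: iter() on a generator hands back the same object, while iter() on a list builds a new, independent iterator each time.

```python
gen = (z for z in range(3))
lst = [0, 1, 2]

# A generator is its own iterator: iter() returns the very same object.
gen_is_own_iterator = iter(gen) is gen

# A list is an iterable, not an iterator: each iter() call builds a new one.
it1 = iter(lst)
it2 = iter(lst)
list_iterators_are_distinct = it1 is not it2

# So a second loop over the generator resumes where the first one stopped.
head = next(gen)      # consumes the first item
rest = list(gen)      # only the remaining items are left
```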

If generator objects behave as they do, I'm probably going to file a
feature request for something like "reusable" generators.

In fact, with the attached file I'm trying to extract a column from a
matrix and use it for several computations. Since I don't want to copy
the values of a column (imagine a huge matrix), I want to create a
reusable generator object that repeatedly returns a generator object
that enumerates the values of a single column.

My impression was that generator expressions are useful for just this
type of application.

Therefore, the attached file now has a small class ReusableGenerator that
implements this behavior. However, the "ugly" part is that in order to
create it, you have to pass it a function that returns a generator
object, not the generator object itself.
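The attached file is not reproduced in this thread, so the following is only a guess at what such a class might look like, not the actual attachment. The parameter name make_generator and the sample matrix are assumptions made for illustration.

```python
class ReusableGenerator:
    """Iterable wrapper that builds a fresh generator for every iteration.

    Hypothetical sketch; make_generator is any zero-argument callable
    that returns a new generator (or other iterator) each time it is called.
    """

    def __init__(self, make_generator):
        self._make_generator = make_generator

    def __iter__(self):
        # Called by every for loop, so each loop gets its own generator.
        return self._make_generator()


# 5x5 multiplication table, used here only as sample data.
matrix = [[i * j for j in range(1, 6)] for i in range(1, 6)]

# The lambda is the "function that returns a generator object".
column = ReusableGenerator(lambda: (row[2] for row in matrix))

first = list(column)
second = list(column)   # works again; no column values were stored in between
```

The lambda is needed because a generator object cannot be restarted; only the code that creates it can be run again.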

Another attempt by deep-copying completely fails, and I don't understand
why this is the case; probably there's a good reason.
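The failure itself is easy to reproduce. As far as I can tell, the copy module has no special handling for generator objects and falls back to the pickling protocol, which generators do not support, so the call raises TypeError. A minimal sketch, assuming CPython 3:

```python
import copy

gen = (x for x in range(3))

try:
    copy.deepcopy(gen)
    deepcopy_failed = False
except TypeError as exc:
    # The exact message text varies between Python versions.
    deepcopy_failed = True
    print("deepcopy failed:", exc)
```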
msg87460 Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-05-08 19:11
Questions and discussions like this should be directed to the
python-list, mirrored on newsgroups comp.lang.python and
gmane.comp.python.general (or possibly other forums).

I will say this much but *only* this much here:
1. Generators are iterators; by definition, iterator.__iter__ returns
itself.  Nearly all iterators are single use.
2. Generator functions and their abbreviated form, generator
expressions, create generators.  To re-iterate, re-call the generator
function or re-execute the generator expression.
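Point 2 applied to the matrix column use case might look like the sketch below. The function name column and the sample matrix are hypothetical, chosen only for illustration.

```python
def column(matrix, j):
    """Generator function: yield the values of column j one at a time."""
    for row in matrix:
        yield row[j]


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Each call creates a new generator, so every computation gets a full pass
# over the column without the values ever being copied into a list.
total = sum(column(matrix, 1))
largest = max(column(matrix, 1))
```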

It is possible that discussion elsewhere would generate specific doc
improvement suggestions that could be submitted in a new issue.
Date                 User            Action  Args
2009-05-08 19:11:34  terry.reedy     set     status: open -> closed; nosy: + terry.reedy; messages: + msg87460
2009-05-08 17:18:51  svenrahmann     set     status: closed -> open; files: +; messages: + msg87451
2009-05-08 14:10:19  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg87443; resolution: not a bug
2009-05-08 13:24:49  svenrahmann     create