Issue 11889: 'enumerate' 'start' parameter documentation is confusing


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/56098

classification

Title:	'enumerate' 'start' parameter documentation is confusing
Type:	behavior	Stage:
Components:	Documentation	Versions:	Python 3.2, Python 3.3, Python 2.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	rhettinger	Nosy List:	eric.araujo, phammer, python-dev, r.david.murray, rhettinger, terry.reedy
Priority:	low	Keywords:

Created on 2011-04-20 16:08 by phammer, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (8)
msg134162 - (view)	Author: Peter Hammer (phammer)	Date: 2011-04-20 16:08
""" A point of confusion using the builtin function 'enumerate' and enlightenment for those who, like me, have been confused.

Note, this confusion was discussed at length at http://bugs.python.org/issue2831 prior to the 'start' parameter being added to 'enumerate'. The confusion discussed herein was foreseen in that discussion, and ultimately discounted. There remains, IMO, an issue with the clarity of the documentation that needs to be addressed. That is, the closed issue at http://bugs.python.org/issue8635 concerning the 'enumerate' docstring does not address the confusion that prompted this posting.

Consider:

    x=['a','b','c','d','e']
    y=['f','g','h','i','j']
    print 0,y[0]
    for i,c in enumerate(y,1):
        print i,c
        if c=='g':
            print x[i], 'y[%i]=g' % (i)
            continue
        print x[i]

This code produces the following unexpected output, using python 2.7, which is apparently the correct behavior (see commentary below). This example is an abstract simplification of a program defect encountered in practice:

    >>>
    0 f
    1 f
    b
    2 g
    c y[2]=g
    3 h
    d
    4 i
    e
    5 j
    Traceback (most recent call last):
      File "Untitled", line 9
        print x[i]
    IndexError: list index out of range

Help on 'enumerate' yields:

    >>> help(enumerate)
    Help on class enumerate in module __builtin__:

    class enumerate(object)
     |  enumerate(iterable[, start]) -> iterator for index, value of iterable
     |
     |  Return an enumerate object. iterable must be another object that supports
     |  iteration. The enumerate object yields pairs containing a count (from
     |  start, which defaults to zero) and a value yielded by the iterable argument.
     |  enumerate is useful for obtaining an indexed list:
     |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
     |
     |  Methods defined here:
     |
     |  __getattribute__(...)
     |      x.__getattribute__('name') <==> x.name
     |
     |  __iter__(...)
     |      x.__iter__() <==> iter(x)
     |
     |  next(...)
     |      x.next() -> the next value, or raise StopIteration
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |
     |  __new__ = <built-in method __new__ of type object>
     |      T.__new__(S, ...) -> a new object with type S, a subtype of T
    >>>

Commentary:

The expected output was:

    >>>
    0 f
    1 g
    b y[2]=g
    2 h
    c
    3 i
    d
    4 j
    e
    >>>

That is, it was expected that the iterator would yield a value corresponding to the index, whether the index started at zero or not. Using the notation of the doc string, with start=1, the expected behavior was:

     |      (1, seq[1]), (2, seq[2]), (3, seq[3]), ...

while the actual behavior is:

     |      (1, seq[0]), (2, seq[1]), (3, seq[2]), ...

The practical problem in the real world code was to do something special with the zero index value of x and y, then run through the remaining values, doing one of two things with x and y, correlated, depending on the value of y.

I can see now that the doc string does in fact correctly specify the actual behavior: nowhere does it say the iterator will begin at any other place than the beginning, so this is not a python bug. I do however question the general usefulness of such behavior. Normally, indices and values are expected to be correlated.

The correct behavior can be simply implemented without using 'enumerate':

    x=['a','b','c','d','e']
    y=['f','g','h','i','j']
    print 0,y[0]
    for i in xrange(1,len(y)):
        c=y[i]
        print i,c
        if c=='g':
            print x[i], 'y[%i]=g' % (i)
            continue
        print x[i]

This produces the expected results.

If one insists on using enumerate to produce the correct behavior in this example, it can be done as follows: """

x=['a','b','c','d','e']
y=['f','g','h','i','j']
seq=enumerate(y)
print '%s %s' % seq.next()
for i,c in seq:
    print i,c
    if c=='g':
        print x[i], 'y[%i]=g' % (i)
        continue
    print x[i]

""" This version produces the expected results, while achieving clarity comparable to that which was sought in the original incorrect code.
Looking a little deeper, the python documentation on enumerate states:

    enumerate(sequence[, start=0])
    Return an enumerate object. sequence must be a sequence, an iterator, or some other object which supports iteration. The next() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the corresponding value obtained from iterating over iterable. enumerate() is useful for obtaining an indexed series: (0, seq[0]), (1, seq[1]), (2, seq[2]),

This makes a pretty clear implication that the value corresponds to the index, so perhaps there really is an issue here.

Have at it. I'm going back to work, using 'enumerate' as it actually is, now that I clearly understand it.

One thing is certain: the documentation has to be clarified, for the confusion foreseen prior to adding the start parameter is very real. """
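For readers who hit the same problem, one way to get the correlated behavior described above is to skip the leading items and pass the same offset as the count's starting value. The following is only a sketch in Python 3 syntax (the report itself uses Python 2); the helper name pairs_from is hypothetical and not part of the standard library.

```python
from itertools import islice

def pairs_from(seq, start):
    """Yield (index, item) pairs beginning at position `start`.

    Hypothetical helper for illustration only: it skips the first
    `start` items and numbers the rest from `start`, so each count
    equals the item's real position in the iterable.
    """
    return enumerate(islice(seq, start, None), start)

x = ['a', 'b', 'c', 'd', 'e']
y = ['f', 'g', 'h', 'i', 'j']

# Each row pairs an index with the items of y and x at that same index.
rows = [(i, c, x[i]) for i, c in pairs_from(y, 1)]
for row in rows:
    print(*row)
```

Because islice works on any iterable, this does not require the input to support indexing or slicing, at the cost of consuming the skipped items one by one.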
msg134169 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2011-04-20 17:52
If you know what an iterator is, the documentation, it seems to me, is clear. That is, an iterator cannot be indexed, so the behavior you expected could not be implemented by enumerate. That doesn't mean the docs shouldn't be improved. An example with a non-zero start would make things clear.
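A small illustration of the point about iterators, written as a sketch in Python 3 syntax; the generator letters is made up for this example. It shows that a generator cannot be indexed, and that a non-zero start only changes where the count begins.

```python
def letters():
    """A generator: items can only be consumed in order, never indexed."""
    yield from "abc"

# Indexing a generator raises TypeError, so enumerate has no way to
# "jump" to position `start` in an arbitrary iterable.
try:
    letters()[1]
    indexable = True
except TypeError:
    indexable = False

# start only changes the first count; every item is still yielded.
pairs = list(enumerate(letters(), 1))
print(indexable, pairs)
```

This prints False followed by [(1, 'a'), (2, 'b'), (3, 'c')]: the first item is paired with 1 rather than skipped.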
msg134290 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2011-04-23 01:31
Note: 3.x correctly gives the signature as enumerate(iterable, start) rather than enumerate(sequence, start).

I agree that the current entry is a bit awkward. Perhaps the doc would be clearer with a reference to zipping. Removing the unneeded definition of iterable (which should be linked to the definition in the glossary, along with iterator), my suggestion is:

'''
enumerate(iterable, start=0)

Return an enumerate object, an iterator of tuples, that zips together a sequence of counts and iterable. Each tuple contains a count and an item from iterable, in that order. The counts begin with start, which defaults to 0. enumerate() is useful for obtaining an indexed series: enumerate(seq) produces (0, seq[0]), (1, seq[1]), (2, seq[2]), .... For another example, which uses start:

    >>> for i, season in enumerate(['Spring','Summer','Fall','Winter'], 1):
    ...     print(i, season)
    1 Spring
    2 Summer
    3 Fall
    4 Winter
'''

Note that I changed the example to use a start of 1 instead of 0, to produce a list in traditional form, which is one reason to have the parameter!
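The "zipping" description can be checked directly. The sketch below (Python 3 syntax) pairs itertools.count(start) with the iterable and compares the result with the builtin; the function name enumerate_via_zip is hypothetical and used only for this comparison.

```python
from itertools import count

def enumerate_via_zip(iterable, start=0):
    """Zip a sequence of counts (start, start+1, ...) with the iterable.

    Illustrative only; this mirrors the wording of the suggested doc
    entry and is not how the builtin is actually implemented.
    """
    return zip(count(start), iterable)

seasons = ['Spring', 'Summer', 'Fall', 'Winter']
zipped = list(enumerate_via_zip(seasons, 1))
builtin = list(enumerate(seasons, 1))
print(zipped == builtin)
```

Seen this way, start is just the first value of the counter; it has no influence on which items of the iterable are produced.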
msg134302 - (view)	Author: Éric Araujo (eric.araujo) *	Date: 2011-04-23 14:42
+1 to what David says. Terry’s patch is a good starting point; I think Raymond will commit something along its lines.
msg134311 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-04-23 16:54
I've got it from here. Thanks.
msg136824 - (view)	Author: Peter Hammer (phammer)	Date: 2011-05-25 03:22
""" Changing the 'enumerate' doc string text from:

     |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...

to:

     |      (start, seq[0]), (start+1, seq[1]), (start+2, seq[2]), ...

would completely disambiguate the doc string at the modest cost of sixteen additional characters, a small price for pellucid clarity.

The proposed changes to the formal documentation also seem to me to be prudent, and I hope at this late writing, they have already been committed.

I conclude with a code fragment for the edification of R. David Murray. """

class numerate(object):
    """
    A demonstration of a plausible incorrect interpretation of
    the 'enumerate' function's doc string and documentation.
    """
    def __init__(self,seq,start=0):
        self.seq=seq; self.index=start-1
        try:
            if seq.next: pass #test for iterable
            for i in xrange(start):
                self.seq.next()
        except:
            if type(seq)==dict: self.seq=seq.keys()
            self.seq=iter(self.seq[start:])

    def next(self):
        self.index+=1
        return self.index,self.seq.next()

    def __iter__(self):
        return self

if __name__ == "__main__":
    #s=['spring','summer','autumn','winter']
    s={'spring':'a','summer':'b','autumn':'c','winter':'d'}
    #s=enumerate(s)#,2)
    s=numerate(s,2)
    for t in s: print t
msg139051 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-06-25 12:57
New changeset 0ca8ffffd90b by Raymond Hettinger in branch '2.7': Issue 11889: Clarify docs for enumerate. http://hg.python.org/cpython/rev/0ca8ffffd90b
msg139054 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-06-25 13:01
New changeset d0df12b32522 by Raymond Hettinger in branch '3.2': Issue 11889: Clarify docs for enumerate. http://hg.python.org/cpython/rev/d0df12b32522
New changeset 9b827e3998f6 by Raymond Hettinger in branch 'default': Issue 11889: Clarify docs for enumerate. http://hg.python.org/cpython/rev/9b827e3998f6

History
Date	User	Action	Args
2022-04-11 14:57:16	admin	set	github: 56098
2011-06-25 13:02:18	rhettinger	set	status: open -> closed resolution: fixed
2011-06-25 13:01:22	python-dev	set	messages: + msg139054
2011-06-25 12:57:13	python-dev	set	nosy: + python-dev messages: + msg139051
2011-05-25 04:46:57	rhettinger	set	priority: normal -> low
2011-05-25 03:22:38	phammer	set	messages: + msg136824
2011-04-23 16:54:45	rhettinger	set	messages: + msg134311
2011-04-23 14:42:00	eric.araujo	set	nosy: + eric.araujo messages: + msg134302
2011-04-23 01:31:46	terry.reedy	set	nosy: + terry.reedy messages: + msg134290 versions: + Python 3.2, Python 3.3
2011-04-20 19:27:27	rhettinger	set	assignee: rhettinger components: + Documentation, - None nosy: + rhettinger
2011-04-20 17:52:26	r.david.murray	set	nosy: + r.david.murray messages: + msg134169
2011-04-20 16:08:55	phammer	create