Issue 2831: Adding start to enumerate()

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/47080

classification

Title:	Adding start to enumerate()
Type:	enhancement	Stage:
Components:	Interpreter Core	Versions:	Python 3.0, Python 2.6

process

Status:	closed	Resolution:	accepted
Dependencies:		Superseder:
Assigned To:		Nosy List:	georg.brandl, gsakkis, gvanrossum, ncoghlan, rhettinger, scott.dial
Priority:	normal	Keywords:	patch

Created on 2008-05-12 03:49 by scott.dial, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
enumerate.diff	scott.dial, 2008-05-12 03:54	patch to add start= to enumerate

Messages (14)
msg66705 - (view)	Author: Scott Dial (scott.dial)	Date: 2008-05-12 03:49
Georg Brandl suggested enumerate() should have the ability to start on an arbitrary number (instead of always starting at 0). I suggest such a parameter should be keyword-only. Attached is a patch to add such a feature along with added test cases. Documentation still needs to be updated, but I wasn't sure how best to handle that anyway. I also wasn't sure how best to implement a keyword-only argument, so I'd be interested to know if there is a better way.
msg66709 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-05-12 06:19
If a start argument gets accepted, it should be positional, not a keyword-only argument. That is a complete waste when there is just one argument with a straight-forward interpretation. Besides, METH_O is a lot faster than the alternatives.
msg66710 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-05-12 06:23
Forget the part about METH_O. That was incorrect.

Another idea is to order the positional args as ([start,], iterator). That corresponds to range([start,] stop) and it matches the output order (number, element):

    for i, element in enumerate(10, iterable):
        ^-----------------------^
           ^-------------------------^
msg66711 - (view)	Author: Scott Dial (scott.dial)	Date: 2008-05-12 06:35
As it stands, enumerate() already takes a "sequence" keyword as an alternative to the first positional argument (although this seems to be completely undocumented). So, as you say, METH_O is a no go. I agree with you in that my original complaint with the positional argument was that enumerate(iterable, start) was "backwards." My other argument was that a large number of these iterator utility functions are foo(*iterable) and upon seeing enumerate(foo, bar), a reader might be inclined to assume it was equivalent to enumerate(chain(foo, bar)).
msg66712 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-05-12 07:00
FWIW, at one point, Guido rejected all variants of the idea. His first objection was that enumerate() is all about pairing values with sequence indices, so starting from anything other than zero is in conflict with the core concept. His second objection is that all variants can easily be misread as starting at the nth item in the sequence (much like islice() does now): enumerate(3, 'abcdefg') --> (3,'d') (4,'e') (5, 'f') (6, 'g'). The latter mis-reading becomes more likely for those who think of enumerate as providing indices. In fact, one of the suggested names for enumerate was "indices".
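The two readings described above can be spelled out with tools that already existed at the time. This is only an illustrative sketch: the names `offset_reading` and `skip_reading` are made up for this example, and it uses Python 3's lazy zip().

```python
from itertools import count, islice

letters = 'abcdefg'

# Reading 1: the number only offsets the counter; every item is kept.
offset_reading = list(zip(count(3), letters))

# Reading 2 (the feared misreading): start at the item at position 3,
# the way islice() would, so the counter matches the original position.
skip_reading = list(zip(count(3), islice(letters, 3, None)))

print(offset_reading)  # begins with (3, 'a')
print(skip_reading)    # begins with (3, 'd')
```

Both results start counting at 3, which is why the call alone does not say which behaviour is meant.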
msg66776 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2008-05-13 09:49
Note that this functionality is currently available as follows:

    >>> from itertools import count
    >>> list(zip(count(3), 'abcdefg'))
    [(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd'), (7, 'e'), (8, 'f'), (9, 'g')]

The enumerate(itr) builtin is just a convenience to avoid a module import for the most basic zip(count(), itr) version.

The proposed patch would enable the example above to be written more verbosely as:

    >>> list(enumerate('abcdefg', start=3))

Or, with the positional argument approach as:

    >>> list(enumerate(3, 'abcdefg'))

So, more verbose than the existing approach, and ambiguous to boot - as Raymond noted, with the first it really isn't clear whether the first value returned would be (3, 'd') or (3, 'a'), and with the second form it isn't clear whether we're skipping the first three items, or returning only those items.

Let's keep the builtins simple, and let itertools handle the variants - that's why the module exists.
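As a rough sketch of the equivalence discussed here, the zip(count(start), itr) idiom can be wrapped in a small helper and compared with the start argument that enumerate() accepts in current Python. The helper name `enumerate_from` is hypothetical and is not part of the attached patch.

```python
from itertools import count


def enumerate_from(iterable, start=0):
    """Hypothetical helper: pair each item with a counter beginning at start."""
    # zip() stops when the iterable is exhausted, so the endless count() is safe.
    return zip(count(start), iterable)


pairs = list(enumerate_from('abcdefg', start=3))

# In Python versions that include the start argument, the builtin agrees.
builtin_pairs = list(enumerate('abcdefg', start=3))
print(pairs == builtin_pairs)
```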
msg66778 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2008-05-13 10:06
Mentioning the zip(count(start), itr) version in the enumerate() docs may be a good idea though. (And of course, in 2.x, it should be izip() rather than zip() to preserve the memory efficiency of enumerate())
msg66783 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-05-13 14:24
> Thanks. I think this part is the main reason I see a start argument to
> enumerate as potentially problematic:
>
> """all variants can easily be misread as starting at the nth item in the
> sequence (much like islice() does now): enumerate(3, 'abcdefg') -->
> (3,'d') (4,'e') (5, 'f') (6, 'g')."""

So the ambiguity is that enumerate(it, start=N) could be taken as skipping the first N items of it rather than adding N to the index it returns. (And it is my own argument!)

I'd like to withdraw this argument. There are two separate use cases for using enumerate(): one is to iterate over a sequence and to have a handy index by which to update the value in the sequence. Another is for 1-based counting, usually when printing 1-based ordinals (such as line numbers in files, dates in a month or months in a year, etc.). N-based counting is less common but still conceivable.

However I see no use for skipping items from the start, and if that use case ever came up, passing a slice to enumerate() would be the appropriate thing to do. In fact, if you passed in a slice, you might also want to pass a corresponding start value so the indices produced match those of the original sequence.

So, I am still in favor of adding a new argument to enumerate(). I'm neutral on the need for a keyword (don't think it would hurt, not sure how much it matters). I'm strongly against making it an optional leading argument like Raymond proposed; that's a style I just don't want to promote, range() and the curses module notwithstanding.

> Is the need to use zip(count(3), seq) for the offset index case really such
> a burden given the associated benefits in keeping the builtin function
> really simple and easy to understand?

Yes, zip(count(3), seq) is too complex for this simple use case. I've always solved this so far with this less-than-elegant but certainly simpler idiom (except for users stuck in the tradition of for-loops in certain older languages :-):

    for i, line in enumerate(lines):
        i += 1
        print "%4d. %s" % (i, line)

and variants thereof.
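For illustration, here is how the two cases mentioned in this message (1-based ordinals, and a slice paired with a matching start value) might look with the start argument as it exists in current Python. The sample data and the names `numbered`, `offset` and `tail` are invented for this sketch.

```python
lines = ["first", "second", "third", "fourth"]

# 1-based counting, e.g. for printing line numbers.
numbered = ["%4d. %s" % (i, line) for i, line in enumerate(lines, start=1)]
for text in numbered:
    print(text)

# Enumerating a slice while keeping indices valid for the original list:
# pass the slice offset as the start value.
offset = 2
tail = list(enumerate(lines[offset:], start=offset))
for i, item in tail:
    # Each index still refers to the same element of the full list.
    assert lines[i] == item
```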
msg66789 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-05-13 18:34
Okay. I'm against making the argument keyword-only -- IMO keyword-only arguments really should only be used in cases where their existence has some advantage, like for max().
msg66790 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-05-13 18:35
Sure, fine.
msg66792 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-05-13 19:05
Okay, committed a matching patch in r63208. Thank you all!
msg105111 - (view)	Author: George Sakkis (gsakkis)	Date: 2010-05-05 23:56
Just discovered this by chance; I would probably have noticed it earlier if the docstring had been updated. Let me know if it needs a new documentation bug ticket and I'll create one. Pretty handy feature by the way, thanks for adding it!
msg105145 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2010-05-06 13:16
Created issue 8635 for the incomplete docstring
msg105148 - (view)	Author: Scott Dial (scott.dial)	Date: 2010-05-06 13:49
Created issue 8636 for the broken test cases.

History
Date	User	Action	Args
2022-04-11 14:56:34	admin	set	github: 47080
2010-05-06 13:49:14	scott.dial	set	messages: + msg105148
2010-05-06 13:16:37	ncoghlan	set	messages: + msg105145
2010-05-05 23:56:32	gsakkis	set	nosy: + gsakkis messages: + msg105111
2008-05-13 19:05:43	georg.brandl	set	status: open -> closed resolution: accepted messages: + msg66792
2008-05-13 18:35:06	gvanrossum	set	messages: + msg66790
2008-05-13 18:34:08	georg.brandl	set	nosy: + georg.brandl messages: + msg66789
2008-05-13 14:26:33	gvanrossum	set	nosy: + gvanrossum messages: + msg66783
2008-05-13 10:06:40	ncoghlan	set	messages: + msg66778
2008-05-13 09:50:04	ncoghlan	set	nosy: + ncoghlan messages: + msg66776
2008-05-12 07:00:02	rhettinger	set	messages: + msg66712
2008-05-12 06:35:24	scott.dial	set	messages: + msg66711
2008-05-12 06:23:54	rhettinger	set	messages: + msg66710
2008-05-12 06:19:12	rhettinger	set	nosy: + rhettinger messages: + msg66709
2008-05-12 03:54:36	scott.dial	set	files: - enumerate.diff
2008-05-12 03:54:32	scott.dial	set	files: + enumerate.diff
2008-05-12 03:53:21	scott.dial	set	files: - enumerate.diff
2008-05-12 03:53:07	scott.dial	set	files: + enumerate.diff
2008-05-12 03:49:33	scott.dial	create