classification
Title: Adding start to enumerate()
Type: enhancement
Stage:
Components: Interpreter Core
Versions: Python 3.0, Python 2.6
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To:
Nosy List: georg.brandl, gsakkis, gvanrossum, ncoghlan, rhettinger, scott.dial
Priority: normal
Keywords: patch

Created on 2008-05-12 03:49 by scott.dial, last changed 2010-05-06 13:49 by scott.dial. This issue is now closed.

Files
File name Uploaded Description
enumerate.diff scott.dial, 2008-05-12 03:54 patch to add start= to enumerate
Messages (14)
msg66705 - (view) Author: Scott Dial (scott.dial) Date: 2008-05-12 03:49
Georg Brandl suggested enumerate() should have the ability to start on
an arbitrary number (instead of always starting at 0). I suggest such a
parameter should be keyword-only. Attached is a patch to add such a
feature along with added test cases. Documentation still needs to be
updated, but I wasn't sure how best to handle that anyways.

I wasn't sure how best to handle a keyword-only argument, so I'd be
interested to know if there is a better way.
msg66709 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 06:19
If a start argument gets accepted, it should be positional, not a 
keyword-only argument.  That is a complete waste when there is just one 
argument with a straight-forward interpretation.  

Besides, METH_O is a lot faster than the alternatives.
msg66710 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 06:23
Forget the part about METH_O.  That was incorrect.

Another idea is to order the positional args as ([start,] iterator).
That corresponds to range([start,] stop) and it matches the output
order (number, element):

    for i, element in enumerate(10, iterable):
        ^-----------------------^
              ^-------------------------^
msg66711 - (view) Author: Scott Dial (scott.dial) Date: 2008-05-12 06:35
As it stands, enumerate() already takes a "sequence" keyword as an
alternative to the first positional argument (although this seems to be
completely undocumented). So, as you say, METH_O is a no go.

I agree with you in that my original complaint with the positional
argument was that enumerate(iterable, start) was "backwards." My other
argument was that a large number of these iterator utility functions are
foo(*iterable) and upon seeing enumerate(foo, bar), a reader might be
inclined to assume it was equivalent to enumerate(chain(foo, bar)).
msg66712 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-05-12 07:00
FWIW, at one point, Guido rejected all variants of the idea.  His first 
objection was that enumerate() is all about pairing values with 
sequence indices, so starting from anything other than zero is in 
conflict with the core concept.  His second objection is that all 
variants can easily be misread as starting at the nth item in the 
sequence (much like islice() does now):   enumerate(3, 'abcdefg') --> 
(3,'d') (4,'e') (5, 'f') (6, 'g').  The latter mis-reading becomes more 
likely for those who think of enumerate as providing indices.  In fact, 
one of the suggested names for enumerate was "indices".
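
To make the two readings concrete, here is a small sketch. It assumes Python 3 and the start= keyword spelling from the attached patch; the variable names are illustrative only. The first reading offsets the counter and keeps every item, while the second skips items the way islice() does:

```python
from itertools import islice

letters = 'abcdefg'

# Reading 1: start offsets the counter; every item is still produced.
offset_counter = list(enumerate(letters, start=3))

# Reading 2 (the possible misreading): skip the first 3 items, islice-style,
# so the survivors keep their original positions as indices.
skipped_items = list(islice(enumerate(letters), 3, None))

print(offset_counter[0])  # (3, 'a')
print(skipped_items[0])   # (3, 'd')
```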
msg66776 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-05-13 09:49
Note that this functionality is currently available as follows:

>>> from itertools import count
>>> list(zip(count(3), 'abcdefg'))
[(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd'), (7, 'e'), (8, 'f'), (9, 'g')]

The enumerate(itr) builtin is just a convenience to avoid a module
import for the most basic zip(count(), itr) version.

The proposed patch would enable the example above to be written more
verbosely as:

>>> list(enumerate('abcdefg', start=3))

Or, with the positional argument approach as:

>>> list(enumerate(3, 'abcdefg'))


So, more verbose than the existing approach, and ambiguous to boot - as
Raymond noted, with the first it really isn't clear whether the first
value returned would be (3, 'd') or (3, 'a'), and with the second form
it isn't clear whether we're skipping the first three items, or
returning only those items.

Let's keep the builtins simple, and let itertools handle the variants -
that's why the module exists.
msg66778 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-05-13 10:06
Mentioning the zip(count(start), itr) version in the enumerate() docs
may be a good idea though.

(And of course, in 2.x, it should be izip() rather than zip() to
preserve the memory efficiency of enumerate())
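
A minimal sketch of the zip(count(start), itr) idiom, written for Python 3 where zip() is already lazy. The helper name enumerate_from is hypothetical and exists only for this illustration:

```python
from itertools import count, islice


def enumerate_from(iterable, start=0):
    """Hypothetical helper: pair each item with a counter beginning at start."""
    return zip(count(start), iterable)


pairs = list(enumerate_from('abcdefg', 3))
print(pairs[:2])  # [(3, 'a'), (4, 'b')]

# Because zip() is lazy in Python 3, this also works on an unbounded iterable.
first_three = list(islice(enumerate_from(count(100), 1), 3))
print(first_three)  # [(1, 100), (2, 101), (3, 102)]
```

On 2.x the same sketch would need itertools.izip in place of zip to stay lazy.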
msg66783 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-05-13 14:24
> Thanks. I think this part is the main reason I see a start argument to
> enumerate as potentially problematic:
>
> """all variants can easily be misread as starting at the nth item in the
>   sequence (much like islice() does now):   enumerate(3, 'abcdefg') -->
>   (3,'d') (4,'e') (5, 'f') (6, 'g')."""

So the ambiguity is that enumerate(it, start=N) could be taken as
skipping the first N items of it rather than adding N to the index it
returns. (And it is my own argument!) I'd like to withdraw this
argument. There are two separate use cases for using enumerate(): one is
to iterate over a sequence and to have a handy index by which to update
the value in the sequence. Another is for 1-based counting, usually when
printing 1-based ordinals (such as line numbers in files, dates in a
month or months in a year, etc.). N-based counting is less common but
still conceivable. However I see no use for skipping items from the
start, and if that use case ever came up, passing a slice to enumerate()
would be the appropriate thing to do. In fact, if you passed in a slice,
you might also want to pass a corresponding start value so the indices
produced match those of the original sequence.
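
The slice-plus-start combination described above could look roughly like this. It is only a sketch, assuming a start= keyword on enumerate(); seq and offset are made-up names:

```python
seq = ['a', 'b', 'c', 'd', 'e']
offset = 2

# seq[offset:] is a copy, so updating seq inside the loop is safe.
# Passing start=offset keeps i aligned with positions in the original list.
for i, item in enumerate(seq[offset:], start=offset):
    seq[i] = item.upper()

print(seq)  # ['a', 'b', 'C', 'D', 'E']
```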

So, I am still in favor of adding a new argument to enumerate().

I'm neutral on the need for a keyword (don't think it would hurt, not
sure how much it matters). I'm strongly against making it an optional
*leading* argument like Raymond proposed; that's a style I just don't
want to promote, range() and the curses module notwithstanding.

> Is the need to use zip(count(3), seq) for the offset index case really such
> a burden given the associated benefits in keeping the builtin function
> really simple and easy to understand?

Yes, zip(count(3), seq) is too complex for this simple use case. I've
always solved this so far with this less-than-elegant but certainly
simpler idiom (except for users stuck in the tradition of for-loops in
certain older languages :-):

for i, line in enumerate(lines):
  i += 1
  print "%4d. %s" % (i, line)

and variants thereof.
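
For comparison, a sketch of the same line-numbering loop with a start value, assuming it is accepted as a second positional argument (which is what the thread goes on to settle on). The sample lines are invented for the illustration:

```python
lines = ['first', 'second', 'third']

# Counting from 1 removes the need for the manual "i += 1" step.
numbered = ["%4d. %s" % (i, line) for i, line in enumerate(lines, 1)]

for text in numbered:
    print(text)
```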
msg66789 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-13 18:34
Okay. I'm against making the argument keyword-only -- IMO keyword-only
arguments really should only be used in cases where their existence has
some advantage, like for max().
msg66790 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-05-13 18:35
Sure, fine.
msg66792 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-13 19:05
Okay, committed a matching patch in r63208. Thank you all!
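
For readers who want the resulting behaviour spelled out, here is a rough pure-Python approximation. It is not the C code from r63208; the name enumerate_sketch is hypothetical, and the sketch only assumes that start defaults to 0 and may be given positionally or by keyword:

```python
def enumerate_sketch(iterable, start=0):
    """Yield (n, item) pairs, counting up from start (illustrative only)."""
    n = start
    for item in iterable:
        yield n, item
        n += 1


print(list(enumerate_sketch('abc')))           # [(0, 'a'), (1, 'b'), (2, 'c')]
print(list(enumerate_sketch('abc', 1)))        # [(1, 'a'), (2, 'b'), (3, 'c')]
print(list(enumerate_sketch('abc', start=1)))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```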
msg105111 - (view) Author: George Sakkis (gsakkis) Date: 2010-05-05 23:56
Just discovered this by chance; I would probably have noticed it earlier if the docstring had been updated. Let me know if it needs a new documentation bug ticket and I'll create one.

Pretty handy feature by the way, thanks for adding it!
msg105145 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-05-06 13:16
Created issue 8635 for the incomplete docstring
msg105148 - (view) Author: Scott Dial (scott.dial) Date: 2010-05-06 13:49
Created issue8636 for the broken test cases.
History
Date User Action Args
2010-05-06 13:49:14 scott.dial set messages: + msg105148
2010-05-06 13:16:37 ncoghlan set messages: + msg105145
2010-05-05 23:56:32 gsakkis set nosy: + gsakkis; messages: + msg105111
2008-05-13 19:05:43 georg.brandl set status: open -> closed; resolution: accepted; messages: + msg66792
2008-05-13 18:35:06 gvanrossum set messages: + msg66790
2008-05-13 18:34:08 georg.brandl set nosy: + georg.brandl; messages: + msg66789
2008-05-13 14:26:33 gvanrossum set nosy: + gvanrossum; messages: + msg66783
2008-05-13 10:06:40 ncoghlan set messages: + msg66778
2008-05-13 09:50:04 ncoghlan set nosy: + ncoghlan; messages: + msg66776
2008-05-12 07:00:02 rhettinger set messages: + msg66712
2008-05-12 06:35:24 scott.dial set messages: + msg66711
2008-05-12 06:23:54 rhettinger set messages: + msg66710
2008-05-12 06:19:12 rhettinger set nosy: + rhettinger; messages: + msg66709
2008-05-12 03:54:36 scott.dial set files: - enumerate.diff
2008-05-12 03:54:32 scott.dial set files: + enumerate.diff
2008-05-12 03:53:21 scott.dial set files: - enumerate.diff
2008-05-12 03:53:07 scott.dial set files: + enumerate.diff
2008-05-12 03:49:33 scott.dial create