msg66705 - (view) |
Author: Scott Dial (scott.dial) |
Date: 2008-05-12 03:49 |
Georg Brandl suggested enumerate() should have the ability to start on
an arbitrary number (instead of always starting at 0). I suggest such a
parameter should be keyword-only. Attached is a patch to add such a
feature along with added test cases. Documentation still needs to be
updated, but I wasn't sure how best to handle that anyways.
I wasn't sure how best to handle a keyword-only argument, so I'd be
interested to know if there is a better way.
|
msg66709 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-05-12 06:19 |
If a start argument gets accepted, it should be positional, not a
keyword-only argument. That is a complete waste when there is just one
argument with a straightforward interpretation.
Besides, METH_O is a lot faster than the alternatives.
|
msg66710 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-05-12 06:23 |
Forget the part about METH_O. That was incorrect.
Another idea is to order the positional args as ([start,] iterator).
That corresponds with range([start,] stop) and it matches the output
order (number, element):
for i, element in enumerate(10, iterable):
    ^-----------------------^
       ^-------------------------^
|
msg66711 - (view) |
Author: Scott Dial (scott.dial) |
Date: 2008-05-12 06:35 |
As it stands, enumerate() already takes a "sequence" keyword as an
alternative to the first positional argument (although this seems to be
completely undocumented). So, as you say, METH_O is a no go.
I agree with you in that my original complaint with the positional
argument was that enumerate(iterable, start) was "backwards." My other
argument was that a large number of these iterator utility functions are
foo(*iterable) and upon seeing enumerate(foo, bar), a reader might be
inclined to assume it was equivalent to enumerate(chain(foo, bar)).
|
msg66712 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-05-12 07:00 |
FWIW, at one point, Guido rejected all variants of the idea. His first
objection was that enumerate() is all about pairing values with
sequence indices, so starting from anything other than zero is in
conflict with the core concept. His second objection is that all
variants can easily be misread as starting at the nth item in the
sequence (much like islice() does now): enumerate(3, 'abcdefg') -->
(3,'d') (4,'e') (5, 'f') (6, 'g'). The latter mis-reading becomes more
likely for those who think of enumerate as providing indices. In fact,
one of the suggested names for enumerate was "indices".
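A minimal sketch of the two readings described above, assuming a Python version in which enumerate() accepts a start argument (2.6 or later). The variable names are illustrative only, and islice() stands in for the "skip the first n items" interpretation:

```python
from itertools import islice

letters = 'abcdefg'

# Reading 1: counting begins at 3, but no items are skipped.
offset_pairs = list(enumerate(letters, start=3))

# Reading 2 (the feared misreading): skip the first 3 items and
# report each remaining item with its true index.
skipped_pairs = list(islice(enumerate(letters), 3, None))

print(offset_pairs[0])   # (3, 'a')
print(skipped_pairs[0])  # (3, 'd')
```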
|
msg66776 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-05-13 09:49 |
Note that this functionality is currently available as follows:
>>> from itertools import count
>>> list(zip(count(3), 'abcdefg'))
[(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd'), (7, 'e'), (8, 'f'), (9, 'g')]
The enumerate(itr) builtin is just a convenience to avoid a module
import for the most basic zip(count(), itr) version.
The proposed patch would enable the example above to be written more
verbosely as:
>>> list(enumerate('abcdefg', start=3))
Or, with the positional argument approach as:
>>> list(enumerate(3, 'abcdefg'))
So, more verbose than the existing approach, and ambiguous to boot - as
Raymond noted, with the first it really isn't clear whether the first
value returned would be (3, 'd') or (3, 'a'), and with the second form
it isn't clear whether we're skipping the first three items, or
returning only those items.
Let's keep the builtins simple, and let itertools handle the variants -
that's why the module exists.
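As a rough check of the equivalence claimed here, the following sketch compares the two spellings. It assumes Python 3 (where zip() is lazy) and a version of enumerate() that accepts start; the names are placeholders:

```python
from itertools import count

letters = 'abcdefg'

# The itertools spelling: pair an unbounded counter with the items.
via_zip = list(zip(count(3), letters))

# The builtin spelling with an explicit starting number.
via_enumerate = list(enumerate(letters, start=3))

print(via_zip == via_enumerate)  # True
```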
|
msg66778 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-05-13 10:06 |
Mentioning the zip(count(start), itr) version in the enumerate() docs
may be a good idea though.
(And of course, in 2.x, it should be izip() rather than zip() to
preserve the memory efficiency of enumerate())
|
msg66783 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-05-13 14:24 |
> Thanks. I think this part is the main reason I see a start argument to
> enumerate as potentially problematic:
>
> """all variants can easily be misread as starting at the nth item in the
> sequence (much like islice() does now): enumerate(3, 'abcdefg') -->
> (3,'d') (4,'e') (5, 'f') (6, 'g')."""
So the ambiguity is that enumerate(it, start=N) could be taken as
skipping the first N items of it rather than adding N to the index it
returns. (And it is my own argument!) I'd like to withdraw this
argument. There are two separate use cases for using enumerate(): one is
to iterate over a sequence and to have a handy index by which to update
the value in the sequence. Another is for 1-based counting, usually when
printing 1-based ordinals (such as line numbers in files, dates in a
month or months in a year, etc.). N-based counting is less common but
still conceivable. However I see no use for skipping items from the
start, and if that use case ever came up, passing a slice to enumerate()
would be the appropriate thing to do. In fact, if you passed in a slice,
you might also want to pass a corresponding start value so the indices
produced match those of the original sequence.
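The slice-plus-start combination mentioned above might look like the sketch below. It assumes an enumerate() that accepts start, and the names are illustrative:

```python
letters = 'abcdefg'

# Skip the first three items by slicing, and pass the same offset as
# start so each index still refers to the original sequence.
tail = list(enumerate(letters[3:], start=3))

for i, ch in tail:
    # The reported index can be used directly on the original sequence.
    assert letters[i] == ch

print(tail[0])  # (3, 'd')
```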
So, I am still in favor of adding a new argument to enumerate().
I'm neutral on the need for a keyword (don't think it would hurt, not
sure how much it matters). I'm strongly against making it an optional
*leading* argument like Raymond proposed; that's a style I just don't
want to promote, range() and the curses module notwithstanding.
> Is the need to use zip(count(3), seq) for the offset index case really
> such a burden given the associated benefits in keeping the builtin
> function really simple and easy to understand?
Yes, zip(count(3), seq) is too complex for this simple use case. I've
always solved this so far with this less-than-elegant but certainly
simpler idiom (except for users stuck in the tradition of for-loops in
certain older languages :-):
for i, line in enumerate(lines):
    i += 1
    print("%4d. %s" % (i, line))
and variants thereof.
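For comparison, here is a sketch of the same 1-based numbering written with a start value, assuming an enumerate() that accepts one. The sample lines are made up:

```python
lines = ['first', 'second', 'third']

# Start counting at 1 so no manual "i += 1" is needed in the loop body.
numbered = ['%4d. %s' % (i, line) for i, line in enumerate(lines, 1)]

for row in numbered:
    print(row)
```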
|
msg66789 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2008-05-13 18:34 |
Okay. I'm against making the argument keyword-only -- IMO keyword-only
arguments really should only be used in cases where their existence has
some advantage, like for max().
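To illustrate the distinction, the sketch below assumes the start argument ends up as an ordinary (positional-or-keyword) parameter and contrasts it with max(), whose key argument must be passed by keyword because max() accepts any number of positional values. The sample data is made up:

```python
# An ordinary parameter can be given either way.
by_position = list(enumerate('ab', 1))
by_keyword = list(enumerate('ab', start=1))
print(by_position == by_keyword)  # True

# max() takes arbitrary positional values, so key has to be a keyword.
longest = max(['pear', 'fig', 'banana'], key=len)
print(longest)  # banana
```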
|
msg66790 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-05-13 18:35 |
Sure, fine.
|
msg66792 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2008-05-13 19:05 |
Okay, committed a matching patch in r63208. Thank you all!
|
msg105111 - (view) |
Author: George Sakkis (gsakkis) |
Date: 2010-05-05 23:56 |
Just discovered this by chance; I would probably have noticed it earlier if the docstring had been updated. Let me know if it needs a new documentation bug ticket and I'll create one.
Pretty handy feature by the way, thanks for adding it!
|
msg105145 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2010-05-06 13:16 |
Created issue 8635 for the incomplete docstring
|
msg105148 - (view) |
Author: Scott Dial (scott.dial) |
Date: 2010-05-06 13:49 |
Created issue 8636 for the broken test cases.
|
Date | User | Action | Args
2022-04-11 14:56:34 | admin | set | github: 47080
2010-05-06 13:49:14 | scott.dial | set | messages: + msg105148
2010-05-06 13:16:37 | ncoghlan | set | messages: + msg105145
2010-05-05 23:56:32 | gsakkis | set | nosy: + gsakkis; messages: + msg105111
2008-05-13 19:05:43 | georg.brandl | set | status: open -> closed; resolution: accepted; messages: + msg66792
2008-05-13 18:35:06 | gvanrossum | set | messages: + msg66790
2008-05-13 18:34:08 | georg.brandl | set | nosy: + georg.brandl; messages: + msg66789
2008-05-13 14:26:33 | gvanrossum | set | nosy: + gvanrossum; messages: + msg66783
2008-05-13 10:06:40 | ncoghlan | set | messages: + msg66778
2008-05-13 09:50:04 | ncoghlan | set | nosy: + ncoghlan; messages: + msg66776
2008-05-12 07:00:02 | rhettinger | set | messages: + msg66712
2008-05-12 06:35:24 | scott.dial | set | messages: + msg66711
2008-05-12 06:23:54 | rhettinger | set | messages: + msg66710
2008-05-12 06:19:12 | rhettinger | set | nosy: + rhettinger; messages: + msg66709
2008-05-12 03:54:36 | scott.dial | set | files: - enumerate.diff
2008-05-12 03:54:32 | scott.dial | set | files: + enumerate.diff
2008-05-12 03:53:21 | scott.dial | set | files: - enumerate.diff
2008-05-12 03:53:07 | scott.dial | set | files: + enumerate.diff
2008-05-12 03:49:33 | scott.dial | create |