classification
Title: itertools.grouper
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.1, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: cvrebert, eric.snow, lieryan, rhettinger
Priority: normal Keywords:

Created on 2009-05-14 17:05 by lieryan, last changed 2012-06-29 22:20 by eric.snow. This issue is now closed.

Messages (4)
msg87743 - Author: Lie Ryan (lieryan) Date: 2009-05-14 17:05
An itertool to Group-by-n 

>>> lst = range(15)
>>> itertools.grouper(lst, 5)
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]

This function is often asked for in c.l.py discussions, such as these:
http://comments.gmane.org/gmane.comp.python.general/623377
http://comments.gmane.org/gmane.comp.python.general/622763

There are several issues. What should be done if the number of items in
the original list is not exactly divisible?
- raise an error as default
- pad with a value from 3rd argument
- make the last one shorter, maybe selected by a keyword argument or by
passing a sentinel as the 3rd argument

or should there be separate functions for each of them?

What about infinite iterables? Most recipes for this function use zip,
which breaks on infinite input.
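
For reference, the "make the last one shorter" option can be sketched with islice. This is only an illustration, not an existing itertools function: the name chunks is hypothetical, and returning lists rather than tuples is an assumption taken from the example output above. Because it pulls at most n items at a time, it stays lazy and can be applied to an unbounded iterator.

```python
from itertools import count, islice

def chunks(iterable, n):
    """Yield lists of up to n items; the last list may be shorter.

    Hypothetical helper for illustration only (not part of itertools).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))  # take at most n items
        if not chunk:                # source exhausted
            return
        yield chunk

# Finite input whose length is not a multiple of n:
print(list(chunks(range(7), 3)))            # [[0, 1, 2], [3, 4, 5], [6]]

# Unbounded input: only the requested groups are consumed.
print(list(islice(chunks(count(), 3), 2)))  # [[0, 1, 2], [3, 4, 5]]
```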
msg87745 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-05-14 17:13
This has been rejected before. 

* It is not a fundamental itertool primitive.  The recipes section in
the docs shows a clean, fast implementation derived from zip_longest().  

* There is some debate on a correct API for odd lengths.  Some people
want an exception, some want fill-in values, some want truncation, and
some want a partially filled-in tuple.  That alone is reason enough not
to set one behavior in stone.

* There is an issue with having too many itertools.  The module taken as
a whole becomes more difficult to use as new tools are added.
msg87750 - Author: Lie Ryan (lieryan) Date: 2009-05-14 18:20
All implementations relying on zip or zip_longest break with an infinite
iterable (e.g. itertools.count()).

And it is not impossible to define a clean, flexible, and familiar API
similar to open()'s mode or the unicode error mode. The modes would be
'error' (default), 'pad', 'truncate', and 'partial' (though 'partial'
may need a better name).
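
To make the proposal concrete, here is a rough sketch of what such a mode argument could look like. Everything in it is an assumption for illustration: the signature, the parameter names mode and fillvalue, and the choice of tuples for the groups are not an existing or agreed API.

```python
from itertools import islice

def grouper(iterable, n, mode="error", fillvalue=None):
    """Group items n at a time; `mode` controls a short final group.

    Hypothetical signature sketching the proposed modes:
    'error', 'pad', 'truncate', 'partial'.
    """
    if mode not in ("error", "pad", "truncate", "partial"):
        raise ValueError("unknown mode: %r" % (mode,))
    it = iter(iterable)
    while True:
        group = tuple(islice(it, n))
        if not group:
            return
        if len(group) < n:
            if mode == "error":
                raise ValueError("length is not a multiple of n")
            if mode == "truncate":
                return  # drop the incomplete group
            if mode == "pad":
                group += (fillvalue,) * (n - len(group))
            # mode == "partial": yield the short group unchanged
        yield group

print(list(grouper(range(7), 3, mode="pad")))       # [(0, 1, 2), (3, 4, 5), (6, None, None)]
print(list(grouper(range(7), 3, mode="truncate")))  # [(0, 1, 2), (3, 4, 5)]
print(list(grouper(range(7), 3, mode="partial")))   # [(0, 1, 2), (3, 4, 5), (6,)]
```

Since this is a generator, both the mode check and the 'error' case only raise once iteration starts, and 'error' raises after the complete groups have already been yielded.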

> There is an issue with having too many itertools.  
> The module taken as a whole becomes more 
> difficult to use as new tools are added.

It should also be weighed that a lot of people expect this kind of
function in itertools. I think there are other functions in itertools
that have more questionable value than grouper, such as starmap.
msg87756 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-05-14 19:09
> All implementations relying on zip or zip_longest break
> with an infinite iterable (e.g. itertools.count()).

How is it broken?  
Infinite in, infinite out.

>>> from itertools import count, zip_longest
>>> def grouper(n, iterable, fillvalue=None):
...    args = [iter(iterable)] * n
...    return zip_longest(*args, fillvalue=fillvalue)

>>> g = grouper(3, count())
>>> next(g)
(0, 1, 2)
>>> next(g)
(3, 4, 5)
>>> next(g)
(6, 7, 8)
>>> next(g)
(9, 10, 11)

> And it is not impossible to define a clean, flexible, 
> and familiar API which will be similar to open()'s mode
> or unicode error mode. The modes would be 'error' 
> (default), 'pad', 'truncate', and 'partial'

Of course, it's possible.  I find that to be bad design.  Generally, we
follow Guido's advice and create separate functions rather than overload
a single function with flags -- that is why we have filterfalse()
instead of a flag on filter().  When people suggest an API with multiple
flags, it can be a symptom of hyper-generalization where API complexity
gets substituted for writing a simple function that does what you want
in the first place.  IMO, it is easier to learn the zip(g, g, g) idiom
and customize it to your own needs than to learn a new tool with four
flag options that control its output signature.
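
As a short illustration of the zip(g, g, g) idiom mentioned above (the sample data and variable names are made up for this sketch): passing the same iterator object several times makes each output tuple draw consecutive items from it. Swapping zip for zip_longest changes how a short final group is handled.

```python
from itertools import zip_longest

data = range(8)  # 8 items, not a multiple of 3

# zip stops at the first exhausted argument, so the leftover
# items (6 and 7) are consumed but dropped: truncation.
g = iter(data)
truncated = list(zip(g, g, g))
print(truncated)   # [(0, 1, 2), (3, 4, 5)]

# zip_longest fills the missing slots instead: padding.
g = iter(data)
padded = list(zip_longest(g, g, g, fillvalue=None))
print(padded)      # [(0, 1, 2), (3, 4, 5), (6, 7, None)]
```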
History
Date                 User        Action  Args
2012-06-29 22:20:33  eric.snow   set     nosy: + eric.snow
2009-05-15 04:17:14  cvrebert    set     nosy: + cvrebert
2009-05-14 19:09:48  rhettinger  set     messages: + msg87756
2009-05-14 18:20:50  lieryan     set     messages: + msg87750
2009-05-14 17:13:30  rhettinger  set     status: open -> closed
                                         assignee: rhettinger
                                         versions: + Python 3.1, Python 2.7
                                         nosy: + rhettinger
                                         messages: + msg87745
                                         resolution: rejected
2009-05-14 17:05:29  lieryan     create