classification
Title: idiom for clustering a data series into n-length groups
Type: enhancement
Stage:
Components: Documentation
Versions: Python 3.4, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: Paddy McCarthy, docs@python, ethan.furman, python-dev, r.david.murray, rhettinger
Priority: normal
Keywords:

Created on 2015-03-18 00:18 by Paddy McCarthy, last changed 2015-05-13 09:35 by rhettinger. This issue is now closed.

Messages (6)
msg238365 - Author: Paddy McCarthy (Paddy McCarthy) Date: 2015-03-18 00:18
In the zip section of the documentation (e.g. https://docs.python.org/3/library/functions.html#zip), there is mention of an idiom for clustering a data series into n-length groups. I only seem to come across it when people are explaining how it works in blog entries, such as the three mentioned here: http://www.reddit.com/r/programming/comments/2z4rv4/a_function_for_partitioning_python_arrays/cpfvwun?context=3

It is not a straightforward bit of code, so I think it should either be explained in more detail in the documentation or removed as an idiom. Alternatively, it could be encapsulated in a function and added to the stdlib.
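
As a rough sketch of the "encapsulate it in a function" option, the idiom could be wrapped as below. The name cluster() is hypothetical and is not an existing stdlib function; the sketch assumes that dropping an incomplete final group is acceptable, which is what plain zip() does.

```python
def cluster(iterable, n):
    """Hypothetical helper: yield successive n-length tuples from iterable.

    Leftover items that do not fill a complete group are silently dropped,
    because zip() stops as soon as any of its arguments is exhausted.
    """
    # [iter(iterable)] * n is a list holding n references to ONE iterator.
    return zip(*[iter(iterable)] * n)


full = list(cluster([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))
partial = list(cluster(range(7), 3))  # the trailing 6 is dropped
```

The itertools recipes in the library documentation include a similar grouper() built on zip_longest, which pads the final group with a fill value instead of dropping it.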
msg238369 - Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-03-18 01:00
I think an example should suffice:

>>> s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> n = 3
>>> zip(*[iter(s)]*n)
[(1, 2, 3), (4, 5, 6), (7, 8, 9)]
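
The session above shows Python 2.7 behaviour, where zip() returns a list. On Python 3, zip() returns a lazy iterator, so a minimal sketch of the same example wraps the call in list() to display the groups:

```python
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3

# On Python 3, zip() is lazy; list() materialises the groups for display.
groups = list(zip(*[iter(s)] * n))
print(groups)  # [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
```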
msg238378 - Author: Paddy McCarthy (Paddy McCarthy) Date: 2015-03-18 05:10
Hmmm. It seems that the problem isn't to do with the fact that it works, or how to apply it; the problem is with *how* it works. 

Making it an idiom means that too many will use it without knowing why it works, which could lead to later maintenance issues. I think a better description of how it works may be needed for the docs.

Unfortunately my description of the how at http://paddy3118.blogspot.co.uk/2012/12/that-grouping-by-zipiter-trick-explained.html was not written with the docs in mind, but you are welcome to any part or the whole, for the Python docs.
msg238453 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-18 15:33
I think it would be both helpful and sufficient to add a gloss, perhaps something like: "this passes zip ``n`` references to the *same* iterator, which means zip calls that single iterator ``n`` times for each tuple it creates; zip thus outputs tuples consisting of ``n`` length chunks from the iterator ``s``".
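
The gloss above can be illustrated by unrolling the idiom by hand. This is only a sketch of the behaviour described, not the wording or example that ended up in the docs; the variable names are illustrative.

```python
s = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3

it = iter(s)
refs = [it] * n  # n references to the *same* iterator object
assert all(r is it for r in refs)

# For each output tuple, zip calls next() on each argument in turn.
# Since every argument is the same iterator, each call advances it.
first = tuple(next(r) for r in refs)   # consumes 1, 2, 3
second = tuple(next(r) for r in refs)  # consumes 4, 5, 6

# Contrast: n *separate* iterators each start from the beginning of s,
# so every tuple just repeats one element n times.
separate = list(zip(*[iter(s) for _ in range(n)]))
```

Here first is (1, 2, 3) and second is (4, 5, 6), whereas separate begins with (1, 1, 1). The grouping therefore depends entirely on sharing one iterator.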
msg238465 - Author: Paddy McCarthy (Paddy McCarthy) Date: 2015-03-18 19:01
I like R. David Murray's suggestion, but I already know how the idiom works, so I cannot judge how the explanation would look to an intermediate Python programmer who knows iterators and zip but is new to this grouper (and who I think should be the target audience).
msg243064 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-05-13 09:34
New changeset f7d82e40e472 by Raymond Hettinger in branch 'default':
Issue #23695: Explain the zip() example for clustering a data series into n-length groups.
https://hg.python.org/cpython/rev/f7d82e40e472
History
Date                 User            Action  Args
2015-05-13 09:35:03  rhettinger      set     status: open -> closed; resolution: fixed
2015-05-13 09:34:48  python-dev      set     nosy: + python-dev; messages: + msg243064
2015-03-18 19:01:54  Paddy McCarthy  set     messages: + msg238465
2015-03-18 15:33:52  r.david.murray  set     nosy: + r.david.murray; messages: + msg238453
2015-03-18 05:10:00  Paddy McCarthy  set     messages: + msg238378
2015-03-18 04:03:30  rhettinger      set     assignee: docs@python -> rhettinger; nosy: + rhettinger
2015-03-18 01:00:03  ethan.furman    set     nosy: + ethan.furman; messages: + msg238369; versions: - Python 3.2, Python 3.3
2015-03-18 00:18:02  Paddy McCarthy  create