classification
Title: Adding itertools.pairwise to the standard library?
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: bbayles, della, rhettinger, tim.peters, veky
Priority: normal Keywords:

Created on 2019-09-17 16:00 by della, last changed 2019-12-01 04:01 by bbayles.

Messages (6)
msg352642 - Author: Matteo Dell'Amico (della) Date: 2019-09-17 16:00
I use the pairwise recipe from the itertools documentation all the time, and I wonder whether the same happens to others. If others are in the same situation, having this simple recipe included in the library would definitely be more convenient than copy/pasting it. It may also improve its visibility...
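For reference, a sketch of the recipe under discussion, written the way it appeared in the itertools documentation recipes section at the time (a plain-Python helper, not a built-in):

```python
from itertools import tee


def pairwise(iterable):
    """Return overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)   # two independent iterators over the same input
    next(b, None)          # advance the second one by a single item
    return zip(a, b)       # stops when the (shorter) second iterator ends


if __name__ == "__main__":
    print(list(pairwise("ABCD")))
```

An input with fewer than two items produces no pairs at all.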
msg352660 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-09-17 19:53
Can you show some examples of what you used it for?
msg352696 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-09-18 05:54
FWIW, pairwise() is in the more-itertools module.  A quick code search does show occasional use in the wild:  https://github.com/search?q=language%3Apython+more_itertools.pairwise&type=Code  

In my own code, I've had some cases that almost fit but they needed custom padding on one end or both ends:  zip([0.0] + data, data + [1.0]). Also, it's unclear where anyone would want a wider sliding window.
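A small illustration of the padded form mentioned above. The sample values and the interpretation as interval widths are assumptions made for the example, not taken from the original code:

```python
# Hypothetical sample: sorted cut points strictly between 0.0 and 1.0.
data = [0.25, 0.5, 0.75]

# Pair every value with its successor, padding with 0.0 on the left
# and 1.0 on the right so the first and last intervals are included.
padded_pairs = list(zip([0.0] + data, data + [1.0]))

# Width of each interval between consecutive boundaries.
widths = [hi - lo for lo, hi in padded_pairs]
```

Because this relies on list concatenation, it only works when `data` is a list.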

Tim, do you have any thoughts about pairwise()?
msg352749 - Author: Vedran Čačić (veky) * Date: 2019-09-18 18:25
I also use it all the time. Most recently in some numerical calculation for successive differences. My main problem is that I'm too often tempted to just zip l with l[1:], thereby restricting the code to sequences, when it would work perfectly well for any iterable. (Just as I sometimes write range(lots of nines) instead of itertools.count() *shame*;)
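To make the sequence-versus-iterable point concrete, here is a sketch of successive differences written both ways. The function names are hypothetical, and `pairwise` is the documentation recipe repeated so the example runs on its own:

```python
from itertools import tee


def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def diffs_sequence_only(seq):
    # seq[1:] needs slicing support, so a generator raises TypeError here.
    return [y - x for x, y in zip(seq, seq[1:])]


def diffs_any_iterable(iterable):
    # Works for lists, generators, file objects, and any other iterable.
    return [y - x for x, y in pairwise(iterable)]


if __name__ == "__main__":
    squares = (n * n for n in range(5))  # a generator, not a sequence
    print(diffs_any_iterable(squares))
```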
msg352752 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2019-09-18 19:15
There's an eternal culture clash here:  functional languages have a long history of building in just about everything of plausible use, regardless of how trivial it is to build on other stuff.  This started when LISP was barely released before (cadr x) was introduced as a shorthand for (car (cdr x)), and (caddr x) for (car (cdr (cdr x))), and so on.  More modern functional languages also supply (second x) and (third x) spellings for those (_and_ nth(2, x) and nth(3, x) spellings).

This one is harder to get right than those, but not hard at all.  Still, it's no coincidence that itertoolz[1] (note the trailing 'z') also supplies it, spelled `sliding_window(width, iterable)` there.  Working with finite difference algorithms is probably "the most obvious" use case for a width of 2.

More esoterically, one of my favorite "technical indicators" for stock analysis is a dead simple 20-period simple moving average, which can be built very conveniently (although not very efficiently - but I don't usually care) by mapping a mean function over a sliding window of width 20.
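A minimal sketch of that idea, assuming a hand-written `sliding_window` helper that only mirrors the argument order quoted above (it is not the toolz implementation) and a hypothetical `simple_moving_average` wrapper:

```python
from collections import deque
from itertools import islice
from statistics import mean


def sliding_window(width, iterable):
    """Yield overlapping tuples of length `width` (assumes width >= 1)."""
    it = iter(iterable)
    window = deque(islice(it, width), maxlen=width)
    if len(window) == width:
        yield tuple(window)
    for item in it:
        window.append(item)   # maxlen drops the oldest item automatically
        yield tuple(window)


def simple_moving_average(prices, period=20):
    # Convenient but not efficient: every window is averaged from scratch.
    return map(mean, sliding_window(period, prices))
```

With a width of 2 the helper produces the same pairs as the pairwise recipe.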

BTW, if you want padding on each end, you can apply pairwise to `chain([first], iterable, [last])`.
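Spelled out, that suggestion might look like the following sketch. `padded_pairwise` is a hypothetical name, and `pairwise` is the documentation recipe repeated for completeness:

```python
from itertools import chain, tee


def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def padded_pairwise(iterable, first, last):
    # chain() accepts any iterable, so no list concatenation is needed.
    return pairwise(chain([first], iterable, [last]))


if __name__ == "__main__":
    readings = iter([0.25, 0.5])  # works even for a one-shot iterator
    print(list(padded_pairwise(readings, 0.0, 1.0)))
```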

A related function is breaking an iterable into _non_-overlapping chunks of a given width.  itertoolz spells that "partition".  For me that comes up more often than overlapping windows.
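A sketch of the non-overlapping variant, using a hypothetical `chunks` helper. Dropping a short trailing chunk is a choice made for this example; padding it or yielding it as-is would be equally reasonable:

```python
from itertools import islice


def chunks(width, iterable):
    """Yield consecutive, non-overlapping tuples of length `width`."""
    if width < 1:
        raise ValueError("width must be at least 1")
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, width))
        if len(chunk) < width:
            return  # input exhausted; an incomplete tail is discarded
        yield chunk
```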

I like having these things around, but it's not a big deal.  Perhaps it would be an easier decision in Python if we gave up on believing that everything in itertools _must_ be coded in C.  In functional idioms sometimes speed isn't the point at all, but rather using conventional names for simple but compound functionality.  Like that "sliding window" is a concept in its own right.  If I'm _picturing_ an algorithm in terms of a sliding window, then - of course - the shortest distance to working code is to use a facility that already implements that concept.

Which is a long way of saying +0.

[1] https://toolz.readthedocs.io/en/latest/api.html
msg353655 - Author: Matteo Dell'Amico (della) Date: 2019-10-01 09:22
Sorry for taking so long to answer, I didn't see notifications somehow.

Raymond, my use case generally comes up when I'm doing analytics on sequences of events (e.g., URLs visited by a browser) or paths in a graph. I look at pairs and do something based on the pair of events (e.g., did the user likely click an advertising link? did they go to a potentially risky webpage, possibly by clicking a link?)

I see the argument for generalizing to a sliding window, although that may lead people to choose inefficient algorithms for a sliding average or median.
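To illustrate the efficiency concern: averaging each window from scratch costs work proportional to the window width per output, whereas a running total needs only constant work per item. A minimal sketch, with `running_mean` as a hypothetical name:

```python
from collections import deque


def running_mean(iterable, width):
    """Yield the mean of each full window using a running total."""
    window = deque()
    total = 0.0
    for x in iterable:
        window.append(x)
        total += x
        if len(window) > width:
            total -= window.popleft()  # remove the item leaving the window
        if len(window) == width:
            yield total / width
```

With floating-point input the running total can accumulate rounding error over long streams, so this trades some accuracy for speed.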
History
Date                 User        Action  Args
2019-12-01 04:01:35  bbayles     set     nosy: + bbayles
2019-10-01 09:22:30  della       set     messages: + msg353655
2019-09-18 19:15:56  tim.peters  set     messages: + msg352752
2019-09-18 18:25:55  veky        set     nosy: + veky
                                         messages: + msg352749
2019-09-18 05:54:06  rhettinger  set     nosy: + tim.peters
                                         messages: + msg352696
2019-09-17 19:53:47  rhettinger  set     assignee: rhettinger
                                         messages: + msg352660
2019-09-17 16:06:44  xtreak      set     nosy: + rhettinger
                                         type: enhancement
                                         versions: + Python 3.9
2019-09-17 16:00:08  della       create