Adding itertools.pairwise to the standard library? #82381

Closed
della mannequin opened this issue Sep 17, 2019 · 7 comments
Labels
3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@della
Mannequin

della mannequin commented Sep 17, 2019

BPO 38200
Nosy @tim-one, @rhettinger, @vedgar, @bbayles
PRs
  • bpo-38200: Add itertools.pairwise() #23549
Files
  • pairwise.py: Sketch for a C implementation
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2020-12-01.04:43:46.901>
    created_at = <Date 2019-09-17.16:00:08.480>
    labels = ['type-feature', 'library', '3.9']
    title = 'Adding itertools.pairwise to the standard library?'
    updated_at = <Date 2020-12-01.04:43:46.901>
    user = 'https://bugs.python.org/della'

    bugs.python.org fields:

    activity = <Date 2020-12-01.04:43:46.901>
    actor = 'rhettinger'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2020-12-01.04:43:46.901>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2019-09-17.16:00:08.480>
    creator = 'della'
    dependencies = []
    files = ['49633']
    hgrepos = []
    issue_num = 38200
    keywords = ['patch']
    message_count = 7.0
    messages = ['352642', '352660', '352696', '352749', '352752', '353655', '382214']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'rhettinger', 'della', 'veky', 'bbayles']
    pr_nums = ['23549']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38200'
    versions = ['Python 3.9']

    @della
    Mannequin Author

    della mannequin commented Sep 17, 2019

    I use the itertools pairwise recipe all the time, and I wonder whether others do too. If so, having this simple recipe included in the library would definitely be more convenient than copy/pasting it. It might also improve its visibility...
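
    For readers unfamiliar with it, the recipe under discussion can be sketched as follows. This follows the tee-based form long shown in the itertools documentation recipes; it is an illustration, not necessarily the code della was copying.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)   # two independent iterators over the same data
    next(b, None)          # advance the second iterator by one step
    return zip(a, b)       # zip stops as soon as b is exhausted

print(list(pairwise("ABCD")))   # [('A', 'B'), ('B', 'C'), ('C', 'D')]
```

    An input with fewer than two items simply produces no pairs.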

    @della della mannequin added the stdlib (Python modules in the Lib dir) label Sep 17, 2019
    @tirkarthi tirkarthi added the 3.9 (only security fixes) and type-feature (A feature request or enhancement) labels Sep 17, 2019
    @rhettinger
    Contributor

    Can you show some examples of what you used it for?

    @rhettinger rhettinger self-assigned this Sep 17, 2019
    @rhettinger
    Contributor

    FWIW, pairwise() is in the more-itertools module. A quick code search does show occasional use in the wild: https://github.com/search?q=language%3Apython+more_itertools.pairwise&type=Code

    In my own code, I've had some cases that almost fit but they needed custom padding on one end or both ends: zip([0.0] + data, data + [1.0]). Also, it's unclear where anyone would want a wider sliding window.

    Tim, do you have any thoughts about pairwise()?

    @vedgar
    Mannequin

    vedgar mannequin commented Sep 18, 2019

    I also use it all the time. Most recently in some numerical calculation for successive differences. My main problem is that I'm too often tempted to just zip l with l[1:], thereby restricting the code to sequences, when it would work perfectly well for any iterable. (Just as I sometimes write range(lots of nines) instead of itertools.count() *shame*;)
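
    A small sketch of the successive-differences case vedgar describes, assuming the tee-based pairwise recipe. The helper name successive_differences and the sample data are made up for illustration. It shows why the iterator version is more general than zip(l, l[1:]): a generator cannot be sliced.

```python
from itertools import tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def successive_differences(iterable):
    """Yield x[i+1] - x[i] for any iterable, not only for sequences."""
    return (later - earlier for earlier, later in pairwise(iterable))

# A generator has no [1:] slice, so zip(l, l[1:]) would raise TypeError here.
squares = (n * n for n in range(6))            # 0, 1, 4, 9, 16, 25
print(list(successive_differences(squares)))   # [1, 3, 5, 7, 9]
```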

    @tim-one
    Member

    tim-one commented Sep 18, 2019

    There's an eternal culture clash here: functional languages have a long history of building in just about everything of plausible use, regardless of how trivial to build on other stuff. This started when LISP was barely released before (cadr x) was introduced as a shorthand for (car (cdr x)), and (caddr x) for (car (cdr (cdr x))), and so on. More modern functional languages also supply (second x) and (third x) spellings for these (as well as nth(2, x) and nth(3, x) spellings).

    This one is harder to get right than those, but not hard at all. But it's not coincidence that itertoolz[1] (note the trailing 'z') also supplies it, spelled sliding_window(width, iterable) there. Working with finite difference algorithms is probably "the most obvious" use case for a width of 2.

    More esoterically, one of my favorite "technical indicators" for stock analysis is a dead simple 20-period simple moving average, which can be built very conveniently (although not very efficiently - but I don't usually care) by mapping a mean function over a sliding window of width 20.
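
    The moving-average idea can be sketched as below. The sliding_window here is a deque-based illustration with a hypothetical (iterable, width) argument order, not the toolz implementation, and simple_moving_average is a made-up helper name. As Tim notes, recomputing the mean for every window is convenient rather than efficient.

```python
from collections import deque
from itertools import islice
from statistics import mean

def sliding_window(iterable, width):
    """Yield overlapping tuples of length `width` from any iterable."""
    it = iter(iterable)
    # Pre-fill with the first width-1 items; maxlen evicts the oldest on append.
    window = deque(islice(it, width - 1), maxlen=width)
    for item in it:
        window.append(item)
        yield tuple(window)

def simple_moving_average(values, period):
    """Map a mean function over a sliding window of the given period."""
    return map(mean, sliding_window(values, period))

prices = [10, 11, 12, 13, 14]                    # made-up sample data
print(list(simple_moving_average(prices, 3)))    # [11, 12, 13]
```

    With width 2, sliding_window yields the same pairs as pairwise.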

    BTW, if you want padding on each end, you can apply pairwise to chain([first], iterable, [last]).
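
    A sketch of that padding trick, using the tee-based recipe and made-up sample data. It produces the same pairs as the zip([0.0] + data, data + [1.0]) expression mentioned earlier in the thread, but also works when data is an arbitrary iterable rather than a list.

```python
from itertools import chain, tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

data = [0.25, 0.5, 0.75]   # illustrative values only

# Pad both ends lazily instead of building two new lists.
padded_pairs = list(pairwise(chain([0.0], data, [1.0])))
print(padded_pairs)
# [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
```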

    A related function is breaking an iterable into _non_-overlapping chunks of a given width. itertoolz spells that "partition". For me that comes up more often than overlapping windows.
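
    For contrast with overlapping windows, non-overlapping chunking might look like the sketch below. The name chunks is hypothetical, and this is not the toolz code. Whether to keep or drop a short final chunk is a design choice; this version keeps it.

```python
from itertools import islice

def chunks(iterable, width):
    """Yield consecutive non-overlapping tuples of up to `width` items."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, width))   # take the next `width` items
        if not chunk:                      # iterator exhausted
            return
        yield chunk

print(list(chunks(range(7), 3)))   # [(0, 1, 2), (3, 4, 5), (6,)]
```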

    I like having these things around, but it's not a big deal. Perhaps it would be an easier decision in Python if we gave up on believing that everything in itertools _must_ be coded in C. In functional idioms sometimes speed isn't the point at all, but rather using conventional names for simple but compound functionality. Like that "sliding window" is a concept in its own right. If I'm _picturing_ an algorithm in terms of a sliding window, then - of course - the shortest distance to working code is to use a facility that already implements that concept.

    Which is a long way of saying +0.

    [1] https://toolz.readthedocs.io/en/latest/api.html

    @della
    Mannequin Author

    della mannequin commented Oct 1, 2019

    Sorry for taking so long to answer, I didn't see notifications somehow.

    Raymond, my use case is in general something that happens when I'm doing analytics on sequences of events (e.g., URLs visited by a browser) or paths in a graph. I look at pairs and do something based on the pair of events (e.g., did the user likely click an advertising link? did they go to a potentially risky webpage, possibly by clicking a link?)
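
    A toy version of this kind of event-pair analysis could look like the following. The URLs, the host helper, and the "arrived at an ad host" rule are all invented for illustration and are not taken from della's actual code.

```python
from itertools import tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def host(url):
    """Return the part of a scheme-less URL before the first slash."""
    return url.split("/", 1)[0]

def ad_transitions(urls, ad_host="ads.example"):
    """Return (previous, next) pairs where the visitor moved onto the ad host."""
    return [
        (prev, nxt)
        for prev, nxt in pairwise(urls)
        if host(nxt) == ad_host and host(prev) != ad_host
    ]

visits = [                      # hypothetical browsing history
    "news.example/home",
    "ads.example/click?id=1",
    "shop.example/item",
    "news.example/sports",
]
print(ad_transitions(visits))
# [('news.example/home', 'ads.example/click?id=1')]
```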

    I see the argument for generalizing to a sliding window, although that may lead people to choose inefficient algorithms for a sliding average or median.

    @rhettinger
    Contributor

    New changeset cc061d0 by Raymond Hettinger in branch 'master':
    bpo-38200: Add itertools.pairwise() (GH-23549)
    cc061d0
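
    A brief usage sketch of the resulting function. Although this issue carries a 3.9 label, the itertools documentation lists pairwise() as new in Python 3.10, so the example falls back to the tee-based recipe on older interpreters.

```python
try:
    from itertools import pairwise   # documented as new in Python 3.10
except ImportError:
    from itertools import tee

    def pairwise(iterable):          # fallback: the older documentation recipe
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

print(list(pairwise(range(5))))   # [(0, 1), (1, 2), (2, 3), (3, 4)]
```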

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022