Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structural Pattern Matching (PEP 634) #86294

Closed
brandtbucher opened this issue Oct 23, 2020 · 56 comments
Closed

Structural Pattern Matching (PEP 634) #86294

brandtbucher opened this issue Oct 23, 2020 · 56 comments
Assignees
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@brandtbucher
Copy link
Member

BPO 42128
Nosy @gvanrossum, @rhettinger, @willingc, @dmoisset, @lysnikolaou, @pablogsal, @tirkarthi, @brandtbucher, @isidentical, @gousaiyang, @Fidget-Spinner, @akulakov, @freundTech
PRs
  • bpo-42128: Structural Pattern Matching (PEP 634) #22917
  • bpo-42128: Add what's new section for PEP 634 (pattern matching) #24588
  • bpo-42128: Add documentation for pattern matching (PEP 634) #24664
  • bpo-42128: Add Pattern Matching to What's New #24667
  • bpo-42128: Add documentation for the new match-based AST nodes #24673
  • bpo-42128: Add __match_args__ to structseq-based classes #24732
  • bpo-42128: Add 'missing :' syntax error message to match statements #24733
  • bpo-42128: Update pattern matching docs to match new PEP changes #25185
  • bpo-42128: __match_args__ can't be a list anymore #25203
  • bpo-42128: Improve "Binding of names" doc section #25642
  • bpo-42182: stdtypes doc - update and fix links to several dunder methods  #27384
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brandtbucher'
    closed_at = <Date 2021-04-21.03:24:32.247>
    created_at = <Date 2020-10-23.15:22:20.943>
    labels = ['interpreter-core', 'type-feature', '3.10']
    title = 'Structural Pattern Matching (PEP 634)'
    updated_at = <Date 2021-07-27.04:27:58.725>
    user = 'https://github.com/brandtbucher'

    bugs.python.org fields:

    activity = <Date 2021-07-27.04:27:58.725>
    actor = 'andrei.avk'
    assignee = 'brandtbucher'
    closed = True
    closed_date = <Date 2021-04-21.03:24:32.247>
    closer = 'brandtbucher'
    components = ['Interpreter Core']
    creation = <Date 2020-10-23.15:22:20.943>
    creator = 'brandtbucher'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42128
    keywords = ['patch']
    message_count = 56.0
    messages = ['379442', '379443', '379831', '379833', '379834', '386673', '386702', '386712', '386717', '386720', '386727', '386748', '387362', '387364', '387365', '387369', '387371', '387381', '387384', '387385', '387386', '387389', '387390', '387392', '387409', '387410', '387753', '387769', '387773', '387801', '387815', '387817', '387818', '387819', '387827', '387836', '387968', '388040', '388044', '388046', '388066', '388554', '388555', '388567', '388580', '388583', '388584', '388772', '388775', '388784', '388785', '388790', '388792', '388979', '390270', '391472']
    nosy_count = 14.0
    nosy_names = ['gvanrossum', 'rhettinger', 'willingc', 'Daniel Moisset', 'lys.nikolaou', 'pablogsal', 'xtreak', 'brandtbucher', 'BTaskaya', 'gousaiyang', 'kj', 'andrei.avk', 'laurenjl', 'freundTech']
    pr_nums = ['22917', '24588', '24664', '24667', '24673', '24732', '24733', '25185', '25203', '25642', '27384']
    priority = 'normal'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue42128'
    versions = ['Python 3.10']

    @brandtbucher
    Copy link
    Member Author

    PEP-634 has not yet been accepted, but we'd like to hit the ground running and get this into alphas as soon as it (hopefully) is.

    Several people have volunteered to review the implementation, since it's so huge. Other reviews are very welcome, if anybody has a bit of time to pitch in. This touches tons of stuff: the parser, the compiler, the VM, the builtins, the stdlib, the tests... I'd like as many eyeballs as possible!

    I'll have a draft PR up against master in a few minutes. Let's try to keep all of the review comments there.

    @brandtbucher brandtbucher added the 3.10 only security fixes label Oct 23, 2020
    @brandtbucher brandtbucher self-assigned this Oct 23, 2020
    @brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Oct 23, 2020
    @brandtbucher
    Copy link
    Member Author

    Sorry, just resolving some changes with master. Are you parser people finished breaking my grammar yet? Sheesh. ;)

    @brandtbucher
    Copy link
    Member Author

    Thinking ahead...

    A lot of work has gone into writing these PEPs... we should see how much we can easily convert into actual docs. It seems to me:

    • Parts of PEP-634 and PEP-635 can be worked into the language reference.

    • Guido's overview (the appendix of PEP-636) can probably just be dropped verbatim into the What's New for 3.10.

    • The rest of PEP-636 can just be lifted into the docs as a proper tutorial and linked to from the What's New.

    Also, we need to document the stdlib changes (ast, collections, dataclasses, keyword, ...). The dis module docs are already part of the implementation.

    @gvanrossum
    Copy link
    Member

    If you feel up to it, you might see if you could open a new, separate
    (draft) PR that updates all those docs. (But you could also wait and see if
    someone volunteers. There are some good doc writers active ATM.)

    @brandtbucher
    Copy link
    Member Author

    I'll wait till the SC makes a ruling, then send a message to our docs list (I think we have one)? I'm fine coordinating/reviewing that, or making PRs myself if nobody else steps up.

    @gvanrossum
    Copy link
    Member

    The PEP has been accepted.

    I propose that we use separate PRs for docs (and incremental improvements to the code) but tie them all together using the same issue number (this will require adding "skip news" to later PRs).

    @Fidget-Spinner
    Copy link
    Member

    Guido and Brandt, may I take a stab at the docs for this? I'll probably base most of it off PEP-636's tutorial.

    Relevant parts of the docs that I've identified:
    reference/lexical_analysis.rst
    reference/compound_stmts.rst
    reference/grammar.rst

    # this probably needs a summarized version for non-technical users
    whatsnew/3.10.rst

    If I missed anything, please point them out :).

    @dmoisset
    Copy link
    Mannequin

    dmoisset mannequin commented Feb 9, 2021

    Hey Ken, I was about to take on this myself, but I wouldn't mind a bit of help.

    Your list seems ok, I would add to it a section about PM in the python tutorial (which was explicitly mentioned in the SC acceptance notice).

    Regarding the grammar, I believe that one is autogenerated from the parser, so it's likely that no changes are required there. Correct me if I'm wrong here.

    I'll set up a branch in github and add you as a collaborator so we can prepare a PR. I'll base the branch of Brandt's implementation PR. If you give me your github id I can add you as collaborator so we can work there.

    @Fidget-Spinner
    Copy link
    Member

    Hi Daniel, wow thanks for the offer! My GitHub username is Fidget-Spinner.
    I'm not sure about the grammar file either.
    Oh, thanks for mentioning the tutorial recommendation by the SC, I'll keep that in mind. 

    @lysnikolaou
    Copy link
    Contributor

    The grammar is indeed auto-generated by reading the actual Grammar file, python.gram. Here's the lexer Pablo had developed for doing the syntax-highlighting.

    https://github.com/python/cpython/blob/master/Doc/tools/extensions/peg_highlight.py

    @brandtbucher
    Copy link
    Member Author

    Also, see my msg379831 above. We can't entirely rely on the PEPs, of course, but I think we could still get some decent reuse out of them.

    BTW, has the new docs WG started up yet? I keep hearing about it every once in a while, but I'm not sure if it's fully-formed yet. Could be good to loop them in for review and such, if so.

    +nosy Carol since I'm pretty sure she's involved with that.

    @dmoisset
    Copy link
    Mannequin

    dmoisset mannequin commented Feb 9, 2021

    @ken: I've invited you to the repo. For tracking the WIP, https://github.com/dmoisset/cpython/tree/patma-docs

    Do you want to see if you can base the reference/compound_stmts.rst changes in PEP-634? I'll try to fit a variant of the Appendix A of PEP-636 into the tutorial and make something up for the reference/lexical_analysis.rst (which needs a section of soft keywords which are not in the PEP trilogy. Perhaps I can rescue something from PEP-622)

    @brandtbucher
    Copy link
    Member Author

    To the folks working on docs:

    Does it seem realistic to have something ready by the next alpha (March 1st)? I'd like to at least have a What's New entry and a rough draft tutorial by then, since we'll probably (hopefully?) have a bunch of users jumping on that release.

    I can help if needed, since it looks like I'm mostly waiting on more PR review through the weekend. Or I can stay out of the way. :)

    @gvanrossum
    Copy link
    Member

    I don't know who's really in charge of the docs. I suppose the PEP authors
    are, collectively, but that runs the risk that we're all expecting the
    person to our left in the circle to do it.

    Would people be okay if I added the tutorial from Appendix A of PEP-636 to
    Doc/whatsnew/3.10.rst? It's a bit shy of 200 lines, which is bigger than
    the largest section there currently ("Parenthesized Context Managers" is
    about 50 lines) but feels not unreasonable for something that's been the
    subject of at least 7 PEPs and reviewed by two consecutive Steering
    Councils. :-) The full whatsnew.rst is currently around 1000 lines. FWIW
    the longest what's new ever was for 2.7, at 3300 lines. 3.8 was 2200 lines.

    Okay, I'll just add the PR and we can take it from there.

    @brandtbucher
    Copy link
    Member Author

    Would people be okay if I added the tutorial from Appendix A of PEP-636 to Doc/whatsnew/3.10.rst?

    Yes please!

    It's not a huge deal, but I vote that we either drop or rework the "http_error" examples. I think it gives people a very wrong first impression that makes the rest of the behavior quite surprising.

    Can it be changed to build off of something familiar, like unpacking? I like 636's approach much more.

    @gvanrossum
    Copy link
    Member

    It's not a huge deal, but I vote that we either drop or rework the "http_error" examples. I think it gives people a very wrong first impression that makes the rest of the behavior quite surprising.

    Can it be changed to build off of something familiar, like unpacking? I like 636's approach much more.

    So 636 Appendix A is identical to the tutorial in the README of the patma repo. It uses http errors for the very first example only, which introduces literal matching. The main text of 636 is too long for what's new IMO, and also a bit unfinished. It is meant as a gentle intro. The Appendix is meant as an intro to pattern matching for experienced Python users (the kind of people for whom the what's new series of documents is written). This tutorial takes the user from the simplest forms of patterns (literals) via more complex ones (tuples, classes) to advanced concepts like nesting patterns, and then just lists a whole bunch of other features. I started with http errors because 404 must be the world's famous number by now, well before pi or even 0 and 1. :-)

    See for yourself: #24588

    @brandtbucher
    Copy link
    Member Author

    I understand. I would just like to see something that won't give new Python pattern-matching users (read: everybody) the very painful first impression that this is a switch. Can we rework it like:

    match input().split():
    case []:
    print("Got nothing!")
    case [first]:
    print(f"Got one word: {first}")
    case [first, last]:
    print(f"Got two words: {first} and {last}")
    case _:
    print("Got more than two words!")

    Or something? (Pardon the example, I don't write many tutorials...)

    I've seen too many knee-jerk reactions over the past weeks along the lines of "the new switch feature can't handle named constants!". My hope is something like the above might provide a more accurate, informative intro.

    @gvanrossum
    Copy link
    Member

    Let's continue this at #24588

    @laurenjl
    Copy link
    Mannequin

    laurenjl mannequin commented Feb 20, 2021

    I've seen too many knee-jerk reactions over the past weeks along the lines of "the new switch feature can't handle named constants!".

    Here are some that I found interesting:

    https://twitter.com/jakevdp/status/1359870794877132810

    https://twitter.com/brandon_rhodes/status/1360226108399099909

    @gvanrossum
    Copy link
    Member

    Okay, I see. Clearly we should have kept the "DWIM" option, aka "If it starts with a capital letter, it's a value pattern." :-)

    @gvanrossum
    Copy link
    Member

    But seriously, nothing good can come from social media. It's what drove me to retire in 2018. Just ignore it please.

    @willingc
    Copy link
    Contributor

    @brandt Bucher <brandtbucher@gmail.com> @guido van Rossum
    <guido@python.org> I'm
    finally getting some free time to get the docs workgroup up and running.
    I'm happy to help with any docs that you want for alpha and beyond.

    On Fri, Feb 19, 2021 at 8:27 PM Guido van Rossum <report@bugs.python.org>
    wrote:

    Guido van Rossum <guido@python.org> added the comment:

    But seriously, nothing good can come from social media. It's what drove me
    to retire in 2018. Just ignore it please.

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue42128\>



    Python-bugs-list mailing list
    Unsubscribe:
    https://mail.python.org/mailman/options/python-bugs-list/willingc%40gmail.com

    @gvanrossum
    Copy link
    Member

    Carol, the most urgent thing we have going is to come up with text for
    what's new. I posted a PR that adds my standard "quick" tutorial (
    #24588) which is also found in
    Appendix A of PEP-636 (the tutorial PEP -- the specification PEP-634 has a
    different Appendix A :-), but we are worried that this is going to mislead
    people into thinking "Oh, this is a switch with a funny name" -- which is
    exactly what we don't want (see Jake vd Plas Tweet at
    https://twitter.com/jakevdp/status/1359870794877132810). Having stared at
    this for way too long already I think I'm not able to come up with a better
    way to present this *quickly* in a format that's appropriate for What's New
    (concise, highlights only, meant for existing fairly experienced Python
    users). Do you think you can help?

    @willingc
    Copy link
    Contributor

    Absolutely, I can help do that.

    On Fri, Feb 19, 2021 at 9:14 PM Guido van Rossum <report@bugs.python.org>
    wrote:

    Guido van Rossum <guido@python.org> added the comment:

    Carol, the most urgent thing we have going is to come up with text for
    what's new. I posted a PR that adds my standard "quick" tutorial (
    #24588) which is also found in
    Appendix A of PEP-636 (the tutorial PEP -- the specification PEP-634 has a
    different Appendix A :-), but we are worried that this is going to mislead
    people into thinking "Oh, this is a switch with a funny name" -- which is
    exactly what we don't want (see Jake vd Plas Tweet at
    https://twitter.com/jakevdp/status/1359870794877132810). Having stared at
    this for way too long already I think I'm not able to come up with a better
    way to present this *quickly* in a format that's appropriate for What's New
    (concise, highlights only, meant for existing fairly experienced Python
    users). Do you think you can help?

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue42128\>



    Python-bugs-list mailing list
    Unsubscribe:
    https://mail.python.org/mailman/options/python-bugs-list/willingc%40gmail.com

    @dmoisset
    Copy link
    Mannequin

    dmoisset mannequin commented Feb 20, 2021

    Fyi, I'm looking into the documentation. I think I can have something basic for the middle of this week, so it should be ok for March 1st.

    @dmoisset
    Copy link
    Mannequin

    dmoisset mannequin commented Feb 20, 2021

    Note: I will NOT write something for "What's New in 3.10" seeing that's in progress

    @dmoisset
    Copy link
    Mannequin

    dmoisset mannequin commented Feb 27, 2021

    Thanks @XTreak, I'll make some changes in these sections too.

    The docs are coming along well, there's still some refinement to do in the compound statements section (it's there, but looks to big), I'll submit a draft PR during the weekend so interested people can review

    @willingc
    Copy link
    Contributor

    Folks, The What's New PR is open now. I've tried to focus more on the data type/shape examples over literal example.

    @brandtbucher
    Copy link
    Member Author

    @BTaskaya, do you think you'll have time to open a PR with your AST validator this weekend? It looks good to me (assuming tests pass).

    Also, we should add the AST docs to our documentation to-do list (should be just adding entries for ast.Match, ast.MatchAs, ast.MatchOr, and ast.match_case).

    I'll try to find time to review the open doc PRs today.

    @isidentical
    Copy link
    Sponsor Member

    @BTaskaya, do you think you'll have time to open a PR with your AST validator this weekend? It looks good to me (assuming tests pass).

    Unfortunately not. I believe it still lacks tests for invalid cases, but other than that should work. If you'd like to take it on feel free, if not I'll create a PR next weekend with tests (probably after release, though I believe it is not a blocker as is).

    @pablogsal
    Copy link
    Member

    I can pick up the AST docs PR

    @brandtbucher
    Copy link
    Member Author

    Thanks Pablo!

    If you'd like to take it on feel free, if not I'll create a PR next weekend with tests (probably after release, though I believe it is not a blocker as is).

    No problem, I'm pretty busy too. Next weekend is fine.

    @willingc
    Copy link
    Contributor

    willingc commented Mar 1, 2021

    New changeset a8e2615 by Pablo Galindo in branch 'master':
    bpo-42128: Add documentation for the new match-based AST nodes (GH-24673)
    a8e2615

    @gvanrossum
    Copy link
    Member

    New changeset a22bca6 by Daniel F Moisset in branch 'master':
    bpo-42128: Add documentation for pattern matching (PEP-634) (bpo-24664)
    a22bca6

    @brandtbucher
    Copy link
    Member Author

    I'm currently working on some performance benchmarks for PEP-634:

    https://github.com/brandtbucher/patmaperformance

    Hopefully they will help inform future improvements. I already have benchmarks for class patterns and mapping patterns, and am still searching for a good problem that really stresses all of the different variations of sequence patterns. Maybe a generalized Tower of Hanoi or something?

    @rhettinger
    Copy link
    Contributor

    Should __match_args__ be added to Objects/structseq.c ?

    @brandtbucher
    Copy link
    Member Author

    Yeah, probably.

    I'm not too familiar with the design of those objects... would it make more sense to have the implementation be a single descriptor shared by all derived types, or should we precompute the tuple of strings when each new type is defined?

    I'm also not entirely sure how "unnamed fields" work, or what they're used for. Presumably we'd want to end the __match_args__ tuple upon hitting one of these?

    @pablogsal
    Copy link
    Member

    Opened #24732

    I'm also not entirely sure how "unnamed fields" work, or what they're used for. Presumably, we'd want to end the __match_args__ tuple upon hitting one of these?

    AFAIK, they should be just excluded from __match_args__

    @pablogsal
    Copy link
    Member

    New changeset 0632b10 by Pablo Galindo in branch 'master':
    bpo-42128: Add __match_args__ to structseq-based classes (GH-24732)
    0632b10

    @freundTech
    Copy link
    Mannequin

    freundTech mannequin commented Mar 12, 2021

    For the last few days I've been working with pattern matching and it's ast for a bit, while trying to add support for it to mypy.

    During this I noticed an inconsistency in the ast:

    ast.MatchAs has an attribute name which is of type identifier (in C) and type str (in python).

    This seams really inconsistent with the rest of the ast, where identifiers are always wrapped in a ast.Name object. The only other exception to this is ast.Attribute.

    Judging from Grammar/python.gram this seems deliberate:

    as_pattern[expr_ty]:
    | pattern=or_pattern 'as' target=capture_pattern {
    _Py_MatchAs(pattern, target->v.Name.id, EXTRA) }

    Could someone shed some light on why MatchAs directly references an identifier instead of an ast.Name?

    @isidentical
    Copy link
    Sponsor Member

    This seams really inconsistent with the rest of the ast, where identifiers are always wrapped in a ast.Name object. The only other exception to this is ast.Attribute.

    import ... as <identifier>
    from ... import ... as <identifier>
    try:
       ...
    except ... as <identifier>:
       ...

    global <identifier>
    nonlocal <identifier>
    x(<identifier>=...)

    All <identifier>s above are represented as strings. If <identifier> is guaranteed to be a single name, then it makes sense to just use an identifier instead of wrapping it with Name(). The reason that other stuff like "with ... as <expr>" uses <expr> instead of <identifier> is that you can use extended assignment targets (such as unpacking a tuple) over there.

    This rule applies to all other stuff, except for NamedExprs. Which I believe it is because the restriction might lift in the future, but now they are limited to only keep Name()s.

    I don't personally know why Brandt choosed to use identifier, though I'm a +1 on the current approach.

    @freundTech
    Copy link
    Mannequin

    freundTech mannequin commented Mar 12, 2021

    Thanks for the response. Looks like I overlooked the imports, global, nonlocal, ... because I only searched for usages of identifier, but they use lists of identifiers.

    In that case I agree that it isn't inconsistent.

    @brandtbucher
    Copy link
    Member Author

    Not a much thought went into that decision, honestly.

    The main reason I chose an identifier there was because "as" always stores a simple name, so allowing it to potentially be other contexts or expressions (which are illegal) just seemed to create complexity. I saw we used simple identifiers in many similar places and just went with it.

    (As a related note, if mapping patterns had their own AST nodes, I would probably choose to represent "**rest" as an optional identifier, rather than a keyless value or optional Name.)

    If it causes problems for clients of the AST, it's easy to change. But I sort of like it a bit better how it is now.

    @gvanrossum
    Copy link
    Member

    This does point out that the ast structure is a liability, and we won't be
    able to easily change it, since mypy and IIRC many linters use the ast
    module. So I think that maybe we need to think more carefully about what
    the *proper* ast structure is, rather than what was most convenient for the
    initial implementation. (Brandt, I know you are reluctant to change this,
    and I don't blame you, but nevertheless I think we should at least think
    about this some more so we're not stuck with a sub-optimal "default"
    solution. We've already won the war. Let's not lose the battle. :-)

    @brandtbucher
    Copy link
    Member Author

    Yup, I agree we should explore this area further. See my comment here on Adrian’s mypy PR:

    python/mypy#10191 (comment)

    @brandtbucher
    Copy link
    Member Author

    @BTaskaya, do you have any interest in helping me iterate on new AST nodes?

    @isidentical
    Copy link
    Sponsor Member

    @BTaskaya, do you have any interest in helping me iterate on new AST nodes?

    Sure!

    @brandtbucher
    Copy link
    Member Author

    Cool. Today or tomorrow I'll create an issue in Guido's patma repo and we can move our discussion there. I'll also create a branch and add you to my fork.

    @brandtbucher
    Copy link
    Member Author

    Also, given that pattern matching is now shipped and documented, perhaps this issue should be closed? At this point I think dedicated issues are probably better for any new bugs/enhancements/etc.

    @gvanrossum
    Copy link
    Member

    Yeah, feel free to close this issue.

    @brandtbucher
    Copy link
    Member Author

    Actually, I didn't see that there are still 2 open linked PRs. I'll wait until those are merged.

    @pablogsal
    Copy link
    Member

    New changeset 08fb8ac by Pablo Galindo in branch 'master':
    bpo-42128: Add 'missing :' syntax error message to match statements (GH-24733)
    08fb8ac

    @brandtbucher
    Copy link
    Member Author

    New changeset f84d5a1 by Brandt Bucher in branch 'master':
    bpo-42128: __match_args__ can't be a list anymore (GH-25203)
    f84d5a1

    @isidentical
    Copy link
    Sponsor Member

    Actually, I didn't see that there are still 2 open linked PRs. I'll wait until those are merged.

    I've moved the AST validator to its own separate issue so feel free to close this Brandt!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    9 participants