Make lib2to3 grammar better match Python, support the := walrus #80722

Closed
thatch mannequin opened this issue Apr 6, 2019 · 21 comments
Assignees
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes topic-2to3 type-bug An unexpected behavior, bug, or error

Comments

@thatch
Mannequin

thatch mannequin commented Apr 6, 2019

BPO 36541
Nosy @birkenfeld, @gpshead, @benjaminp, @thatch, @ambv, @fireattack, @lisroach, @pablogsal, @miss-islington, @isidentical
PRs
  • bpo-36541: lib2to3: Support named assignment expressions #12702
  • bpo-36541: lib2to3: Support complex expressions in *args and **kwargs. #12703
  • [3.8] bpo-36541: lib2to3: Support named assignment expressions (GH-12702) #19315
  • [3.7] bpo-36541: lib2to3: Support named assignment expressions (GH-12702) #19317
  • bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing #23759
  • [3.9] bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing (GH-23759) #23768
  • [3.8] bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing (GH-23759) #23769
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2020-12-14.18:13:48.449>
    created_at = <Date 2019-04-06.01:41:20.878>
    labels = ['3.8', 'type-bug', 'expert-2to3', '3.9', '3.10']
    title = 'Make lib2to3 grammar better match Python, support the := walrus'
    updated_at = <Date 2020-12-14.18:13:48.449>
    user = 'https://github.com/thatch'

    bugs.python.org fields:

    activity = <Date 2020-12-14.18:13:48.449>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2020-12-14.18:13:48.449>
    closer = 'gregory.p.smith'
    components = ['2to3 (2.x to 3.x conversion tool)']
    creation = <Date 2019-04-06.01:41:20.878>
    creator = 'thatch'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36541
    keywords = ['patch']
    message_count = 21.0
    messages = ['339522', '339669', '339796', '340791', '340802', '340848', '355108', '355252', '365716', '365717', '365718', '379062', '382516', '382594', '382606', '382608', '382609', '382962', '382994', '382996', '382997']
    nosy_count = 11.0
    nosy_names = ['georg.brandl', 'gregory.p.smith', 'benjamin.peterson', 'thatch', 'lukasz.langa', 'fireattack', 'lisroach', 'pablogsal', 'miss-islington', 'BTaskaya', 'Peter Ludemann']
    pr_nums = ['12702', '12703', '19315', '19317', '23759', '23768', '23769']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36541'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @thatch
    Mannequin Author

    thatch mannequin commented Apr 6, 2019

    The grammar in lib2to3 is out of date and can't parse := or f(**not x); I found this by running it on real code. I've done a cursory diff -uw [Grammar/Grammar](https://github.com/python/cpython/blob/main/Grammar/Grammar) [Lib/lib2to3/grammar.txt](https://github.com/python/cpython/blob/main/Lib/lib2to3/grammar.txt), and would like to fix lib2to3 so we can merge into both fissix and blib2to3, to avoid further divergence of the forks.

    I'm unsure if I need a separate bug per pull request, but need at least one to get started.
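
For reference, a minimal sketch of the two constructs in question, checked against CPython's own parser through ast.parse rather than lib2to3. It assumes Python 3.8 or later, and the names `data`, `f`, and `x` are made up for illustration; it only shows that both forms are valid Python that a lib2to3-style grammar would need to accept.

```python
import ast

# Two constructs the issue reports lib2to3's grammar rejecting.
# (Illustrative snippets; the identifiers are hypothetical.)
walrus = ast.parse("if (n := len(data)) > 10: pass")
double_star = ast.parse("f(**not x)")

# The walrus shows up as a NamedExpr node in the AST.
named = [node for node in ast.walk(walrus) if isinstance(node, ast.NamedExpr)]

# `**not x` is a keyword entry with no name whose value is a unary `not`.
call = double_star.body[0].value
unpacked = call.keywords[0]

print(type(named[0]).__name__)
print(unpacked.arg, type(unpacked.value).__name__)
```

The point is only that the reference grammar treats whatever follows `**` as a full expression, which is where an older, stricter grammar rule would diverge.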

    @thatch thatch mannequin added 3.7 (EOL) end of life 3.8 only security fixes topic-2to3 type-bug An unexpected behavior, bug, or error labels Apr 6, 2019
    @thatch
    Mannequin Author

    thatch mannequin commented Apr 8, 2019

    jreese reminded me of PEP-570, which will make more grammar changes. I'm open to the idea of replacing the grammar with the live one, plus porting the 2isms forward like print, eval, except with comma.

    My sincere hope is that everyone that depends on this structure will have tests (mine and lib2to3 do); the only big user I'm aware of is probably libfuturize. Definitely worth a changelog entry if this is the way forward.

    @thatch
    Mannequin Author

    thatch mannequin commented Apr 9, 2019

    Here's approximately what it would look like to do the big change now: master...thatch:lib2to3-update-grammar (one test failing, and some helpers may need more test coverage)

    @lisroach
    Contributor

    I agree we should get lib2to3 up to date.

    Looks like for *args and **kwargs there is bpo-33348 (this has a PR) and bpo-32496 (no PR) and related closed bpo-24791 and bpo-24176.

    Adding := seems straightforward to me. As for the big change, maybe @benjamin.peterson would be interested in commenting?

    @pablogsal
    Member

    For the changes of PEP-570, please wait until I merge the implementation to do the grammar changes in lib2to3 for that.

    @thatch
    Mannequin Author

    thatch mannequin commented Apr 25, 2019

    My strong preference would be getting the lib2to3 grammar to be the Python grammar + additions, to make future changes easier to merge. The strongest argument against doing that is the backwards-incompatibility of patterns (some won't compile, while others will compile but do something unexpected).

    It's good to hear (or at least infer) that parsing modern code is also a goal of lib2to3.

    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Oct 21, 2019

    Re: breakage due to changes in structure (https://bugs.python.org/issue36541#msg339669) ... this has already happened in the past (e.g., type annotations and async).

    It's probably a good idea to add some documentation that structure changes can be expected with each release of Python.

    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Oct 23, 2019

    Also the Grammar.txt diffs look about the same size as I've seen with other upgrades to lib2to3 when the Python grammar changed.

    @gpshead gpshead added the 3.9 only security fixes label Oct 24, 2019
    @gpshead gpshead self-assigned this Oct 24, 2019
    @gpshead gpshead changed the title from "Make lib2to3 grammar more closely match Python" to "Make lib2to3 grammar better match Python, support the := walrus" Oct 24, 2019
    @gpshead
    Member

    gpshead commented Apr 3, 2020

    New changeset 96c5f5a by Tim Hatch in branch '3.7':
    [3.7] bpo-36541: lib2to3: Support named assignment expressions (GH-12702) (GH-19317)
    96c5f5a

    @gpshead
    Member

    gpshead commented Apr 3, 2020

    master/3.9 changeset:
    3c3aa45

    3.8 changeset: 1098671

    @gpshead
    Member

    gpshead commented Apr 3, 2020

    Support for := is in. Are we still lacking f(**not x) support?

    @gpshead
    Member

    gpshead commented Oct 19, 2020

    Parsing support for f(**mapping) is indeed still missing.

    as lib2to3 is pending deprecation at this point, i'm not going to work on this. anyone is welcome to pick it up. modifying the lib2to3 grammar, and any related code, and adding a test is what's needed to parse that syntax.

    @gpshead gpshead added 3.10 only security fixes and removed 3.7 (EOL) end of life labels Oct 19, 2020
    @gpshead gpshead removed their assignment Oct 19, 2020
    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Dec 4, 2020

    I made a suggestion for augmenting ast.parse with some of lib2to3's features; but nobody seemed interested.

    RIP lib2to3. Like many pieces of software, it was used for far more than what it was originally intended for.

    https://mail.python.org/archives/list/python-ideas@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/

    @isidentical
    Sponsor Member

    I don't see the point of augmenting ast.parse, since we already have several proper CST implementations outside core Python, such as github.com/davidhalter/parso/ or LibCST.

    Also, for basic refactorings, it is easy to use tokens for the refactoring and the AST for the analysis. Even ast.unparse() can be partially used (for example: find the related segment of the code through AST analysis, build the corresponding variant, unparse it, find the region of related tokens in the source code, and replace them). There are also quite a few libraries (or wrappers) for using tokenize for different purposes, such as https://github.com/asottile/tokenize-rt or github.com/isidentical/brm.
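
The "AST for analysis, tokens for the edit" approach described above can be sketched roughly as follows, using only the standard library. This is an illustration under stated assumptions, not how tokenize-rt or brm work internally: the helper name `rename_calls` is hypothetical, and the sketch assumes ASCII source so that the AST's byte-based column offsets line up with tokenize's character-based columns.

```python
import ast
import io
import tokenize


def rename_calls(source: str, old: str, new: str) -> str:
    """Rename bare-name calls of `old` to `new`, leaving other text untouched.

    Hypothetical helper for illustration only.
    """
    # Analysis step: use the AST to find where `old` is called as a bare name.
    targets = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == old
        ):
            # Assumption: ASCII source, so byte offsets equal character offsets.
            targets.add((node.func.lineno, node.func.col_offset))

    # Refactoring step: splice the new name in at the matching NAME tokens.
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk backwards so earlier columns on the same line stay valid.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and tok.string == old and tok.start in targets:
            row, col = tok.start
            line = lines[row - 1]
            lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


example = "x = old(1)  # keep old comment\ny = other.old(2)\nold = 3\n"
result = rename_calls(example, "old", "new")
print(result)
```

Because only the matched token's text is replaced, comments and spacing survive unchanged, which is the property people relied on lib2to3 for.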

    @PeterLudemann
    Mannequin

    PeterLudemann mannequin commented Dec 6, 2020

    Every piece of code that uses either lib2to3 or a parser derived from it (including parso and LibCST) will eventually not be able to upgrade the parser because PEG can handle grammars that LL(k) can't. That's why I proposed adding some functionality to ast.parse, to make the whitespace and token information easily available - this seems to be what @BTaskaya says is "easy" (maybe they mean it's easy using LibCST? It seems to be fiddly using ast.parse). The alternative is that all these projects (black, LibCST, yapf, etc.) will have to roll their own solutions, which doesn't seem a very productive use of people's time and makes version upgrades slow.

    If people are interested in using ast.parse extensions as a replacement for lib2to3, I suggest discussing at https://mail.python.org/archives/list/python-ideas@python.org/thread/X2HJ6I6XLIGRZDB27HRHIVQC3RXNZAY4/

    @isidentical
    Sponsor Member

    > Every piece of code that uses either lib2to3 or a parser derived from it (including parso and LibCST) will eventually not be able to upgrade the parser because PEG can handle grammars that LL(k) can't.

    Since these projects are external, depending on the functionality they need, they are free to roll their own parser implementations or hack around the gaps. Or they can fork Grammar/python.gram to preserve all tokens and generate a Python parser from it.

    > If people are interested in using ast.parse extensions as a replacement for lib2to3, I suggest discussing at

    I don't quite get what you are proposing here,

    > I propose implementing an optional pass over the parse tree that records lib2to3's "prefix" with each leaf node. The interface would be something like:

    How would you do that? By augmenting the AST with the information retrieved from tokens? If so, check this out; https://github.com/leo-editor/leo-editor/blob/master/leo/core/leoAst.py and asttokens.

    Also, please move the discussion somewhere else (like discuss.python.org), since this is not the ideal place to discuss it and people might be distracted! (Feel free to cc me wherever you move the discussion.)

    @isidentical
    Sponsor Member

    > Parsing support for f(**mapping) is indeed still missing.
    >
    > as lib2to3 is pending deprecation at this point, i'm not going to work on this. anyone is welcome to pick it up. modifying the lib2to3 grammar, and any related code, and adding a test is what's needed to parse that syntax.

    I also agree, and I don't support adding features from now on. If someone really needs this to be added [and backported], please re-open the issue.

    @gpshead
    Member

    gpshead commented Dec 14, 2020

    While I said I didn't care... and don't really want to... I found a reason to at least not omit PEP-570 positional-only arg parsing support, given that things like yapf still use it rather than forking their own copy. PR testing.
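
As a reminder of what the PEP-570 syntax looks like, here is a small sketch that parses a positional-only signature with CPython's own parser via ast.parse (not lib2to3). It assumes Python 3.8 or later, and the function and parameter names are made up for illustration.

```python
import ast

# Hypothetical signature: `a` and `b` are positional-only (before the `/`),
# `c` is a regular parameter, and `d` is keyword-only (after the `*`).
source = "def f(a, b, /, c, *, d): pass\n"
func = ast.parse(source).body[0]

posonly = [arg.arg for arg in func.args.posonlyargs]
regular = [arg.arg for arg in func.args.args]
kwonly = [arg.arg for arg in func.args.kwonlyargs]

print(posonly, regular, kwonly)
```

The bare `/` in the parameter list is the new token sequence a lib2to3-style grammar has to accept for tools built on it to read such code.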

    @gpshead gpshead reopened this Dec 14, 2020
    @gpshead gpshead self-assigned this Dec 14, 2020
    @gpshead
    Member

    gpshead commented Dec 14, 2020

    New changeset 42c9f0f by Gregory P. Smith in branch 'master':
    bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing (GH-23759)
    42c9f0f

    @miss-islington
    Contributor

    New changeset 06bfd03 by Miss Islington (bot) in branch '3.8':
    bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing (GH-23759)
    06bfd03

    @miss-islington
    Contributor

    New changeset 20bc40e by Miss Islington (bot) in branch '3.9':
    bpo-36541: Add lib2to3 grammar PEP-570 pos-only arg parsing (GH-23759)
    20bc40e

    @gpshead gpshead closed this as completed Dec 14, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022