Author gvanrossum
Recipients BTaskaya, Peter Ludemann, carljm, corona10, eric.snow, gregory.p.smith, gvanrossum, hroncok, vstinner
Date 2020-07-06.23:55:00
Message-id <1594079700.84.0.0316466894372.issue40360@roundup.psfhosted.org>
Content
There's no python-dev discussion; if you want more feedback I recommend starting on python-ideas first (on either forum you may expect pushback because this is not about a proposed change to Python or its workflow).

The Lib/ast.py module will continue to be the official API for the standard AST. It is a simple wrapper around the builtin parser (at least in CPython -- I don't actually know to what extent other Python implementations support it, but they certainly *could*). And in 3.9 and later the AST is already being produced using the *new* parser.
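To illustrate how thin that wrapper is from the caller's side, here is a minimal sketch using only the documented `ast.parse` and `ast.dump` functions. The sample source string is made up for the example, and the exact text that `ast.dump` prints varies between Python versions.

```python
import ast

# Hypothetical one-line program, used only for illustration.
source = "x = 1 + 2"

# ast.parse hands the source to the builtin parser and returns the tree.
tree = ast.parse(source)

# The first statement is an assignment whose value is a binary operation.
assign = tree.body[0]
print(type(tree).__name__, type(assign).__name__, type(assign.value).__name__)

# The dump format differs across Python versions, so no output is shown here.
print(ast.dump(assign.value))
```

Calling code like this never sees which parser produced the tree, which is why the switch to the new parser in 3.9 did not need an API change.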

We want to deprecate lib2to3 because nobody is interested in maintaining it. Having it in the stdlib, with its strict backwards compatibility requirements, makes it difficult to do a good job at updating it. This is why it's been forked repeatedly -- once forked, the owner of the fork can make changes easily, preserving the API perfectly (if so desired) and maintaining compatibility with older Python versions.

My own thoughts are that libraries like LibCST and parso have two sides: an API for the AST, and a way to parse source code into an AST. Usually the parsing API is incredibly simple -- e.g. a function to parse a file and another function to parse a string. And there's no reason for the AST API to change just because the parsing algorithm has changed.
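A rough sketch of that separation, not taken from LibCST or parso: the names `parse_string`, `parse_file`, `Backend` and `_backend` are hypothetical, and the stdlib `ast.parse` stands in for whatever parser a library actually uses.

```python
import ast
from pathlib import Path
from typing import Callable

# Hypothetical: a backend is any callable that turns source text into a tree.
Backend = Callable[[str], ast.AST]

# Stand-in backend; a library could point this at a PEG-based parser instead.
_backend: Backend = ast.parse


def parse_string(source: str) -> ast.AST:
    """Parse source text into a tree using the configured backend."""
    return _backend(source)


def parse_file(path: str) -> ast.AST:
    """Read a file as UTF-8 (an assumption of this sketch) and parse it."""
    return parse_string(Path(path).read_text(encoding="utf-8"))


tree = parse_string("def f():\n    return 42\n")
print(type(tree).__name__, type(tree.body[0]).__name__)
```

Only `_backend` would change if the parsing algorithm were replaced; code that walks the returned tree keeps working as long as the node classes stay the same.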

Finally, we already have a (rough) Python implementation of the PEG parser too -- in fact it's included in Tools/peg_generator (and used to regenerate the metaparser). This reads the same grammar format (i.e. Grammar/python.gram) and generates Python code instead of C code to do the parsing. It's easy to retarget the tokenizer of the generated Python code.

So a decent way forward might be to pick one of the 3rd party libraries (perhaps parso, which is itself a fork of lib2to3 and what LibCST builds on) and update its parser to use a PEG parser generated using the PEG generator from Tools/peg_generator (which people are welcome to fork).

This might be a summer-of-code-sized project.
History
Date User Action Args
2020-07-06 23:55:00 gvanrossum set recipients: + gvanrossum, gregory.p.smith, vstinner, carljm, eric.snow, hroncok, corona10, BTaskaya, Peter Ludemann
2020-07-06 23:55:00 gvanrossum set messageid: <1594079700.84.0.0316466894372.issue40360@roundup.psfhosted.org>
2020-07-06 23:55:00 gvanrossum link issue40360 messages
2020-07-06 23:55:00 gvanrossum create