Author pablogsal
Recipients A. Skrobov, benjamin.peterson, berker.peksag, brett.cannon, fdrake, giampaolo.rodola, gregory.p.smith, pablogsal, python-dev, serhiy.storchaka, xcombelle
Date 2019-03-11.19:36:17
Message-id <1552332977.83.0.32150825067.issue36256@roundup.psfhosted.org>
Content
> I would be curious to hear what Pablo has to say with the new parser having landed and if there's something we should be exposing from that code to replace what's in 'parser' today? (Either w/o semantic change or a new API.)

:)

One small clarification: the parser is the same; what has changed is the parser generator. What the parser module exposes today is the parse tree (in a very raw form).

One thing we can do is expose the parser component that lib2to3/pgen2 has as a substitute for, or complement to, the parser module (that component is not exposed as part of the new pgen; I know, it is confusing). It is very useful and complementary to the AST: for example, black uses a forked version of this component to obtain the CST, because it can round-trip (code->CST->NEW_CST->code). This piece is written in pure Python and can read the parser tables that pgen generates. Exposing it would also have the advantage of forcing us to keep it synchronized with the current grammar (black had to fork it, among other reasons, because the one in lib2to3 was out of date). This idea and all its challenges have already been discussed here:

https://bugs.python.org/issue33337
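
To make the round trip concrete, here is a minimal sketch of code->CST->NEW_CST->code using the lib2to3 driver as it ships in the standard library (not black's fork). The helper name `rename_and_round_trip` is hypothetical, and the imports live inside the function because lib2to3 is deprecated and missing from newer Python versions:

```python
def rename_and_round_trip(source, old_name, new_name):
    """Parse source into a lib2to3 CST, rename one identifier, and unparse.

    Hypothetical helper for illustration only. `source` must end with a
    newline, because the pgen2 driver expects complete logical lines.
    """
    # Imported lazily: lib2to3 is deprecated and absent from newer Pythons.
    from lib2to3 import pygram, pytree
    from lib2to3.pgen2 import driver, token

    # Use the grammar variant where print is a function, not a statement.
    grammar = pygram.python_grammar_no_print_statement
    cst_driver = driver.Driver(grammar, convert=pytree.convert)

    # code -> CST: comments and whitespace are kept in each leaf's prefix.
    tree = cst_driver.parse_string(source)

    # CST -> NEW_CST: edit leaves in place; prefixes are left untouched.
    for leaf in tree.leaves():
        if leaf.type == token.NAME and leaf.value == old_name:
            leaf.value = new_name

    # NEW_CST -> code: str() concatenates every prefix and value.
    return str(tree)
```

With an unchanged tree, `str(tree)` gives back the input text exactly, comments and spacing included. That is the property the AST cannot offer.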

The major problem with the parser module is that it is out of sync with the actual parser, has a very raw API, and is somewhat unmaintained (as this bug reveals). We would need to decide whether we want to spend effort on synchronizing them, deprecate the parser module completely, replace it with a new pure-Python version, or wait until we have a completely new non-LL(1) C parser before asking these questions.

What do you think?

As a side note, the problem described in this bug was more or less foreseen. This is in the header of Modules/parsermodule.c:

*  To debug parser errors like
*      "parser.ParserError: Expected node type 12, got 333."
*  decode symbol numbers using the automatically-generated files
*  Lib/symbol.h and Include/token.h.
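
From Python, the decoding that comment asks for can be sketched roughly as below. The helper `describe_node_type` is hypothetical. It assumes that numbers below `token.NT_OFFSET` are tokens and everything else is a grammar symbol. The `symbol` module only exists in older Python versions, so the sketch falls back to a placeholder when it is missing:

```python
import token

try:
    # Maps nonterminal numbers to grammar rule names; gone in newer Pythons.
    import symbol
    SYM_NAME = symbol.sym_name
except ImportError:
    SYM_NAME = {}

# Nonterminal numbers start at this offset (256 in CPython).
NT_OFFSET = getattr(token, "NT_OFFSET", 256)


def describe_node_type(number):
    """Return a readable name for a parse-tree node type number.

    Hypothetical helper: tokens are looked up in token.tok_name and
    nonterminals in symbol.sym_name (when that module is available).
    """
    if number < NT_OFFSET:
        return token.tok_name.get(number, "<unknown token %d>" % number)
    return SYM_NAME.get(number, "<nonterminal %d>" % number)


for node_type in (12, 333):
    print(node_type, describe_node_type(node_type))
```

The name printed for a nonterminal such as 333 depends on the grammar of the running interpreter, which is exactly why the numbers in these error messages are hard to read by themselves.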