
Author BTaskaya
Recipients BTaskaya, pablogsal, pfalcon, serhiy.storchaka
Date 2020-12-24.10:34:15
Message-id <1608806055.68.0.139746238425.issue42729@roundup.psfhosted.org>
In-reply-to
Content
> I propose to close that gap, and establish an API which would allow to parse token stream (iterable) into an AST. An initial implementation for CPython can (and likely should) be naive, making a loop thru surface program representation. 

There are different aspects to this problem (such as the maintenance cost of either exposing the underlying tokenizer, or building something like Python-ast.c to convert these two different token types back and forth; I'm a big -1 on both), but the thing I don't quite get is the use case.

What prevents you from using ast.parse(tokenize.untokenize(token_stream))? It is guaranteed that you won't miss anything (in terms of token positions), since it round-trips almost every case.
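A minimal sketch of that round trip, using only the stdlib tokenize and ast modules (the sample source string is illustrative):

```python
import ast
import io
import tokenize

source = "x = 1 + 2\n"

# tokenize.generate_tokens expects a readline callable, so wrap the
# source in a StringIO.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Round-trip: untokenize back to source text, then parse to an AST.
# With full 5-tuple tokens, untokenize preserves the original layout.
reconstructed = tokenize.untokenize(tokens)
tree = ast.parse(reconstructed)

print(ast.dump(tree.body[0]))
```

Since untokenize reconstructs the source (including positions) from a complete token stream, the resulting AST is the same one you would get from parsing the original text directly.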

Also, tokens -> AST is not the only disconnected part of the underlying compiler. Stages like AST -> symbol table and AST -> optimized AST are also not available, and apparently not needed (since nobody else, except maybe me [regarding the AST -> ST conversion], has complained about these being missing).
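To illustrate the gap: the symbol-table stage is reachable only from source text via the stdlib symtable module; there is no public AST -> symbol table API. A small sketch (the sample function is illustrative):

```python
import symtable

source = "def f(a):\n    b = a + 1\n    return b\n"

# symtable.symtable takes source text, not an AST -- the same kind of
# "disconnect" between compiler stages described above.
table = symtable.symtable(source, "<example>", "exec")

# Descend into the namespace of the function f and list its symbols.
func = table.lookup("f").get_namespace()
print(sorted(sym.get_name() for sym in func.get_symbols()))
```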

I'd also suggest moving the discussion to Python-ideas, for a much larger audience.
History
Date User Action Args
2020-12-24 10:34:15 BTaskaya set recipients: + BTaskaya, pfalcon, serhiy.storchaka, pablogsal
2020-12-24 10:34:15 BTaskaya set messageid: <1608806055.68.0.139746238425.issue42729@roundup.psfhosted.org>
2020-12-24 10:34:15 BTaskaya link issue42729 messages
2020-12-24 10:34:15 BTaskaya create