Message383681
> I propose to close that gap and establish an API which would allow parsing a token stream (iterable) into an AST. An initial implementation for CPython can (and likely should) be naive, making a loop through the surface program representation.
There are different aspects to this problem (like the maintenance cost of either exposing the underlying tokenizer, or building something like Python-ast.c to convert these two different token types back and forth; I'm a big -1 on both of them), but the thing I don't quite get is the use case.
What prevents you from using ast.parse(tokenize.untokenize(token_stream))? It is guaranteed that you won't miss anything in terms of token positions, since it round-trips almost every case.
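A minimal sketch of that workaround, assuming the token stream comes from tokenize.generate_tokens (the source string and variable names here are illustrative):

```python
import ast
import io
import tokenize

source = "x = 1 + 2\n"

# Produce a token stream (iterable of TokenInfo) from the source text.
token_stream = tokenize.generate_tokens(io.StringIO(source).readline)

# Round-trip: reconstruct source text from the tokens, then parse it
# into an AST with full position information on every node.
tree = ast.parse(tokenize.untokenize(token_stream))

print(ast.dump(tree))
```

Since untokenize receives full TokenInfo tuples (including positions), the reconstructed text matches the original source, so the resulting AST carries the same line/column information as parsing the source directly.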
Also, tokens -> AST is not the only disconnected part of the underlying compiler. Stages like AST -> Symbol Table and AST -> optimized AST are also not available, and apparently not needed (since nobody else, maybe except me [regarding the AST -> ST conversion], has complained about these being missing).
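For context on that gap: the symbol table is only exposed from source text via the symtable module, not from an already-built AST. A small sketch of the source-level API that does exist (the example function is illustrative):

```python
import symtable

source = "def f(a):\n    b = a + 1\n    return b\n"

# symtable.symtable works on source text; there is no public
# counterpart that accepts an ast.Module directly.
table = symtable.symtable(source, "<example>", "exec")

# Descend into the namespace of f and list its symbols.
func = table.lookup("f").get_namespace()
print([s.get_name() for s in func.get_symbols()])
```

To get a symbol table for code you only hold as an AST, you currently have to go back through text (e.g. ast.unparse) first, which mirrors the tokens -> AST situation described above.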
I'd also suggest moving the discussion to Python-ideas, for a much greater audience.
Date                | User     | Action | Args
2020-12-24 10:34:15 | BTaskaya | set    | recipients: + BTaskaya, pfalcon, serhiy.storchaka, pablogsal
2020-12-24 10:34:15 | BTaskaya | set    | messageid: <1608806055.68.0.139746238425.issue42729@roundup.psfhosted.org>
2020-12-24 10:34:15 | BTaskaya | link   | issue42729 messages
2020-12-24 10:34:15 | BTaskaya | create |