This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author Anthony Sottile
Recipients Andrew.C, Anthony Sottile, Jim Fasarakis-Hilliard, amaury.forgeotdarc, berker.peksag, djmitche, effbot, kirkshorts, meador.inge, pablogsal, serhiy.storchaka, superluser
Date 2021-01-27.17:18:04
Message-id <1611767884.2.0.970913821693.issue3353@roundup.psfhosted.org>
Content
You already have that right now, because the `tokenize` module is exposed (except that every change to the tokenization has to be implemented twice: once in C and once in Python).

It's also much more frustrating when the two differ.
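
For context, here is a minimal sketch of what the exposed `tokenize` module already gives you from Python. The helper name `token_summary` is made up for illustration; only `tokenize.generate_tokens` and `tokenize.tok_name` are actual stdlib names.

```python
import io
import tokenize


def token_summary(source):
    """Return (token type name, token string) pairs for a source string.

    `token_summary` is a hypothetical helper, not part of the stdlib.
    """
    # generate_tokens() expects a readline-style callable that yields str lines.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


pairs = token_summary("x = 1\n")
for name, text in pairs:
    print(name, repr(text))
```

Anything that consumes this stream depends on the Python-level token stream matching what the C tokenizer accepts, which is where drift between the two implementations hurts.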

I don't think all the internals of the C tokenization need to be exposed. My main goals would be:

- expose enough information to reimplement Lib/tokenize.py
- replace Lib/tokenize.py with the C tokenizer

and the reasons would be:

- eliminate the (potential) drift and complexity between the two
- get a fast tokenizer


Compared with the AST, the tokenization changes much less frequently (the last major addition I can remember is the `@` operator).
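
As a small illustration of how that addition shows up at the token level, the sketch below prints the exact token types for an expression using `@`. The helper name `exact_types` is hypothetical; `TokenInfo.exact_type` and `token.tok_name` are the stdlib pieces it relies on.

```python
import io
import token
import tokenize


def exact_types(source):
    """Return the exact token type name of every token in `source`.

    `exact_types` is a hypothetical helper used only for this sketch.
    """
    readline = io.StringIO(source).readline
    return [
        # exact_type resolves a generic OP token to its specific operator type.
        token.tok_name[tok.exact_type]
        for tok in tokenize.generate_tokens(readline)
    ]


names = exact_types("a @ b\n")
print(names)
```

The `@` shows up as its own `AT` token type, which is the kind of change that currently has to land in both the C tokenizer and Lib/tokenize.py.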


We can hide almost all of the details of the tokenization behind an opaque struct and getter functions.
History
Date User Action Args
2021-01-27 17:18:04  Anthony Sottile  set  recipients: + Anthony Sottile, effbot, amaury.forgeotdarc, djmitche, kirkshorts, meador.inge, berker.peksag, serhiy.storchaka, superluser, Andrew.C, Jim Fasarakis-Hilliard, pablogsal
2021-01-27 17:18:04  Anthony Sottile  set  messageid: <1611767884.2.0.970913821693.issue3353@roundup.psfhosted.org>
2021-01-27 17:18:04  Anthony Sottile  link  issue3353 messages
2021-01-27 17:18:04  Anthony Sottile  create