Message 342422 - Python tracker


Author	mbussonn
Recipients	hawkowl, mbussonn
Date	2019-05-14.02:54:30
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1557802470.22.0.231295767489.issue36911@roundup.psfhosted.org>
In-reply-to

Content
I believe this one happens even before the AST, in the tokenizer. That said, the AST is also doing some normalisation of identifiers: for example, “ε” (U+03B5 GREEK SMALL LETTER EPSILON) and “ϵ” (U+03F5 GREEK LUNATE EPSILON SYMBOL) get normalized to the same identifier, which is problematic because they look different but end up being the same name.

I'd be interested in an opt-in flag to disable this normalisation, which might be useful for some linting tools. I have a prototype of this for the identifier normalisation in the AST, but I have not looked at the tokenizer.
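To make the epsilon example concrete, here is a minimal sketch of how the identifier normalisation can be observed from Python itself. It assumes a CPython 3 interpreter, where identifiers are converted to NFKC as described in PEP 3131; the constant and variable names are only illustrative and are not taken from the prototype mentioned above.

```python
import ast
import unicodedata

SMALL_EPSILON = "\u03b5"   # ε GREEK SMALL LETTER EPSILON
LUNATE_EPSILON = "\u03f5"  # ϵ GREEK LUNATE EPSILON SYMBOL

# NFKC normalisation maps the lunate epsilon onto the small epsilon.
normalized = unicodedata.normalize("NFKC", LUNATE_EPSILON)

# Parse an assignment written with the lunate epsilon and read back the
# identifier stored on the resulting Name node.
tree = ast.parse(f"{LUNATE_EPSILON} = 1")
target_id = tree.body[0].targets[0].id

# Run code that uses both spellings; they should resolve to one variable.
namespace = {}
exec(f"{LUNATE_EPSILON} = 1\n{SMALL_EPSILON} += 1", namespace)

print(normalized == SMALL_EPSILON)
print(target_id == SMALL_EPSILON)
print(namespace[SMALL_EPSILON], LUNATE_EPSILON in namespace)
```

A linting tool that only sees the AST therefore has no direct way to tell which of the two spellings was typed in the source, which is the case an opt-in flag would address.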

History
Date	User	Action	Args
2019-05-14 02:54:30	mbussonn	set	recipients: + mbussonn, hawkowl
2019-05-14 02:54:30	mbussonn	set	messageid: <1557802470.22.0.231295767489.issue36911@roundup.psfhosted.org>
2019-05-14 02:54:30	mbussonn	link	issue36911 messages
2019-05-14 02:54:30	mbussonn	create