classification
Title: Unicode symbols crash lib2to3.parse
Type: behavior Stage: resolved
Components: 2to3 (2.x to 3.x conversion tool) Versions: Python 3.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Yann Grisel, benjamin.peterson, terry.reedy, xiang.zhang
Priority: normal Keywords:

Created on 2017-05-12 21:26 by Yann Grisel, last changed 2017-05-19 19:06 by terry.reedy. This issue is now closed.

Messages (3)
msg293573 - Author: Yann Grisel (Yann Grisel) Date: 2017-05-12 21:26
The code formatter YAPF relies on lib2to3 to parse code before formatting it. The function "classify" in "lib2to3/pgen2/parse.py" raises a ParseError when it encounters Unicode variable names (like Δ), which it should not.
msg293601 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-05-13 07:59
Why? Unicode identifiers are not allowed in 2.x. I don't think lib2to3 is able to parse source code with invalid syntax, or responsible for doing so.
msg293965 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-19 19:06
I agree.  Δ is not a 2.x identifier (variable name) or anything else (like 'binary operator'), and it would be wrong for lib2to3.pgen2.parse.classify to classify it as such, or as anything else.  So I am closing this.  Benjamin can reopen if we are mistaken.
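The distinction drawn in the replies can be illustrated with a minimal sketch. It assumes the Python 2 rule that identifiers consist only of ASCII letters, digits, and underscores, and compares it with Python 3's built-in str.isidentifier(). The helper names is_py2_identifier and is_py3_identifier are hypothetical, introduced only for this illustration; the sketch does not call lib2to3 and ignores reserved keywords.

```python
import re

# Assumption: Python 2 identifiers are ASCII-only
# (a letter or underscore, then letters, digits, or underscores).
PY2_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_py2_identifier(name):
    """Hypothetical helper: True if `name` fits the ASCII-only 2.x rule."""
    return PY2_IDENTIFIER.match(name) is not None


def is_py3_identifier(name):
    """Hypothetical helper: True if `name` is a valid Python 3 identifier."""
    return name.isidentifier()


for candidate in ("delta", "Δ"):
    print(candidate, is_py2_identifier(candidate), is_py3_identifier(candidate))
```

Under these assumptions, a name such as Δ passes the Python 3 check but not the 2.x one, which is the reason given above for closing the issue.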
History
Date User Action Args
2017-05-19 19:06:27 terry.reedy set status: open -> closed; type: behavior; nosy: + terry.reedy, benjamin.peterson; messages: + msg293965; resolution: not a bug; stage: resolved
2017-05-13 07:59:44 xiang.zhang set nosy: + xiang.zhang; messages: + msg293601
2017-05-12 21:26:00 Yann Grisel create