Title: Some Unicode in identifiers improperly rejected
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.2
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Joshua.Landau, r.david.murray
Priority: normal Keywords:

Created on 2012-09-23 23:45 by Joshua.Landau, last changed 2012-09-25 21:08 by terry.reedy. This issue is now closed.

Messages
msg171082 - (view) Author: Joshua Landau (Joshua.Landau) * Date: 2012-09-23 23:45
"a¹ = None" is not valid, even though unicodedata.normalize("NFKC", "¹") == "1".

One would expect "a¹ = None" and "a1 = None" to be equivalent in this case, as with "aⁱ = None" and "ai = None".

I am not sure how many other characters exhibit the same problem.


"¹" == "\u00b9"
"ⁱ" == "\u2071"
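
For anyone wanting to reproduce the report, the following is a minimal sketch rather than part of the original message. The helper name `compiles` and the `<example>` filename are made up for illustration; it only assumes a Python 3 interpreter with the standard `unicodedata` module.

```python
import unicodedata


def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Both characters normalize to plain ASCII under NFKC.
print(unicodedata.normalize("NFKC", "\u00b9"))  # superscript one
print(unicodedata.normalize("NFKC", "\u2071"))  # superscript latin small letter i

# Only the second assignment is accepted by the compiler.
print(compiles("a\u00b9 = None"))
print(compiles("a\u2071 = None"))

# The accepted identifier is stored under its NFKC-normalized name.
namespace = {}
exec("a\u2071 = None", namespace)
print("ai" in namespace)
```

The first assignment is rejected and the second is accepted, with the variable ending up bound as "ai".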
msg171089 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-09-24 01:40
I find it unexpected that aⁱ and ai name the same variable, but I suppose that is a consequence of the Unicode normalization rules (meaning what I really find surprising is the normalization).

As for '¹', its category is No (Number, other), which does not appear in the list in the identifiers section you link to, while 'ⁱ' is Lm (Letter, modifier), which does.

So there is no bug here.
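
The category argument can be checked directly. This is a small sketch, not taken from the thread: the helper name `describe` is hypothetical, and it relies only on `unicodedata.category` and `str.isidentifier` from the standard library.

```python
import unicodedata


def describe(ch):
    """Return (general category, whether 'a' + ch is a valid identifier)."""
    return unicodedata.category(ch), ("a" + ch).isidentifier()


for ch in ("\u00b9", "\u2071", "1", "i"):
    category, ok = describe(ch)
    print("U+{:04X} category={} a+char is identifier: {}".format(ord(ch), category, ok))
```

The check is made on the character as written, before any normalization, which is why '¹' (No) is rejected even though its NFKC form '1' (Nd) would be allowed in that position.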
