
classification
Title: xid_start definition for Unicode identifiers refers to xid_continue
Type:
Stage:
Components: Documentation
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, loewis, ralph.corderoy, xiang.zhang
Priority: normal
Keywords:

Created on 2017-04-21 14:27 by ralph.corderoy, last changed 2022-04-11 14:58 by admin.

Messages (2)
msg292049 - Author: Ralph Corderoy (ralph.corderoy) Date: 2017-04-21 14:27
https://docs.python.org/3/reference/lexical_analysis.html#identifiers has a grammar.

    identifier   ::=  xid_start xid_continue*
    id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo, Nl, the underscore, and characters with the Other_ID_Start property>
    id_continue  ::=  <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
    xid_start    ::=  <all characters in id_start whose NFKC normalization is in "id_start xid_continue*">
    xid_continue ::=  <all characters in id_continue whose NFKC normalization is in "id_continue*">

I struggle to make sense of it unless I remove `xid_continue*` from the definition of `xid_start`.
I suspect it ended up there due to cut and paste.
msg292592 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-04-29 12:24
Quoting from PEP 3131:

XID_Start then closes this set under normalization, by removing all characters whose NFKC normalization is not of the form ID_Start ID_Continue* anymore.
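
To illustrate why the trailing repetition appears in that definition: the NFKC normalization of a single character can be several characters long, so the normalized form has to be matched as a start character followed by zero or more continue characters. The following is only a rough sketch of the check described in the PEP quote. The helper names (`is_id_start`, `is_id_continue`, `is_xid_start`) are made up for this example, and the `Other_ID_Start` and `Other_ID_Continue` properties are ignored because `unicodedata` does not expose them, so the results are an approximation of what CPython actually does.

```python
import unicodedata

# General categories from the grammar above. Other_ID_Start and
# Other_ID_Continue are NOT handled here (assumption: unicodedata has no
# accessor for them), so this is an approximation only.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_EXTRA = {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch):
    """Approximate id_start: the listed categories plus the underscore."""
    return ch == "_" or unicodedata.category(ch) in ID_START_CATEGORIES


def is_id_continue(ch):
    """Approximate id_continue: id_start plus Mn, Mc, Nd, Pc."""
    return is_id_start(ch) or unicodedata.category(ch) in ID_CONTINUE_EXTRA


def is_xid_start(ch):
    """id_start characters whose NFKC form matches ID_Start ID_Continue*."""
    if not is_id_start(ch):
        return False
    norm = unicodedata.normalize("NFKC", ch)
    # The normalized form may have more than one character, hence the
    # "start, then zero or more continue" shape of the definition.
    return is_id_start(norm[0]) and all(is_id_continue(c) for c in norm[1:])


# U+FB01 (LATIN SMALL LIGATURE FI) normalizes to the two characters "fi".
print(len(unicodedata.normalize("NFKC", "\ufb01")), is_xid_start("\ufb01"))
# U+037A (GREEK YPOGEGRAMMENI) is in category Lm, but its NFKC form
# starts with a space, so it drops out of the closed set.
print(is_id_start("\u037a"), is_xid_start("\u037a"))
```

Read this way, the repetition in the `xid_start` definition is describing the shape of the normalized string, not the single character being classified.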
History
Date                 User            Action  Args
2022-04-11 14:58:45  admin           set     github: 74314
2017-04-29 12:24:10  xiang.zhang     set     nosy: + loewis, xiang.zhang; messages: + msg292592
2017-04-21 14:27:21  ralph.corderoy  create