Message369696
First, note that the 3.8.3 grammar.html is stated to be the actual grammar used by the old parser, and it differs somewhat from the more human-readable grammar given in the reference manual. It is a bit different in 3.9, and I expect it will differ much more in 3.10 with the new PEG parser.
In the grammar, the CAPITALIZED_NAMES are token names returned by the tokenizer/lexer. This is a standard convention.
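To see such token names in practice, here is a minimal sketch using the pure-Python tokenize module. It only illustrates the naming convention; the sample source line and the helper name token_names are made up for this example, and tokenize is not claimed to behave identically to the C tokenizer.

```python
import io
import tokenize

def token_names(source):
    """Return the token type names that tokenize reports for source.

    token_names is a hypothetical helper written for this illustration.
    """
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]

# The sample line is arbitrary; any short statement would do.
for name in token_names("x = 3.14\n"):
    print(name)
```

Names such as NAME, NUMBER and NEWLINE printed here are the same kind of capitalized token names that appear in the grammar file.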
I am pretty sure that the human-readable lexing rules in lexical_analysis are not what the lexer uses. I presume the latter uses barely readable regular expressions, as does the tokenize module.
Compare the float grammar in https://docs.python.org/3/reference/lexical_analysis.html#floating-point-literals to the float REs in tokenize.py.
def group(*choices): return '(' + '|'.join(choices) + ')'
def maybe(*choices): return group(*choices) + '?'
# The above are reused for multiple REs.
Exponent = r'[eE][-+]?[0-9](?:_?[0-9])*'
Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?',
                   r'\.[0-9](?:_?[0-9])*') + maybe(Exponent)
Expfloat = r'[0-9](?:_?[0-9])*' + Exponent
Floatnumber = group(Pointfloat, Expfloat)
Note that this is (Python) code, not a text specification. You or someone else can look at what the C lexer does. But I think that the proposal should be rejected.
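As a rough way to compare the tokenize.py patterns quoted above with the documented float grammar, the sketch below compiles Floatnumber and checks a few literals against it. The helper is_float_literal and the sample strings are assumptions added for illustration; this exercises only the tokenize regular expressions and says nothing about what the C lexer does.

```python
import re

# Helpers and patterns copied from the tokenize.py excerpt above.
def group(*choices): return '(' + '|'.join(choices) + ')'
def maybe(*choices): return group(*choices) + '?'

Exponent = r'[eE][-+]?[0-9](?:_?[0-9])*'
Pointfloat = group(r'[0-9](?:_?[0-9])*\.(?:[0-9](?:_?[0-9])*)?',
                   r'\.[0-9](?:_?[0-9])*') + maybe(Exponent)
Expfloat = r'[0-9](?:_?[0-9])*' + Exponent
Floatnumber = group(Pointfloat, Expfloat)

FLOAT_RE = re.compile(Floatnumber)

def is_float_literal(text):
    """Hypothetical helper: True if the whole string matches Floatnumber."""
    return FLOAT_RE.fullmatch(text) is not None

# Sample strings chosen for illustration only.
for sample in ["3.14", "10.", ".001", "1e100", "3.14_15_93", "123", "1__0.0"]:
    print(sample, is_float_literal(sample))
```

fullmatch is used so that a string only counts when the pattern consumes all of it; tokenize itself applies these patterns as part of a larger combined expression.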
Date | User | Action | Args
2020-05-23 07:19:59 | terry.reedy | set | recipients: + terry.reedy, gvanrossum, georg.brandl, cool-RR, docs@python
2020-05-23 07:19:59 | terry.reedy | set | messageid: <1590218399.33.0.591431325033.issue40678@roundup.psfhosted.org>
2020-05-23 07:19:59 | terry.reedy | link | issue40678 messages
2020-05-23 07:19:58 | terry.reedy | create |