Message318547
Currently f-strings are a bit of a hack. They certainly work very well for users, but they are implemented in ast.c and are therefore not part of the Python grammar or the tokenizer.
I want to change this. I wrote an alternative implementation of f-strings in parso (http://parso.readthedocs.io/en/latest/). My idea is to modify the Python grammar slightly (https://github.com/davidhalter/parso/blob/master/parso/python/grammar37.txt#L149):
fstring: FSTRING_START fstring_content* FSTRING_END
fstring_content: FSTRING_STRING | fstring_expr
fstring_conversion: '!' NAME
fstring_expr: '{' testlist [ fstring_conversion ] [ fstring_format_spec ] '}'
fstring_format_spec: ':' fstring_content*
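To illustrate (this is my own worked example, not taken from parso), an f-string such as f"x={x!r:>10}" would decompose under the rules above roughly like this:

```
f"x={x!r:>10}"

fstring
├── FSTRING_START                         f"
├── fstring_content → FSTRING_STRING      x=
├── fstring_content → fstring_expr
│   ├── '{'
│   ├── testlist                          x
│   ├── fstring_conversion                !r
│   ├── fstring_format_spec
│   │   ├── ':'
│   │   └── fstring_content → FSTRING_STRING   >10
│   └── '}'
└── FSTRING_END                           "
```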
We would push most of the hard work to the tokenizer. This obviously means we have to add a lot of code there. I wrote a tokenizer in Python for parso: https://github.com/davidhalter/parso/blob/master/parso/python/tokenize.py. It works well. The biggest difference from the current tokenizer.c is that you have to work with stacks and be far more context-sensitive.
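To make the stack idea concrete, here is a deliberately tiny sketch (my own illustration, not parso's actual code) of tokenizing one simple f-string with an explicit context stack. It ignores quotes, escapes, conversions, and format specs, and it emits an expression body as a single NAME token where a real tokenizer would recursively re-tokenize it; token names follow the grammar rules quoted above:

```python
def tokenize_fstring(source):
    """Yield (token_name, value) pairs for one simple double-quoted f-string.

    Simplifications: no escapes, no quotes inside expressions, no format
    specs -- just enough to show why a stack of tokenizing contexts is
    needed ('string' outside braces, 'expr' inside them, braces may nest).
    """
    assert source.startswith('f"') and source.endswith('"')
    yield ('FSTRING_START', 'f"')
    pos, end = 2, len(source) - 1
    stack = ['string']   # current tokenizing context
    buf = ''
    while pos < end:
        ch = source[pos]
        if stack[-1] == 'string':
            if ch == '{':
                if buf:
                    yield ('FSTRING_STRING', buf)
                    buf = ''
                yield ('OP', '{')
                stack.append('expr')
            else:
                buf += ch
        else:  # expression context: braces may nest (e.g. dict displays)
            if ch == '{':
                stack.append('expr')
                buf += ch
            elif ch == '}':
                stack.pop()
                if stack[-1] == 'string':
                    yield ('NAME', buf)  # real code would re-tokenize this
                    buf = ''
                    yield ('OP', '}')
                else:
                    buf += ch
            else:
                buf += ch
        pos += 1
    if buf:
        yield ('FSTRING_STRING', buf)
    yield ('FSTRING_END', '"')
```

The point is that whether a character like '{' or '}' is literal text or an operator depends entirely on the top of the stack, which is exactly the kind of context-sensitivity tokenizer.c does not currently have.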
There have been attempts to change the grammar of f-strings before, such as PEP 536 (https://www.python.org/dev/peps/pep-0536/). That proposal hasn't caught on because it tried to change the semantics of f-strings. The implementation in parso does not change the semantics of f-strings.
As a first step I would like to get this working in CPython's tokenizer, not in tokenize.py. Modifying tokenize.py will not be part of my initial work here.
I have discussed this with Łukasz Langa, so if you guys have no objections I will start working on it. Please let me know if you support this or not.
Date                | User        | Action | Args
--------------------|-------------|--------|-----------------------------------------------------------------------------
2018-06-03 12:59:38 | davidhalter | set    | recipients: + davidhalter
2018-06-03 12:59:38 | davidhalter | set    | messageid: <1528030778.27.0.592728768989.issue33754@psf.upfronthosting.co.za>
2018-06-03 12:59:38 | davidhalter | link   | issue33754 messages
2018-06-03 12:59:38 | davidhalter | create |