Message 372086 - Python tracker


Author	lys.nikolaou
Recipients	eric.smith, gvanrossum, lys.nikolaou, pablogsal
Date	2020-06-22.12:54:04
Message-id	<1592830444.28.0.715795634223.issue41076@roundup.psfhosted.org>

Content
Inspired by bpo-41064, I sat down to try and find problems with f-string locations in the new parser. I was able to come up with a way to compute the locations of the f-string expressions that I think is more consistent and allows us to delete all the code that was fixing the expression locations after the actual parsing, which accounted for about 1/6 of string_parser.c.

A high-level explanation of the change:

Before this change, we pre-fed the parser with the location of the f-string itself. The parser then parsed the expression and computed the locations of all the nodes relative to the offset of the f-string. Once parsing was done, we identified the offset and the lineno of the expression *within* the f-string and fixed the node locations accordingly. For example, for an f-string like `a = 0; f'irrelevant {a}'` we were doing the following:

- Pre-feed the parser with lineno=0 and col_offset=7 (the offset of the f-string itself in the current line).
- Parse the expression (adding 7 to the col_offset of each parsed node, lineno remains the same since it's 0).
- Fix the node locations by shifting the Name node by 14, which is the number of characters in the f-string (counting the `f` and the opening quote) before the start of the expression.

With this change we now pre-feed the parser with the exact lineno and offset of the expression itself, not the f-string. This allows us to completely skip the third step of shifting the node locations.
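To make the arithmetic concrete, here is a rough Python model of the two schemes for the example above. It is only a sketch: `locate`, `old_scheme` and `new_scheme` are hypothetical names made up for illustration, not functions from string_parser.c, and the model only tracks column offsets on a single line.

```python
SOURCE = "a = 0; f'irrelevant {a}'"


def locate(source):
    """Return (f-string column, characters in the f-string before the expression)."""
    fstring_col = source.index("f'")  # column where the f-string token starts
    expr_col = source.index("{") + 1  # column of the first character after '{'
    return fstring_col, expr_col - fstring_col


def old_scheme(fstring_col, shift, node_col_in_expr=0):
    """Old behaviour: seed with the f-string offset, then fix up afterwards."""
    # Steps 1 and 2: the parser is seeded with the f-string's own offset,
    # so every node column is computed relative to that.
    col = fstring_col + node_col_in_expr
    # Step 3: post-parse fix-up, shifting the node by the characters that
    # precede the expression inside the f-string.
    return col + shift


def new_scheme(fstring_col, shift, node_col_in_expr=0):
    """New behaviour: seed with the expression's exact offset, no fix-up."""
    expr_col = fstring_col + shift
    return expr_col + node_col_in_expr


fstring_col, shift = locate(SOURCE)
print(fstring_col, shift)              # 7 14
print(old_scheme(fstring_col, shift))  # 21
print(new_scheme(fstring_col, shift))  # 21
```

Both schemes land on the same column for the `Name` node; the difference is that the second one never needs the separate shifting pass. On an interpreter that includes this change, inspecting the `Name` node through `ast.parse(SOURCE)` would be expected to report the same column, though that is not checked here.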

History
Date	User	Action	Args
2020-06-22 12:54:04	lys.nikolaou	set	recipients: + lys.nikolaou, gvanrossum, eric.smith, pablogsal
2020-06-22 12:54:04	lys.nikolaou	set	messageid: <1592830444.28.0.715795634223.issue41076@roundup.psfhosted.org>
2020-06-22 12:54:04	lys.nikolaou	link	issue41076 messages
2020-06-22 12:54:04	lys.nikolaou	create