Message 368360 - Python tracker

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author	gvanrossum
Recipients	BTaskaya, gvanrossum, lys.nikolaou, pablogsal
Date	2020-05-07.17:50:45
Message-id	<1588873845.99.0.520524096878.issue40546@roundup.psfhosted.org>
In-reply-to

Content
I don't understand why the traceback module is implicated. It just formats the information in the SyntaxError object, same as the builtin printing for syntax errors. The key difference is always in what line/column/text is put in the SyntaxError object by whichever parser is being used.
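To illustrate the point, here is a minimal sketch showing that the traceback module only reads what is already stored on the SyntaxError object. The helper name describe_syntax_error and the filename "<example>" are made up for this illustration; compile() and traceback.format_exception_only() are the standard library calls. The exact offset and message depend on the parser and Python version, so none are asserted here.

```python
import traceback


def describe_syntax_error(source, filename="<example>"):
    """Compile `source`; return (exception, formatted lines), or (None, []) if it compiles.

    Hypothetical helper, for illustration only.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        # The traceback module formats the attributes the parser already
        # stored on the exception: filename, lineno, offset, text.
        return exc, traceback.format_exception_only(type(exc), exc)
    return None, []


exc, formatted = describe_syntax_error("x = 1 +\n")

# What the parser put into the SyntaxError object:
print(exc.filename, exc.lineno, exc.offset, repr(exc.text))

# What traceback renders from those same attributes:
print("".join(formatted), end="")
```

If the caret looks misplaced, the first thing to inspect is exc.offset itself, since the formatter has no other source of column information.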

In the past there was some misunderstanding about whether column numbers are 0-based (the leftmost column is numbered 0) or 1-based (the leftmost column is numbered 1), and at some point we discovered there was an inconsistency -- certain parts of the code put 0-based offsets in the SyntaxError object and other parts put 1-based offsets.

We then decided that the SyntaxError column offset should be 1-based and changed various bits of code to match. It's possible, however, that we missed some. It's also still not clearly documented (e.g. the stdlib docs for SyntaxError don't mention it).

What complicates matters further is that the lowest-level C code in the tokenizer definitely uses 0-based offsets, which means that whenever we create a SyntaxError we have to add 1 to the offset. (You can see this happening if you look at various calls to PyErr_SyntaxLocationObject().)
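The "add 1" step can be sketched in Python rather than C. This is only an analogy to what the C call sites do, not the actual implementation: syntax_error_from_zero_based is a hypothetical helper, and the message, filename, and source text are invented for the demo. It builds a SyntaxError by hand from a 0-based column and checks where the traceback module draws the caret.

```python
import traceback


def syntax_error_from_zero_based(msg, filename, lineno, col0, text):
    """Build a SyntaxError from a 0-based column (hypothetical helper).

    SyntaxError expects a 1-based offset, so the 0-based column is
    incremented by one before it is stored.
    """
    return SyntaxError(msg, (filename, lineno, col0 + 1, text))


text = "abc def\n"
col0 = text.index("d")  # 0-based column of the character to point at

err = syntax_error_from_zero_based("example error", "<demo>", 1, col0, text)
lines = traceback.format_exception_only(type(err), err)

# Pick out the echoed source line and the caret line from the output.
source_line = next(line for line in lines if "abc def" in line)
caret_line = next(line for line in lines if line.strip().startswith("^"))

print("".join(lines), end="")
print("stored offset:", err.offset)
```

With the 1-based offset stored, the caret lines up under the "d". Passing col0 unchanged would instead place it one column to the left, which is the kind of off-by-one the inconsistency produced.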

History
Date	User	Action	Args
2020-05-07 17:50:46	gvanrossum	set	recipients: + gvanrossum, lys.nikolaou, pablogsal, BTaskaya
2020-05-07 17:50:45	gvanrossum	set	messageid: <1588873845.99.0.520524096878.issue40546@roundup.psfhosted.org>
2020-05-07 17:50:45	gvanrossum	link	issue40546 messages
2020-05-07 17:50:45	gvanrossum	create