classification
Title: Location of SyntaxError with new parser missing (after continuation character)
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10, Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: ammar2, aroberge, gvanrossum, lys.nikolaou, miss-islington, pablogsal, terry.reedy
Priority: normal
Keywords: patch

Created on 2021-03-19 11:18 by aroberge, last changed 2021-03-22 23:53 by ammar2. This issue is now closed.

Pull Requests
URL Status Linked
PR 24939 merged pablogsal, 2021-03-19 22:50
PR 24975 merged miss-islington, 2021-03-22 17:28
Messages (6)
msg389071 - Author: Andre Roberge (aroberge) * Date: 2021-03-19 11:18
Normally, for SyntaxErrors, the location of the error is indicated by a ^. In at least one case, the location is missing in 3.9 and 3.10.0a6, although it was shown before. With the old parser in 3.9, or with previous versions of Python, the location is shown.

Python 3.10.0a6 ... on win32
>>> a = 3 \ 4
  File "<stdin>", line 1
SyntaxError: unexpected character after line continuation character
>>>


Python 3.9.0 ... on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 3 \ 4
  File "<stdin>", line 1
SyntaxError: unexpected character after line continuation character
>>>

Using the old parser with Python 3.9, the location of the error is shown *after* the unexpected character.

> python -X oldparser
Python 3.9.0 ... on win32
>>> a = 3 \ 4
  File "<stdin>", line 1
    a = 3 \ 4
             ^
SyntaxError: unexpected character after line continuation character
>>>

Using Python 3.8 (and 3.7, 3.6), the location points at the unexpected character.


Python 3.8.4 ... on win32
>>> a = 3 \ 4
  File "<stdin>", line 1
    a = 3 \ 4
            ^
SyntaxError: unexpected character after line continuation character
>>>
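
The same error can be inspected without the interactive prompt. The following is a minimal sketch, not part of the original report: it compiles the offending line with the built-in `compile()` and collects the location attributes of the resulting `SyntaxError`. The helper name `syntax_error_location` is hypothetical, and the value of `offset` depends on the Python version and parser in use, so no expected output is shown.

```python
def syntax_error_location(source, filename="<example>"):
    """Return (msg, lineno, offset, text) for a SyntaxError, or None if `source` compiles."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        # offset and text may be None when the parser does not report a location.
        return exc.msg, exc.lineno, exc.offset, exc.text
    return None


# A backslash followed by anything other than a newline is invalid.
location = syntax_error_location("a = 3 \\ 4\n")
print(location)
```

Running this under several interpreters is one way to compare the reported offsets across versions.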
msg389333 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-22 17:28
New changeset 96eeff516204b7cc751103fa33dcc665e387846e by Pablo Galindo in branch 'master':
bpo-43555: Report the column offset for invalid line continuation character (GH-24939)
https://github.com/python/cpython/commit/96eeff516204b7cc751103fa33dcc665e387846e
msg389335 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-22 19:07
New changeset 994a519915bff4901abaa7476e2b91682dea619a by Miss Islington (bot) in branch '3.9':
bpo-43555: Report the column offset for invalid line continuation character (GH-24939) (#24975)
https://github.com/python/cpython/commit/994a519915bff4901abaa7476e2b91682dea619a
msg389351 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-03-22 23:24
Before the patch, IDLE highlighted the \n at the end of the line with a red background, which tk displays as a red background running from the blank space after the 4 to the right edge of the text widget. This was also the case in 3.8.8.

The 3.8 result, which differs from the REPL, is due to the use of codeop._maybe_compile.  That function catches the SyntaxError for "3 \\ 4", which has offset 5, recompiles "3 \\ 4\n", and raises the subsequent SyntaxError, which has offset 6.  (In this particular case, where the message remains the same, the original SyntaxError instance with the original offset should have been kept and raised.)
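
The two code paths compared in this paragraph can be observed side by side. The sketch below is an illustration only: it uses the public `codeop.compile_command` function (which, in the CPython versions discussed here, delegates to the private `_maybe_compile`) next to a direct `compile()` call. The helper name `interactive_syntax_error` is hypothetical, and the offsets printed vary between Python versions, so none are asserted.

```python
import codeop


def interactive_syntax_error(source):
    """Compile `source` the way an interactive shell would; return the SyntaxError, if any."""
    try:
        codeop.compile_command(source, "<example>", "single")
    except SyntaxError as exc:
        return exc
    return None


# Direct compilation, without the retry logic that codeop adds.
direct = None
try:
    compile("3 \\ 4", "<example>", "single")
except SyntaxError as exc:
    direct = exc

via_codeop = interactive_syntax_error("3 \\ 4")

# Compare the offsets reported by the two code paths.
print("compile():", direct.offset)
print("codeop   :", via_codeop.offset)
```

If the two offsets differ on a given interpreter, that matches the discrepancy described above.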

I have occasionally made this typing mistake and found the long red line slightly annoying, but never thought to compare it to the REPL caret or report it as a bug.  The new parser made no difference in IDLE, so it introduced no new annoyance there.

After the patch, the (1-based) offset stays 5, which IDLE knows means the 5th char and hence it now highlights the '4'.  Much better.  Thank you both for the report and fix.

Side question: https://docs.python.org/3/library/exceptions.html#SyntaxError says "Instances of this class have attributes filename, lineno, offset and text ...". Should it be documented that lineno and offset are both 1-based?  Are these CPython accidents or part of the language?  1-based line numbers can be expected as common across languages, tk and Python included.  I believe that 1-based column offsets were viewed in a previous issue as a bug that we would not fix.
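
The 1-based convention described in this message can be sketched as follows. This is an illustration under stated assumptions, not IDLE's or CPython's actual rendering code: the function name `render_caret` is hypothetical, and the sketch ignores tabs and wide characters.

```python
def render_caret(text, offset):
    """Return `text` followed by a caret line, treating `offset` as 1-based."""
    line = text.rstrip("\n")
    # A 1-based offset of N needs N - 1 characters of padding before the caret.
    padding = " " * max(offset - 1, 0)
    return f"{line}\n{padding}^"


# Offset 5 in "3 \ 4" is the 5th character, the '4'.
print(render_caret("3 \\ 4", 5))
```

With offset 5 the caret is placed under the '4'; with offset 6 it would land one column past the end of the text, which corresponds to the end-of-line highlight described for the unpatched behavior.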
msg389353 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-03-22 23:50
We should definitely document the column offset being 1-based, if it hasn't been already. But be careful, there are some APIs that are 0-based and others that are 1-based, for column offsets.

I was quite surprised at some point to find out that we were inconsistent with the column offset, and then I looked at what Emacs and vim do, and I found that both interpret column offsets (in the familiar "filename:lineno:offset: message" format) as 1-based. IIRC I had to fix it in quite a few places because we were actually being inconsistent.
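
The mix of 0-based and 1-based APIs mentioned above can be illustrated with a short sketch. The choice of the `ast` module as the 0-based example is an assumption made for illustration (the message does not name specific APIs), and `format_location` is a hypothetical helper that emits the "filename:lineno:offset: message" format.

```python
import ast


def format_location(filename, lineno, offset, message):
    """Format a location as "filename:lineno:offset: message" with a 1-based offset."""
    return f"{filename}:{lineno}:{offset}: {message}"


# ast nodes report 0-based column offsets (col_offset), so convert before formatting.
tree = ast.parse("x = 1")
value = tree.body[0].value  # the constant on the right-hand side
zero_based = value.col_offset
print(format_location("example.py", value.lineno, zero_based + 1, "constant found here"))

# SyntaxError.offset is already 1-based, so it is passed through unchanged.
err = SyntaxError("invalid syntax", ("example.py", 1, 5, "3 \\ 4"))
print(format_location(err.filename, err.lineno, err.offset, err.msg))
```

Keeping the conversion in one place makes it harder to mix the two conventions when producing editor-friendly locations.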
msg389354 - Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-03-22 23:53
> We should definitely document the column offset being 1-based

Yes please. I remember working on that issue a while ago to make it consistently 1-based, and I recall that the tooling was relying on 1-based indexes for column offsets.
History
Date                 User            Action  Args
2021-03-22 23:53:31  ammar2          set     nosy: + ammar2; messages: + msg389354
2021-03-22 23:50:43  gvanrossum      set     messages: + msg389353
2021-03-22 23:24:12  terry.reedy     set     nosy: + terry.reedy; messages: + msg389351
2021-03-22 19:07:24  pablogsal       set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2021-03-22 19:07:16  pablogsal       set     messages: + msg389335
2021-03-22 17:28:59  miss-islington  set     nosy: + miss-islington; pull_requests: + pull_request23734
2021-03-22 17:28:22  pablogsal       set     messages: + msg389333
2021-03-19 22:50:16  pablogsal       set     keywords: + patch; stage: patch review; pull_requests: + pull_request23700
2021-03-19 14:14:04  xtreak          set     nosy: + gvanrossum, lys.nikolaou, pablogsal
2021-03-19 11:18:52  aroberge        create