classification
Title: [PEP 617 new parser] Regression in multiline SyntaxError offsets
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.9

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Anthony Sottile, BTaskaya, gaborjbernat, lys.nikolaou, pablogsal, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2020-04-19 21:40 by Anthony Sottile, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL       Status  Linked
PR 19619  merged  pablogsal, 2020-04-20 10:27
Messages (13)
msg366805 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2020-04-19 21:40
This was noticed in pyflakes's test suite: https://github.com/PyCQA/pyflakes/pull/532#pullrequestreview-396059622

I've created a small script to reproduce the problem:

```
import sys

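# The ''' opened in foo() is only closed by the first ''' in baz(),
# so `quux` and the final ''' on line 8 are left dangling.
SRC = b"""\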
def foo():
    '''

def bar():
    pass

def baz():
    '''quux'''
"""

try:
    exec(SRC)
except SyntaxError as e:
    print(
        f'{sys.version}\n\n'
        f'{type(e).__name__}:\n'
        f'- line: {e.lineno}\n'
        f'- offset: {e.offset}\n'
        f'- text: {e.text!r}\n'
    )
```

This was "fixed" in python3.8, but has regressed:

```
$ python3.6 t2.py
3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0]

SyntaxError:
- line: 8
- offset: 52
- text: "    '''\n\ndef bar():\n    pass\n\ndef baz():\n    '''quux'''\n"

$ python3.8 t2.py
3.8.2 (default, Feb 26 2020, 02:56:10) 
[GCC 7.4.0]

SyntaxError:
- line: 8
- offset: 8
- text: "    '''\n\ndef bar():\n    pass\n\ndef baz():\n    '''quux'''\n"

$ ./python t2.py
3.9.0a5+ (heads/master:3955da8568, Apr 19 2020, 14:29:48) 
[GCC 7.5.0]

SyntaxError:
- line: 8
- offset: 52
- text: "    '''\n\ndef bar():\n    pass\n\ndef baz():\n    '''quux'''\n"

```
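
A rough way to tell the two outputs above apart, sketched here as a hypothetical helper (`offset_fits_reported_line` is not part of CPython or pyflakes): an offset that is relative to the reported line should fit inside the last line of `e.text`, whereas the regressed offset does not fit inside any single line. The helper assumes the offset is 1-based and allows one position past the end of the line.

```python
def offset_fits_reported_line(text, offset):
    """Return True if a 1-based offset lands inside the last line of text."""
    if text is None or offset is None:
        return False
    lines = text.splitlines()
    last_line = lines[-1] if lines else ""
    # Allow one position past the end, since an error can point at end of line.
    return 1 <= offset <= len(last_line) + 1


# The `text` value printed by 3.6, 3.8 and 3.9.0a5+ in the report above.
TEXT = "    '''\n\ndef bar():\n    pass\n\ndef baz():\n    '''quux'''\n"

print(offset_fits_reported_line(TEXT, 8))   # offset from 3.8 -> True
print(offset_fits_reported_line(TEXT, 52))  # offset from 3.6 and 3.9.0a5+ -> False
```

This is only a heuristic for spotting the symptom; it says nothing about which column the parser should report.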
msg366807 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-19 22:37
The regression happened in:

commit 41d5b94af44e34ac05d4cd57460ed104ccf96628
Author: Lysandros Nikolaou <lisandrosnik@gmail.com>
Date:   Sun Apr 12 21:21:00 2020 +0300

    bpo-40246: Report a better error message for invalid string prefixes (GH-19476)

I will try to take a look at this shortly.
msg366905 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-21 00:53
New changeset 11a7f158ef51b0edcde3c3d9215172354e385877 by Pablo Galindo in branch 'master':
bpo-40335: Correctly handle multi-line strings in tokenize error scenarios (GH-19619)
https://github.com/python/cpython/commit/11a7f158ef51b0edcde3c3d9215172354e385877
msg367220 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2020-04-24 20:37
This seems to have regressed again:

$ ./python --version --version
Python 3.9.0a5+ (heads/master:503de7149d, Apr 24 2020, 13:34:49) 
[GCC 7.5.0]
$ ./python t.py
  File "/home/asottile/workspace/cpython/t.py", line 8
    
                                        ^
SyntaxError: invalid string prefix
msg367224 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-24 20:55
This is likely due to the new parser (see PEP 617). Do you get the same problem when running with -X oldparser?
msg367226 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-24 21:00
Yeah, I can confirm after bisecting that the commit that introduced the regression is the new parser:

c5fc15685202cda73f7c3f5c6f299b0945f58508 is the first bad commit
commit c5fc15685202cda73f7c3f5c6f299b0945f58508
Author: Pablo Galindo <Pablogsal@gmail.com>
Date:   Wed Apr 22 23:29:27 2020 +0100

    bpo-40334: PEP 617 implementation: New PEG parser for CPython (GH-19503)

    Co-authored-by: Guido van Rossum <guido@python.org>
    Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
msg367227 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-24 21:02
With the old parser it works:

~/github/python/master master*
❯ ./python -X oldparser t.py
3.9.0a5+ (heads/master:503de7149d, Apr 24 2020, 22:02:28)
[GCC 9.3.0]

SyntaxError:
- line: 8
- offset: 8
- text: "    '''\n\ndef bar():\n    pass\n\ndef baz():\n    '''quux'''\n"


~/github/python/master master*
❯ ./python  t.py
3.9.0a5+ (heads/master:503de7149d, Apr 24 2020, 22:02:28)
[GCC 9.3.0]

SyntaxError:
- line: 8
- offset: 52
- text: "    '''"
msg367228 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-24 21:04
The regression slipped through because that test is currently skipped: the new parser does not yet report the same offsets.

https://github.com/python/cpython/blob/master/Lib/test/test_exceptions.py#L181
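
For illustration only, a minimal sketch of the kind of location check such a test performs. This is not the code in test_exceptions.py; the class name, the `check` helper, and the sample source are assumptions made for this example. The offset argument is optional because, as this issue shows, offsets can differ between parser versions, so the sample test pins only the line number.

```python
import unittest


class SyntaxErrorLocationTest(unittest.TestCase):
    def check(self, src, lineno, offset=None):
        # Compile the fragment and compare the reported error location.
        with self.assertRaises(SyntaxError) as cm:
            compile(src, "<fragment>", "exec")
        self.assertEqual(cm.exception.lineno, lineno)
        if offset is not None:
            self.assertEqual(cm.exception.offset, offset)

    def test_unmatched_paren(self):
        # The stray ")" is on the second line of the fragment.
        self.check("x = 1\ny = )\n", 2)
```

Saved to a file, this can be run with `python -m unittest`; a regression test for this issue would pass the reproducer source together with the expected line and offset.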
msg367233 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2020-04-24 21:45
pyflakes's testsuite has many failures under the new parser -- is the expectation that those will be fixed by 3.9 final?
msg367236 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-04-24 22:59
> pyflakes's testsuite has many failures under the new parser

Can you report this also on the PEP 617 issue?
msg367238 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2020-04-24 23:09
cool, reported there as well!
msg367414 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-27 12:41
Until a fix is shipped, you can use the -X oldparser command-line option or the PYTHONOLDPARSER=1 environment variable:
https://docs.python.org/dev/whatsnew/3.9.html#pep-617-new-parser
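
As a sketch of how a test harness might apply this workaround, the hypothetical helper below (`old_parser_command` is an illustrative name, not an existing API) builds the argument list and environment for either form. Both switches apply to Python 3.9 only, since the old parser was removed in 3.10.

```python
import os
import sys


def old_parser_command(script, use_env=False, python=sys.executable):
    """Build (argv, env) for running a script with the old parser on 3.9."""
    env = dict(os.environ)
    if use_env:
        # Environment-variable form of the switch.
        env["PYTHONOLDPARSER"] = "1"
        argv = [python, script]
    else:
        # Command-line form of the switch.
        argv = [python, "-X", "oldparser", script]
    return argv, env


argv, env = old_parser_command("t.py", python="python3.9")
print(argv)
```

The result can be passed to `subprocess.run(argv, env=env)` to launch the interpreter.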
msg378965 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2020-10-19 17:20
I'm closing this since this specific issue seems to be fixed both in master and in 3.9. Anthony, feel free to re-open it, in case I've missed something.
History
Date                 User             Action  Args
2022-04-11 14:59:29  admin            set     github: 84515
2020-10-19 17:20:21  lys.nikolaou     set     status: open -> closed; resolution: fixed
2020-10-19 17:20:04  lys.nikolaou     set     messages: + msg378965
2020-04-27 12:41:32  vstinner         set     nosy: + vstinner; messages: + msg367414; title: Regression in multiline SyntaxError offsets -> [PEP 617 new parser] Regression in multiline SyntaxError offsets
2020-04-24 23:09:40  Anthony Sottile  set     messages: + msg367238
2020-04-24 22:59:07  pablogsal        set     messages: + msg367236
2020-04-24 21:45:39  Anthony Sottile  set     messages: + msg367233
2020-04-24 21:04:26  pablogsal        set     messages: + msg367228
2020-04-24 21:02:53  pablogsal        set     messages: + msg367227
2020-04-24 21:00:07  pablogsal        set     messages: + msg367226
2020-04-24 20:59:46  pablogsal        set     messages: - msg367225
2020-04-24 20:56:57  pablogsal        set     messages: + msg367225
2020-04-24 20:55:00  pablogsal        set     messages: + msg367224
2020-04-24 20:37:33  Anthony Sottile  set     status: closed -> open; resolution: fixed -> (no value); messages: + msg367220
2020-04-21 00:53:24  pablogsal        set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-04-21 00:53:11  pablogsal        set     messages: + msg366905
2020-04-20 10:27:51  pablogsal        set     keywords: + patch; stage: patch review; pull_requests: + pull_request18949
2020-04-19 22:37:25  pablogsal        set     nosy: + lys.nikolaou; messages: + msg366807
2020-04-19 22:01:25  gaborjbernat     set     nosy: + gaborjbernat
2020-04-19 21:58:52  serhiy.storchaka set     nosy: + serhiy.storchaka
2020-04-19 21:48:50  Anthony Sottile  set     nosy: + pablogsal
2020-04-19 21:45:48  BTaskaya         set     nosy: + BTaskaya
2020-04-19 21:40:55  Anthony Sottile  create