classification
Title: New parser considers empty line following a backslash to be a syntax error, old parser didn't
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10, Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: adamwill, benjamin.peterson, christian.heimes, gvanrossum, lys.nikolaou, miss-islington, pablogsal, yselivanov
Priority: normal
Keywords: patch

Created on 2020-06-02 19:03 by adamwill, last changed 2020-06-11 00:14 by miss-islington. This issue is now closed.

Pull Requests
URL       Status  Linked
PR 20769  merged  lys.nikolaou, 2020-06-09 23:51
PR 20795  merged  miss-islington, 2020-06-10 23:56
Messages (7)
msg370618 - Author: Adam Williamson (adamwill) Date: 2020-06-02 19:03
While debugging issues with the black test suite in Python 3.9, I found one which black upstream says is a CPython issue, so I'm filing it here.

Reproduction is very easy. Just use this four-line tester:

    print("hello, world")
    \

    print("hello, world 2")

With that saved as `test.py`, check the results:

    <mock-chroot> sh-5.0# PYTHONOLDPARSER=1 python3 test.py
    hello, world
    hello, world 2
    <mock-chroot> sh-5.0# python3 test.py
      File "/builddir/build/BUILD/black-19.10b0/test.py", line 3
        
        ^
    SyntaxError: invalid syntax

The reason black has this test (well, a similar test - in black's test, the file *starts* with the backslash then the empty line, but the result is the same) is covered in https://github.com/psf/black/issues/922 and https://github.com/psf/black/pull/948 .
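
The same check can be run without a shell session. This is a minimal sketch that relies only on the built-in `compile()`; the helper name `compiles` is made up for illustration. It should return False on an interpreter affected by this issue and True otherwise.

```python
# The four-line tester from the report, written as a single string.
SOURCE = 'print("hello, world")\n\\\n\nprint("hello, world 2")\n'


def compiles(source: str) -> bool:
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "test.py", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # Reports whether this interpreter accepts the lone-backslash line.
    print(compiles(SOURCE))
```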
msg370809 - Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2020-06-06 01:23
This is limited to cases where the line continuation character is on an otherwise empty line. For example, this works correctly:

$ cat t.py
print("hello world")
print("hello world 2") \

print("hello world 3")
$ ./python.exe t.py
hello world
hello world 2
hello world 3

The actual problem is at the tokenizer level, where a line with only a continuation character is not considered an empty line and thus two NEWLINE tokens get emitted, one after the other. The old parser was somehow working around this, probably by having this in the grammar:

file_input: (NEWLINE | stmt)* ENDMARKER

The PEG parser OTOH does not allow this.

The question now is, is it reasonable to change the tokenizer to consider a lone backslash an empty line? Do you also consider this a bug? Or should we change the new parser?
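
To make the difference concrete, here is a toy model of the two behaviours. It is not CPython's parser: `STMT` stands in for the tokens of one simple statement, and the "strict" acceptor is only an assumption about what a rule that does not tolerate stray NEWLINE tokens looks like.

```python
# Toy token streams; "STMT" stands for the tokens of one simple statement.
GOOD = ["STMT", "NEWLINE", "STMT", "NEWLINE", "ENDMARKER"]
DOUBLED = ["STMT", "NEWLINE", "NEWLINE", "STMT", "NEWLINE", "ENDMARKER"]


def accepts_lenient(tokens):
    """Model of `file_input: (NEWLINE | stmt)* ENDMARKER`."""
    i = 0
    while i < len(tokens) and tokens[i] != "ENDMARKER":
        if tokens[i] == "NEWLINE":
            i += 1  # a stray NEWLINE is simply skipped
        elif tokens[i : i + 2] == ["STMT", "NEWLINE"]:
            i += 2  # one statement, terminated by its NEWLINE
        else:
            return False
    return i == len(tokens) - 1  # ENDMARKER must be the last token


def accepts_strict(tokens):
    """Model of a rule that allows only statements, each ending in NEWLINE."""
    i = 0
    while tokens[i : i + 2] == ["STMT", "NEWLINE"]:
        i += 2
    return tokens[i:] == ["ENDMARKER"]


if __name__ == "__main__":
    for name, stream in (("GOOD", GOOD), ("DOUBLED", DOUBLED)):
        print(name, accepts_lenient(stream), accepts_strict(stream))
```

Both acceptors take `GOOD`, but only the lenient one takes `DOUBLED`, which is the shape of token stream the tokenizer produces for the lone-backslash line.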
msg370812 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-06 04:18
Sure looks like a tokenizer issue to me. For example this is broken in both versions:

pass
    \

pass

It complains about an unexpected indent, but it should really be considered a blank line broken in two -- a backslash is supposed to just erase itself and the following newline.

https://docs.python.org/3/reference/lexical_analysis.html#explicit-line-joining
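
As a rough illustration of "erase itself and the following newline", the sketch below applies that rule textually to the example above. The function name `join_explicit_lines` is hypothetical, and the approach is deliberately naive: it assumes no backslash-newline pair occurs inside a string literal or comment, where the real tokenizer behaves differently.

```python
def join_explicit_lines(source: str) -> str:
    """Naively delete each backslash that is immediately followed by a newline.

    Assumption: the source has no such pair inside a string literal or a
    comment, so a plain textual replacement is good enough for this demo.
    """
    return source.replace("\\\n", "")


# The example from this message: an indented lone backslash, then an empty line.
EXAMPLE = "pass\n    \\\n\npass\n"

joined = join_explicit_lines(EXAMPLE)
print(repr(joined))  # 'pass\n    \npass\n'

# After joining, line 2 holds only whitespace, so it is an ordinary blank line
# and the result compiles without an indentation error.
compile(joined, "<joined>", "exec")
```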
msg371056 - Author: Adam Williamson (adamwill) Date: 2020-06-08 23:43
I'm not the best person to ask what I'd "consider" to be a bug or not, to be honest. I'm just a Fedora packaging guy trying to make our packages build with Python 3.9 :) If this is still an important question, I'd suggest asking the folks from the Black issue and PR I linked to; that's the "real world" case, if any.
msg371058 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-08 23:47
To be clear, I consider it a bug.
msg371254 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-06-10 23:56
New changeset 896f4cf63f9ab93e30572d879a5719d5aa2499fb by Lysandros Nikolaou in branch 'master':
bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
https://github.com/python/cpython/commit/896f4cf63f9ab93e30572d879a5719d5aa2499fb
msg371255 - Author: miss-islington (miss-islington) Date: 2020-06-11 00:14
New changeset e3ce3bba9277a7c4cfde5aaf6269b6c68f334176 by Miss Islington (bot) in branch '3.9':
bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
https://github.com/python/cpython/commit/e3ce3bba9277a7c4cfde5aaf6269b6c68f334176
History
Date                 User              Action  Args
2020-06-11 00:14:24  miss-islington    set     messages: + msg371255
2020-06-10 23:56:48  pablogsal         set     status: open -> closed
                                               resolution: fixed
                                               stage: patch review -> resolved
2020-06-10 23:56:22  miss-islington    set     nosy: + miss-islington
                                               pull_requests: + pull_request19991
2020-06-10 23:56:12  pablogsal         set     messages: + msg371254
2020-06-10 18:39:23  brett.cannon      set     nosy: - brett.cannon
2020-06-09 23:51:27  lys.nikolaou      set     keywords: + patch
                                               stage: patch review
                                               pull_requests: + pull_request19967
2020-06-08 23:47:04  gvanrossum        set     messages: + msg371058
2020-06-08 23:43:44  adamwill          set     messages: + msg371056
2020-06-06 04:18:43  gvanrossum        set     messages: + msg370812
2020-06-06 01:23:28  lys.nikolaou      set     messages: + msg370809
2020-06-03 06:23:54  pablogsal         set     nosy: + gvanrossum, lys.nikolaou
2020-06-02 19:59:22  christian.heimes  set     nosy: + christian.heimes
2020-06-02 19:58:09  christian.heimes  set     nosy: + brett.cannon, benjamin.peterson, yselivanov, pablogsal
                                               versions: + Python 3.10
2020-06-02 19:03:34  adamwill          create