Title: New parser considers empty line following a backslash to be a syntax error, old parser didn't
msg370618 - (view) Author: Adam Williamson (adamwill) Date: 2020-06-02 19:03
While debugging issues with the black test suite in Python 3.9, I found one which black upstream says is a CPython issue, so I'm filing it here.

Reproduction is very easy. Just use this four-line tester:

    print("hello, world")
    \

    print("hello, world 2")

with that saved as ``, check the results:

    <mock-chroot> sh-5.0# PYTHONOLDPARSER=1 python3
    hello, world
    hello, world 2
    <mock-chroot> sh-5.0# python3
      File "/builddir/build/BUILD/black-19.10b0/", line 3
    SyntaxError: invalid syntax
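
The same check can be sketched without a file on disk by handing the source to the built-in `compile()`. This is only an illustration, assuming the four-line tester above; the names `SOURCE` and `compiles_cleanly` are made up for this sketch and do not come from black or CPython.

```python
# Hypothetical helper names; only compile() and SyntaxError are real built-ins.
SOURCE = (
    'print("hello, world")\n'
    '\\\n'   # a line holding only the continuation character
    '\n'     # the empty line that follows it
    'print("hello, world 2")\n'
)


def compiles_cleanly(source):
    """Return True if `source` compiles in exec mode, False on SyntaxError."""
    try:
        compile(source, "<backslash-test>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(compiles_cleanly(SOURCE))
```

An interpreter affected by this report would be expected to print False here, while the old parser, going by the output above, accepts the same input.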

The reason black has this test (well, a similar test - in black's test, the file *starts* with the backslash then the empty line, but the result is the same) is covered in and .
msg370809 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2020-06-06 01:23
This is limited to cases where the line continuation character is on an otherwise empty line. For example this works correctly:

$ cat
print("hello world")
print("hello world 2") \

print("hello world 3")
$ ./python.exe
hello world
hello world 2
hello world 3

The actual problem is at the tokenizer level, where a line with only a continuation character is not considered an empty line and thus two NEWLINE tokens get emitted, one after the other. The old parser was somehow working around this, probably by having this in the grammar:

file_input: (NEWLINE | stmt)* ENDMARKER

The PEG parser OTOH does not allow this.
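
To look at the token stream for this kind of input, the standard library's `tokenize` module can be used. Note the assumption: `tokenize` has its own implementation history and is not guaranteed to match the C tokenizer discussed here, so the token names it reports may differ between Python versions. The source string and the `token_names` helper are made up for this sketch.

```python
import io
import tokenize

# Assumed input: a statement, a line with only a backslash, an empty line,
# then another statement.
SOURCE = 'print("hello world")\n\\\n\nprint("hello world 2")\n'


def token_names(source):
    """Return the token type names that tokenize reports for `source`."""
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
    ]


if __name__ == "__main__":
    for name in token_names(SOURCE):
        print(name)
```

Two NEWLINE entries in a row, with no statement tokens between them, would correspond to the situation described above; an NL entry marks a line that the tokenizer treated as blank.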

The question now is, is it reasonable to change the tokenizer to consider a lone backslash an empty line? Do you also consider this a bug? Or should we change the new parser?
msg370812 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-06 04:18
Sure looks like a tokenizer issue to me. For example this is broken in both versions:



It complains about an unexpected indent, but it should really be considered a blank line broken in two -- a backslash is supposed to just erase itself and the following newline.
msg371056 - (view) Author: Adam Williamson (adamwill) Date: 2020-06-08 23:43
I'm not the best person to ask what I'd "consider" to be a bug or not, to be honest. I'm just a Fedora packaging guy trying to make our packages build with Python 3.9 :) If this is still an important question, I'd suggest asking the folks from the Black issue and PR I linked to, that's the "real world" case if any.
msg371058 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-08 23:47
To be clear, I consider it a bug.
msg371254 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-06-10 23:56
New changeset 896f4cf63f9ab93e30572d879a5719d5aa2499fb by Lysandros Nikolaou in branch 'master':
bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
msg371255 - (view) Author: miss-islington (miss-islington) Date: 2020-06-11 00:14
New changeset e3ce3bba9277a7c4cfde5aaf6269b6c68f334176 by Miss Islington (bot) in branch '3.9':
bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
