New parser considers empty line following a backslash to be a syntax error, old parser didn't #85024

Closed
adamwill mannequin opened this issue Jun 2, 2020 · 7 comments
Labels
3.9 (only security fixes), 3.10 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)

Comments

@adamwill
Mannequin

adamwill mannequin commented Jun 2, 2020

BPO 40847
Nosy @gvanrossum, @tiran, @benjaminp, @1st1, @AdamWill, @lysnikolaou, @pablogsal, @miss-islington
PRs
  • bpo-40847: Consider a line with only a LINECONT a blank line #20769
  • [3.9] bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769) #20795
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2020-06-10.23:56:48.020>
    created_at = <Date 2020-06-02.19:03:34.691>
    labels = ['interpreter-core', 'type-bug', '3.9', '3.10']
    title = "New parser considers empty line following a backslash to be a syntax error, old parser didn't"
    updated_at = <Date 2020-06-11.00:14:24.272>
    user = 'https://github.com/adamwill'

    bugs.python.org fields:

    activity = <Date 2020-06-11.00:14:24.272>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-06-10.23:56:48.020>
    closer = 'pablogsal'
    components = ['Interpreter Core']
    creation = <Date 2020-06-02.19:03:34.691>
    creator = 'adamwill'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40847
    keywords = ['patch']
    message_count = 7.0
    messages = ['370618', '370809', '370812', '371056', '371058', '371254', '371255']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'christian.heimes', 'benjamin.peterson', 'yselivanov', 'adamwill', 'lys.nikolaou', 'pablogsal', 'miss-islington']
    pr_nums = ['20769', '20795']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40847'
    versions = ['Python 3.9', 'Python 3.10']

    @adamwill
    Mannequin Author

    adamwill mannequin commented Jun 2, 2020

    While debugging issues with the black test suite in Python 3.9, I found one which black upstream says is a CPython issue, so I'm filing it here.

    Reproduction is very easy. Just use this four-line tester:

        print("hello, world")
        \
    
        print("hello, world 2")
    
    With that saved as `test.py`, check the results:
    <mock-chroot> sh-5.0# PYTHONOLDPARSER=1 python3 test.py
    hello, world
    hello, world 2
    <mock-chroot> sh-5.0# python3 test.py
      File "/builddir/build/BUILD/black-19.10b0/test.py", line 3
        
        ^
    SyntaxError: invalid syntax
    

    The reason black has this test (well, a similar test - in black's test, the file *starts* with the backslash followed by the empty line, but the result is the same) is covered in psf/black#922 and psf/black#948.
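
The same check can be run without saving a separate file. This is a minimal sketch that assumes only the built-in `compile()`; the helper name `compiles` is hypothetical and not part of the report. On an interpreter with the bug described here it should report a `SyntaxError`, while the old parser or a fixed interpreter should accept the source.

```python
# The four-line tester from the report, embedded as a string:
# a statement, a line holding only a backslash, an empty line, a statement.
SOURCE = 'print("hello, world")\n\\\n\nprint("hello, world 2")\n'


def compiles(source: str) -> bool:
    """Return True if the running interpreter accepts `source` as a module."""
    try:
        compile(source, "test.py", "exec")
    except SyntaxError as exc:
        # Report where the parser gave up, similar to the traceback above.
        print(f"SyntaxError at line {exc.lineno}: {exc.msg}")
        return False
    return True


if __name__ == "__main__":
    print("accepted" if compiles(SOURCE) else "rejected")
```

Running it under each interpreter you care about gives a quick yes/no answer without touching the black test suite.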

    @adamwill adamwill mannequin added the 3.9, interpreter-core, and type-bug labels Jun 2, 2020
    @tiran tiran added the 3.10 label Jun 2, 2020
    @lysnikolaou
    Contributor

    This is limited to cases where the line continuation character is on an otherwise empty line. For example, this works correctly:

    $ cat t.py
    print("hello world")
    print("hello world 2") \

    print("hello world 3")
    $ ./python.exe t.py
    hello world
    hello world 2
    hello world 3

    The actual problem is at the tokenizer level, where a line with only a continuation character is not considered an empty line and thus two NEWLINE tokens get emitted, one after the other. The old parser was somehow working around this, probably by having this in the grammar:

    file_input: (NEWLINE | stmt)* ENDMARKER

    The PEG parser OTOH does not allow this.

    The question now is, is it reasonable to change the tokenizer to consider a lone backslash an empty line? Do you also consider this a bug? Or should we change the new parser?
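
To look at the token stream for the reproducer, the standard-library `tokenize` module can be used. This is only a sketch: before Python 3.12, `tokenize` is a separate pure-Python implementation rather than the C tokenizer discussed above, so its output may differ from what the parser sees and from one Python version to the next. The helper names `token_names` and `has_adjacent_newlines` are hypothetical.

```python
import io
import tokenize
from typing import List

# Same reproducer as in the original report.
SOURCE = 'print("hello, world")\n\\\n\nprint("hello, world 2")\n'


def token_names(source: str) -> List[str]:
    """Return the token type names that `tokenize` produces for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


def has_adjacent_newlines(names: List[str]) -> bool:
    """Return True if two NEWLINE tokens appear back to back."""
    return any(a == b == "NEWLINE" for a, b in zip(names, names[1:]))


if __name__ == "__main__":
    names = token_names(SOURCE)
    print(names)
    print("adjacent NEWLINE tokens:", has_adjacent_newlines(names))
```

A blank line is normally reported as `NL` rather than `NEWLINE`, so two adjacent `NEWLINE` entries in the output would correspond to the situation described above.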

    @gvanrossum
    Member

    Sure looks like a tokenizer issue to me. For example, this is broken in both versions:

    pass
    \

    pass

    It complains about an unexpected indent, but it should really be considered a blank line broken in two -- a backslash is supposed to just erase itself and the following newline.

    https://docs.python.org/3/reference/lexical_analysis.html#explicit-line-joining
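
The "erase itself and the following newline" rule can be illustrated with a deliberately naive sketch. It is not how CPython implements line joining: it ignores string literals and comments, where a backslash means something else. The function name `join_explicit_lines` is hypothetical.

```python
import re


def join_explicit_lines(source: str) -> str:
    """Remove every backslash that is immediately followed by a newline.

    Naive on purpose: backslashes inside string literals or comments are
    not treated specially, which a real tokenizer has to do.
    """
    return re.sub(r"\\\r?\n", "", source)


# A statement, a line holding only a backslash, an empty line, a statement.
SOURCE = "pass\n\\\n\npass\n"

if __name__ == "__main__":
    print(repr(join_explicit_lines(SOURCE)))  # 'pass\n\npass\n'
```

After joining, the backslash line and the empty line collapse into a single blank line between the two statements, which is the reading argued for above.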

    @adamwill
    Mannequin Author

    adamwill mannequin commented Jun 8, 2020

    I'm not the best person to ask what I'd "consider" to be a bug or not, to be honest. I'm just a Fedora packaging guy trying to make our packages build with Python 3.9 :) If this is still an important question, I'd suggest asking the folks from the Black issue and PR I linked to, that's the "real world" case if any.

    @gvanrossum
    Member

    To be clear, I consider it a bug.

    @pablogsal
    Member

    New changeset 896f4cf by Lysandros Nikolaou in branch 'master':
    bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
    896f4cf

    @miss-islington
    Contributor

    New changeset e3ce3bb by Miss Islington (bot) in branch '3.9':
    bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
    e3ce3bb

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022