Suggestion for better syntax errors in tokenizer errors #88483

Closed
wyz23x2 mannequin opened this issue Jun 5, 2021 · 7 comments
Labels
3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@wyz23x2
Mannequin

wyz23x2 mannequin commented Jun 5, 2021

BPO 44317
Nosy @serhiy-storchaka, @lysnikolaou, @pablogsal, @miss-islington, @wyz23x2
PRs
  • bpo-44317: Improve tokenizer errors with more informative locations #26555
  • [3.10] bpo-44317: Improve tokenizer errors with more informative locations (GH-26555) #27079
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-07-10.00:47:54.504>
    created_at = <Date 2021-06-05.11:01:10.180>
    labels = ['interpreter-core', 'type-feature', '3.10', '3.11']
    title = 'Suggestion for better syntax errors in tokenizer errors'
    updated_at = <Date 2021-07-10.00:47:54.504>
    user = 'https://github.com/wyz23x2'

    bugs.python.org fields:

    activity = <Date 2021-07-10.00:47:54.504>
    actor = 'pablogsal'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-07-10.00:47:54.504>
    closer = 'pablogsal'
    components = ['Parser']
    creation = <Date 2021-06-05.11:01:10.180>
    creator = 'wyz23x2'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44317
    keywords = ['patch']
    message_count = 7.0
    messages = ['395161', '395179', '395182', '395188', '395250', '397233', '397234']
    nosy_count = 5.0
    nosy_names = ['serhiy.storchaka', 'lys.nikolaou', 'pablogsal', 'miss-islington', 'wyz23x2']
    pr_nums = ['26555', '27079']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue44317'
    versions = ['Python 3.10', 'Python 3.11']

    @wyz23x2
    Mannequin Author

    wyz23x2 mannequin commented Jun 5, 2021

    Python 3.10.0b2 (tags/v3.10.0b2:3173141, Jun  1 2021, 09:05:29) [MSC v.1928 64 bit (AMD64)] on win32 
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 0777
      File "<stdin>", line 1
        0777
           ^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
    >>> 000123
      File "<stdin>", line 1
        000123
             ^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

    The ^ is placed below the last digit.
    However, this is misleading: the message talks about "leading zeros" and a "prefix", both of which sit at the start of the literal. So I would expect this:

    >>> 0777
      File "<stdin>", line 1
        0777
        ^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
    >>> 000123
      File "<stdin>", line 1
        000123
        ^^^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

    Opinions?
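A minimal sketch of the caret placement proposed above, assuming the marker should span exactly the run of leading zeros. The helper name `leading_zero_marker` is hypothetical and not part of CPython; it only illustrates the expected output.

```python
def leading_zero_marker(literal: str) -> str:
    """Return a caret line underlining the leading zeros of a decimal literal."""
    # Count the zeros that come before the first non-zero character.
    zeros = len(literal) - len(literal.lstrip("0"))
    # Underline at least one column so the marker is never empty.
    return "^" * max(zeros, 1)


for literal in ("0777", "000123"):
    print(literal)
    print(leading_zero_marker(literal))
```

For `0777` this yields a single `^`, and for `000123` it yields `^^^`, matching the two expected transcripts.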

    @wyz23x2 wyz23x2 mannequin added type-bug An unexpected behavior, bug, or error 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 5, 2021
    @wyz23x2
    Mannequin Author

    wyz23x2 mannequin commented Jun 5, 2021

    Another 2 problems:
    1.
    >>> 0b1112
      File "<stdin>", line 1
        0b1112
             ^
    SyntaxError: invalid digit '2' in binary literal
    >>> 0o5780
      File "<stdin>", line 1
        0o5780
            ^
    SyntaxError: invalid digit '8' in octal literal
    But:
    >>> 0x2fag
      File "<stdin>", line 1
        0x2fag
        ^^^^^^
    SyntaxError: invalid syntax. Perhaps you forgot a comma?
    >>> 
    Is this expected?
    
    2.
    >>> 0o91
      File "<stdin>", line 1
        0o91
         ^
    SyntaxError: invalid digit '9' in octal literal
    >>> 0b21
      File "<stdin>", line 1
        0b21
         ^
    SyntaxError: invalid digit '2' in binary literal

    The ^ is misplaced again, even though it is placed correctly in, say, the 0b1112 example above.
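A rough sketch of where the caret would be expected in these cases, assuming it should sit under the first digit that is invalid for the literal's base. `VALID_DIGITS` and `first_invalid_digit` are hypothetical names for illustration only; this is a simplification and not how CPython's tokenizer is written.

```python
# Digits accepted after each prefix letter (simplified; assumption for this sketch).
VALID_DIGITS = {
    "b": "01",
    "o": "01234567",
    "x": "0123456789abcdefABCDEF",
}


def first_invalid_digit(literal: str):
    """Return the 0-based index of the first invalid digit, or None if all are valid.

    Assumes `literal` starts with a 0b, 0o or 0x prefix.
    """
    allowed = VALID_DIGITS[literal[1].lower()] + "_"
    for index, char in enumerate(literal[2:], start=2):
        if char not in allowed:
            return index
    return None


for literal in ("0b1112", "0o5780", "0o91", "0b21"):
    index = first_invalid_digit(literal)
    print(literal)
    print(" " * index + "^")
```

Under this assumption the caret for `0o91` and `0b21` lands in the third column, under the offending digit, rather than under the prefix letter.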

    @wyz23x2 wyz23x2 mannequin changed the title Misleading mark of octal SyntaxErrors Problems of int literal SyntaxErrors Jun 5, 2021
    @pablogsal
    Member

    > Is this expected?

    Yes, it is an edge case: Python identifies two adjacent tokens, as in the following example, except that in your case there is no space between them:

    >>> 3 4
      File "<stdin>", line 1
        3 4
        ^^^
    SyntaxError: invalid syntax. Perhaps you forgot a comma?

    I honestly don't share your concern that these things are "misleading". The caret points to the incorrect token, 0777. Tokenizer errors always point at the end of the token (we have not yet implemented ranged errors for the tokenizer).

    This is true in all the cases you present.

    @pablogsal pablogsal changed the title Problems of int literal SyntaxErrors Suggestion for better syntax errors in tokenizer errors Jun 5, 2021
    @pablogsal
    Member

    PR 26555 makes some improvements to your examples:

    >>> 0777
      File "<stdin>", line 1
        0777
        ^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
    >>> 000007777
      File "<stdin>", line 1
        000007777
        ^^^^^
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
    >>> 0b1112
      File "<stdin>", line 1
        0b1112
             ^
    SyntaxError: invalid digit '2' in binary literal
    >>> 0o91
      File "<stdin>", line 1
        0o91
          ^
    SyntaxError: invalid digit '9' in octal literal
    >>> 0b21
      File "<stdin>", line 1
        0b21
          ^
    SyntaxError: invalid digit '2' in binary literal
    >>>
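To check locations like these without reading tracebacks by eye, the `offset` attribute of `SyntaxError` (and `end_offset`, available from Python 3.10) can be inspected directly. The sketch below is an approximation of how a marker line could be derived from those attributes, not the traceback module's actual logic; `caret_line` is a hypothetical helper, and the reported positions vary between Python versions.

```python
def caret_line(source: str):
    """Compile `source`; return (message, marker line), or None if it is valid."""
    try:
        compile(source, "<stdin>", "exec")
    except SyntaxError as exc:
        # `offset` is a 1-based column; convert it to a 0-based start index.
        start = max((exc.offset or 1) - 1, 0)
        # `end_offset` may be missing or unset; fall back to a single caret.
        end_offset = getattr(exc, "end_offset", None) or 0
        width = max(end_offset - 1 - start, 1)
        return exc.msg, " " * start + "^" * width
    return None


for literal in ("0777", "000007777", "0b1112", "0o91", "0b21"):
    message, marker = caret_line(literal)
    print(literal)
    print(marker)
    print("SyntaxError:", message)
```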

    @wyz23x2 wyz23x2 mannequin added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Jun 6, 2021
    @serhiy-storchaka
    Member

    See also bpo-43833.

    @pablogsal
    Member

    New changeset f24777c by Pablo Galindo Salgado in branch 'main':
    bpo-44317: Improve tokenizer errors with more informative locations (GH-26555)
    f24777c

    @pablogsal
    Member

    New changeset 2a722d4 by Miss Islington (bot) in branch '3.10':
    bpo-44317: Improve tokenizer errors with more informative locations (GH-26555) (GH-27079)
    2a722d4

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022