classification
Title: Suggestion for better syntax errors in tokenizer errors
Type: enhancement
Stage: patch review
Components: Parser
Versions: Python 3.11, Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: lys.nikolaou, pablogsal, serhiy.storchaka, wyz23x2
Priority: normal
Keywords: patch

Created on 2021-06-05 11:01 by wyz23x2, last changed 2021-06-07 07:33 by serhiy.storchaka.

Pull Requests
PR 26555 (status: open), linked by pablogsal on 2021-06-05 23:29
Messages (5)
msg395161 - Author: wyz23x2 (wyz23x2) Date: 2021-06-05 11:01
Python 3.10.0b2 (tags/v3.10.0b2:3173141, Jun  1 2021, 09:05:29) [MSC v.1928 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information.
>>> 0777
  File "<stdin>", line 1
    0777
       ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000123
  File "<stdin>", line 1
    000123
         ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

The ^ is placed below the last digit.
However, this is misleading: the message is about "leading zeros" and a missing "prefix", and both are at the start of the literal, not the end. So I would expect this:

>>> 0777
  File "<stdin>", line 1
    0777
    ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000123
  File "<stdin>", line 1
    000123
    ^^^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

Opinions?
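
The caret position in these tracebacks comes from the offset attributes on the SyntaxError. A minimal sketch for checking the reported position programmatically rather than by eye; syntax_error_location is a hypothetical helper name, and the offsets it returns depend on the Python version in use:

```python
def syntax_error_location(source):
    """Return (msg, offset, end_offset) for invalid source, or None if it compiles."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        # end_offset only exists on Python 3.10+, hence the getattr fallback.
        return exc.msg, exc.offset, getattr(exc, "end_offset", None)
    return None


for literal in ("0777", "000123"):
    # offset is 1-based; the exact values differ between Python versions.
    print(literal, syntax_error_location(literal))
```

Comparing the printed offsets across interpreter builds shows whether the caret would land on the leading zeros or on the last digit.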
msg395179 - Author: wyz23x2 (wyz23x2) Date: 2021-06-05 18:50
Two more problems:
1.
>>> 0b1112
  File "<stdin>", line 1
    0b1112
         ^
SyntaxError: invalid digit '2' in binary literal
>>> 0o5780
  File "<stdin>", line 1
    0o5780
        ^
SyntaxError: invalid digit '8' in octal literal
But:
>>> 0x2fag
  File "<stdin>", line 1
    0x2fag
    ^^^^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?
>>> 
Is this expected?

2.
>>> 0o91
  File "<stdin>", line 1
    0o91
     ^
SyntaxError: invalid digit '9' in octal literal
>>> 0b21
  File "<stdin>", line 1
    0b21
     ^
SyntaxError: invalid digit '2' in binary literal

The ^ is misplaced again: it points at the prefix letter instead of the invalid digit, even though, say, the 0b1112 example above works.
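
To make the expected caret column concrete, here is a rough sketch that finds the index of the first out-of-range decimal digit in a prefixed literal. VALID_DIGITS and first_invalid_digit are hypothetical names for illustration only; this is not how the CPython tokenizer is implemented.

```python
# Decimal digits accepted after each base prefix (illustrative table).
VALID_DIGITS = {"b": "01", "o": "01234567", "x": "0123456789"}


def first_invalid_digit(literal):
    """Return the index of the first decimal digit not allowed in this base, or None."""
    allowed = VALID_DIGITS[literal[1].lower()]
    for index, char in enumerate(literal[2:], start=2):
        # Only decimal digits are checked; letters such as 'g' are ignored here.
        if char in "0123456789" and char not in allowed:
            return index
    return None


for literal in ("0b1112", "0o5780", "0o91", "0b21"):
    print(literal, first_invalid_digit(literal))
```

For 0o91 and 0b21 the index is 2, one column to the right of where the ^ appears in the output above.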
msg395182 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-06-05 19:46
> Is this expected?

Yes, it is an edge case: Python identifies two adjacent tokens, exactly as in the following example, except that here there is no space between them:

>>> 3 4
  File "<stdin>", line 1
    3 4
    ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?


I honestly don't share your concerns that these things are "misleading". The caret is pointing to the incorrect token, 0777. Tokenizer errors always point at the end of the token (we still have not implemented ranged errors for the tokenizer).

This is true in all the cases you present.
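
The two-token reading of 0x2fag can be illustrated with a simplified sketch. The regular expression below is an assumed stand-in for hex-literal matching and split_hex_literal is a hypothetical helper; it does not call the real tokenizer.

```python
import re

# Simplified hex-literal pattern: prefix, then hex digits with optional
# single underscores between them.
HEX_LITERAL = re.compile(r"0[xX](?:_?[0-9a-fA-F])+")


def split_hex_literal(text):
    """Split text into (longest hex literal at the start, remaining text)."""
    match = HEX_LITERAL.match(text)
    if match is None:
        return None
    return match.group(), text[match.end():]


print(split_hex_literal("0x2fag"))  # ('0x2fa', 'g')
```

The literal ends at the last valid hex digit, so the trailing g is left over as a separate name, much like the 4 in 3 4.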
msg395188 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-06-05 23:30
PR 26555 makes some improvements to the errors in your examples:

>>> 0777
  File "<stdin>", line 1
    0777
    ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000007777
  File "<stdin>", line 1
    000007777
    ^^^^^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 0b1112
  File "<stdin>", line 1
    0b1112
         ^
SyntaxError: invalid digit '2' in binary literal
>>> 0o91
  File "<stdin>", line 1
    0o91
      ^
SyntaxError: invalid digit '9' in octal literal
>>> 0b21
  File "<stdin>", line 1
    0b21
      ^
SyntaxError: invalid digit '2' in binary literal
>>>
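
As a rough illustration of the ranged markers shown above for the leading-zeros case, the following sketch computes the span of leading zeros and renders the caret line. leading_zero_span and render_carets are hypothetical names and are not taken from PR 26555.

```python
def leading_zero_span(literal):
    """Return (start, end) columns covering the run of leading zeros."""
    stripped = literal.lstrip("0")
    return 0, len(literal) - len(stripped)


def render_carets(line, start, end, indent=4):
    """Render the source line with ^ markers under columns start..end-1."""
    pad = " " * indent
    width = max(1, end - start)  # always show at least one caret
    return f"{pad}{line}\n{pad}{' ' * start}{'^' * width}"


for literal in ("0777", "000007777"):
    print(render_carets(literal, *leading_zero_span(literal)))
```

This prints one caret under 0777 and five under 000007777, matching the layout in the output above.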
msg395250 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-06-07 07:33
See also issue43833.
History
Date User Action Args
2021-06-07 07:33:49 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg395250
2021-06-06 10:51:36 wyz23x2 set type: behavior -> enhancement
2021-06-05 23:30:31 pablogsal set messages: + msg395188
2021-06-05 23:29:33 pablogsal set keywords: + patch
stage: patch review
pull_requests: + pull_request25142
2021-06-05 19:46:22 pablogsal set title: Problems of int literal SyntaxErrors -> Suggestion for better syntax errors in tokenizer errors
2021-06-05 19:46:04 pablogsal set messages: + msg395182
2021-06-05 18:50:44 wyz23x2 set messages: + msg395179
title: Misleading mark of octal SyntaxErrors -> Problems of int literal SyntaxErrors
2021-06-05 11:01:10 wyz23x2 create