classification
Title: Suggestion for better syntax errors in tokenizer errors
Type: enhancement
Stage: resolved
Components: Parser
Versions: Python 3.11, Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: lys.nikolaou, miss-islington, pablogsal, serhiy.storchaka, wyz23x2
Priority: normal
Keywords: patch

Created on 2021-06-05 11:01 by wyz23x2, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 26555 merged pablogsal, 2021-06-05 23:29
PR 27079 merged miss-islington, 2021-07-10 00:29
Messages (7)
msg395161 - Author: wyz23x2 (wyz23x2) Date: 2021-06-05 11:01
Python 3.10.0b2 (tags/v3.10.0b2:3173141, Jun  1 2021, 09:05:29) [MSC v.1928 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information.
>>> 0777
  File "<stdin>", line 1
    0777
       ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000123
  File "<stdin>", line 1
    000123
         ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

The ^ is placed below the last digit.
However, this is misleading: the error message is about "leading zeros" and the "prefix", so I would expect the caret to mark the leading zeros instead:

>>> 0777
  File "<stdin>", line 1
    0777
    ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000123
  File "<stdin>", line 1
    000123
    ^^^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

Opinions?
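
For anyone who wants to check the reported location without reading carets by eye, here is a minimal sketch that compiles a literal and prints the position stored on the SyntaxError. The helper name syntax_error_location is hypothetical, and end_offset is read with getattr because it is assumed to exist only on Python 3.10 and later. The exact numbers depend on the interpreter version, so none are shown here.

```python
def syntax_error_location(source):
    """Return the location stored on the SyntaxError raised for source, or None."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError as exc:
        return {
            "msg": exc.msg,
            "offset": exc.offset,  # 1-based column where the caret starts
            # end_offset is assumed to be available only on Python 3.10+.
            "end_offset": getattr(exc, "end_offset", None),
        }
    return None  # the source compiled without a syntax error


for literal in ("0777", "000123"):
    print(literal, syntax_error_location(literal))
```

Comparing the printed offset with the position of the first leading zero shows whether the caret is attached to the start or the end of the literal on a given build.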
msg395179 - Author: wyz23x2 (wyz23x2) Date: 2021-06-05 18:50
Two more problems:
1.
>>> 0b1112
  File "<stdin>", line 1
    0b1112
         ^
SyntaxError: invalid digit '2' in binary literal
>>> 0o5780
  File "<stdin>", line 1
    0o5780
        ^
SyntaxError: invalid digit '8' in octal literal
But:
>>> 0x2fag
  File "<stdin>", line 1
    0x2fag
    ^^^^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?
>>> 
Is this expected?

2.
>>> 0o91
  File "<stdin>", line 1
    0o91
     ^
SyntaxError: invalid digit '9' in octal literal
>>> 0b21
  File "<stdin>", line 1
    0b21
     ^
SyntaxError: invalid digit '2' in binary literal

The ^ is misplaced again, even though the 0b1112 example above, for instance, is marked correctly.
msg395182 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-06-05 19:46
> Is this expected?

Yes, it is an edge case of Python identifying two tokens next to each other, just as in the following example, except that there is no space between them:

>>> 3 4
  File "<stdin>", line 1
    3 4
    ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?


I honestly don't share your concern that these things are "misleading". The caret points to the incorrect token, 0777. Tokenizer errors always point at the end of the token (we have not yet implemented ranged errors for the tokenizer).

This is true in all the cases you present.
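
The "two tokens together" case can be illustrated with the standard tokenize module. This is only a sketch using the `3 4` example from this message; the helper name number_tokens is hypothetical. It deliberately avoids `0x2fag`, because how that input is tokenized is assumed to differ between Python versions.

```python
import io
import tokenize


def number_tokens(source):
    """Return the text of every NUMBER token found in source."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NUMBER
    ]


# The tokenizer accepts "3 4" as two separate, individually valid tokens.
print(number_tokens("3 4\n"))

# Only the parser rejects the pair, so the error covers both tokens.
try:
    compile("3 4", "<demo>", "eval")
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)
```

Because each token is valid by itself, the error comes from the parser rather than the tokenizer, which is why the caret range spans the whole expression.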
msg395188 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-06-05 23:30
PR 26555 makes some improvements to your examples:

>>> 0777
  File "<stdin>", line 1
    0777
    ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 000007777
  File "<stdin>", line 1
    000007777
    ^^^^^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
>>> 0b1112
  File "<stdin>", line 1
    0b1112
         ^
SyntaxError: invalid digit '2' in binary literal
>>> 0o91
  File "<stdin>", line 1
    0o91
      ^
SyntaxError: invalid digit '9' in octal literal
>>> 0b21
  File "<stdin>", line 1
    0b21
      ^
SyntaxError: invalid digit '2' in binary literal
>>>
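
As a rough illustration of how a caret range like the ones above can be drawn from a start and an end column, here is a small sketch. The helper caret_line is hypothetical and is not the code CPython uses; it assumes a 1-based start column, an exclusive 1-based end column, and the four-space indent seen in the REPL output.

```python
def caret_line(start, end=None, indent=4):
    """Build the caret line for a 1-based start and exclusive 1-based end column."""
    # Fall back to a single caret when no usable end column is given.
    width = 1 if end is None or end <= start else end - start
    return " " * (indent + start - 1) + "^" * width


# Mark the five leading zeros of 000007777 (columns 1 to 5).
print("    000007777")
print(caret_line(1, 6))

# Mark only the invalid digit in 0o91 (column 3).
print("    0o91")
print(caret_line(3))
```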
msg395250 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-06-07 07:33
See also issue43833.
msg397233 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-07-10 00:29
New changeset f24777c2b329974b69d2a3bf5cfc37e0fcace36c by Pablo Galindo Salgado in branch 'main':
bpo-44317: Improve tokenizer errors with more informative locations (GH-26555)
https://github.com/python/cpython/commit/f24777c2b329974b69d2a3bf5cfc37e0fcace36c
msg397234 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2021-07-10 00:47
New changeset 2a722d4fab6a9656f3c03cfdaf6d1684277b8af5 by Miss Islington (bot) in branch '3.10':
bpo-44317: Improve tokenizer errors with more informative locations (GH-26555) (GH-27079)
https://github.com/python/cpython/commit/2a722d4fab6a9656f3c03cfdaf6d1684277b8af5
History
Date                 User              Action  Args
2022-04-11 14:59:46  admin             set     github: 88483
2021-07-10 00:47:54  pablogsal         set     status: open -> closed
                                               resolution: fixed
                                               stage: patch review -> resolved
2021-07-10 00:47:41  pablogsal         set     messages: + msg397234
2021-07-10 00:29:58  miss-islington    set     nosy: + miss-islington
                                               pull_requests: + pull_request25629
2021-07-10 00:29:45  pablogsal         set     messages: + msg397233
2021-06-07 07:33:49  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg395250
2021-06-06 10:51:36  wyz23x2           set     type: behavior -> enhancement
2021-06-05 23:30:31  pablogsal         set     messages: + msg395188
2021-06-05 23:29:33  pablogsal         set     keywords: + patch
                                               stage: patch review
                                               pull_requests: + pull_request25142
2021-06-05 19:46:22  pablogsal         set     title: Problems of int literal SyntaxErrors -> Suggestion for better syntax errors in tokenizer errors
2021-06-05 19:46:04  pablogsal         set     messages: + msg395182
2021-06-05 18:50:44  wyz23x2           set     messages: + msg395179
                                               title: Misleading mark of octal SyntaxErrors -> Problems of int literal SyntaxErrors
2021-06-05 11:01:10  wyz23x2           create