Improve error message when source code contains invisible control characters #89969

Closed
stevendaprano opened this issue Nov 15, 2021 · 3 comments
Labels
3.11 (bug and security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)

Comments

@stevendaprano
Member

BPO 45811
Nosy @terryjreedy, @aroberge, @stevendaprano, @pablogsal
PRs
  • bpo-45811: Improve error message when source code contains invisible control characters #29654
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2021-11-20.18:28:37.084>
    created_at = <Date 2021-11-15.23:52:37.287>
    labels = ['interpreter-core', 'type-feature', '3.11']
    title = 'Improve error message when source code contains invisible control characters'
    updated_at = <Date 2021-11-20.18:28:38.442>
    user = 'https://github.com/stevendaprano'

    bugs.python.org fields:

    activity = <Date 2021-11-20.18:28:38.442>
    actor = 'pablogsal'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-11-20.18:28:37.084>
    closer = 'pablogsal'
    components = ['Interpreter Core']
    creation = <Date 2021-11-15.23:52:37.287>
    creator = 'steven.daprano'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45811
    keywords = ['patch']
    message_count = 3.0
    messages = ['406379', '406630', '406682']
    nosy_count = 4.0
    nosy_names = ['terry.reedy', 'aroberge', 'steven.daprano', 'pablogsal']
    pr_nums = ['29654']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue45811'
    versions = ['Python 3.11']

    @stevendaprano
    Member Author

    Invisible control characters (aside from white space) are not permitted in source code, but the syntax error we get is confusing and lacks information:

    >>> s = 'print\x17("Hello")'
    >>> eval(s)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1
        print("Hello")
             ^
    SyntaxError: invalid syntax

    The caret points to an invisible character. The offending control character is visible neither in the traceback nor in the source code unless you use a hex editor. Copying and pasting the string from the traceback or from the source code may strip the control character (depending on the tools you use), making the problem even harder to track down.
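As a workaround until the message improves, the character can be located without a hex editor. The following is only a minimal sketch: `find_control_chars` is a hypothetical helper, not part of CPython, and the set of whitespace controls it tolerates (tab, newline, carriage return, form feed) is an assumption about what the tokenizer accepts.

```python
import unicodedata

# Assumption: these whitespace controls are acceptable in source text.
ALLOWED = {'\t', '\n', '\r', '\f'}

def find_control_chars(source):
    """Return (line, column, char) for each disallowed control character.

    Lines and columns are 1-based. Lines are split on '\n' only, because
    str.splitlines() also splits on some control characters.
    """
    hits = []
    for lineno, line in enumerate(source.split('\n'), start=1):
        for col, ch in enumerate(line, start=1):
            # Unicode category 'Cc' covers the C0 and C1 controls and DEL.
            if unicodedata.category(ch) == 'Cc' and ch not in ALLOWED:
                hits.append((lineno, col, ch))
    return hits

for lineno, col, ch in find_control_chars('print\x17("Hello")'):
    # repr() makes the invisible character visible as an escape sequence.
    print(f'line {lineno}, column {col}: {ch!r}')
```

Printing with `repr()` sidesteps the copy-and-paste problem, since the escape sequence survives tools that strip the raw character.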

    I suggest that the syntax error should state that the problem is an invisible control character, and display it as a standard human-readable code together with its hex code:

    SyntaxError: invisible control character ^W (0x17)

    Just in case the mapping between control characters and their human-readable form isn't obvious:

    def control(char):
        n = ord(char)
        if 0 <= n <= 0x1F:
            # C0 control codes
            return '^' + chr(ord('@')+n)
        elif n == 0x7F:
            # DEL
            return '^?'
        elif 0x80 <= n <= 0x9F:
            # C1 control codes
            return 'Esc+' + chr(ord('@')+n-0x80)
        else:
            raise ValueError('Not a control character.')

    https://en.wikipedia.org/wiki/C0_and_C1_control_codes
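To show how the mapping above would feed into the suggested message, here is a rough sketch. `describe_control` is a hypothetical name used for illustration only; it inlines the same three ranges and formats the text the way the proposal does, and it is not what any CPython release actually emits.

```python
def describe_control(char):
    """Build the message text proposed in this issue for one control character."""
    n = ord(char)
    if n <= 0x1F:
        # C0 control codes: ^@ through ^_
        label = '^' + chr(ord('@') + n)
    elif n == 0x7F:
        # DEL
        label = '^?'
    elif 0x80 <= n <= 0x9F:
        # C1 control codes: Esc+@ through Esc+_
        label = 'Esc+' + chr(ord('@') + n - 0x80)
    else:
        raise ValueError('Not a control character.')
    return f'invisible control character {label} (0x{n:02X})'

print(describe_control('\x17'))
```

For the character in the example at the top of this issue, this yields the message text suggested above.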

    @stevendaprano stevendaprano added 3.11 bug and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Nov 15, 2021
    @terryjreedy
    Member

    I agree.

    @pablogsal
    Member

    New changeset 81f4e11 by Pablo Galindo Salgado in branch 'main':
    bpo-45811: Improve error message when source code contains invisible control characters (GH-29654)
    81f4e11
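For anyone wanting to check what their own interpreter reports, a small sketch follows. `syntax_error_message` is a hypothetical helper; the exact wording of the message depends on the Python version, so the sketch only prints it and makes no claim about the text.

```python
def syntax_error_message(source):
    """Return the SyntaxError message for `source`, or None if it compiles."""
    try:
        # compile() only parses the expression; nothing is executed.
        compile(source, '<string>', 'eval')
    except SyntaxError as exc:
        return exc.msg
    return None

# The string from the original report, with an embedded 0x17 character.
print(syntax_error_message('print\x17("Hello")'))
```

Running the same snippet on an interpreter without this change and on one that includes it makes it easy to compare the two messages.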

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022