Nonexisting encoding specified in Tix.py #73109

Closed
native-api mannequin opened this issue Dec 9, 2016 · 8 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@native-api
Mannequin

native-api mannequin commented Dec 9, 2016

BPO 28923
Nosy @terryjreedy, @serhiy-storchaka, @native-api
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
Files
  • 105052.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/terryjreedy'
    closed_at = <Date 2016-12-22.05:08:07.913>
    created_at = <Date 2016-12-09.17:21:10.243>
    labels = ['type-bug', 'library']
    title = 'Nonexisting encoding specified in Tix.py'
    updated_at = <Date 2017-03-31.16:36:26.143>
    user = 'https://github.com/native-api'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:26.143>
    actor = 'dstufft'
    assignee = 'terry.reedy'
    closed = True
    closed_date = <Date 2016-12-22.05:08:07.913>
    closer = 'terry.reedy'
    components = ['Library (Lib)']
    creation = <Date 2016-12-09.17:21:10.243>
    creator = 'Ivan.Pozdeev'
    dependencies = []
    files = ['45819']
    hgrepos = []
    issue_num = 28923
    keywords = ['patch']
    message_count = 8.0
    messages = ['282791', '283064', '283108', '283132', '283133', '283135', '283809', '283812']
    nosy_count = 4.0
    nosy_names = ['terry.reedy', 'python-dev', 'serhiy.storchaka', 'Ivan.Pozdeev']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue28923'
    versions = ['Python 2.7']

    @native-api
    Mannequin Author

    native-api mannequin commented Dec 9, 2016

    $ head 'c:\Py\Lib\lib-tk\Tix.py' -n 1
    # -*-mode: python; fill-column: 75; tab-width: 8; coding: iso-latin-1-unix -*-

    There's no "iso-latin-1-unix" encoding in Python, so this declaration produces an error in some code analysis tools (I see it in PyScripter), as it should according to PEP 263.

    In 3.x, this was fixed in changeset d63344ba187888b6792ba8362a0dd09e06ed2f9a.

    @native-api native-api mannequin added build The build process and cross-build stdlib Python modules in the Lib dir labels Dec 9, 2016
    @berkerpeksag berkerpeksag added type-bug An unexpected behavior, bug, or error and removed build The build process and cross-build labels Dec 12, 2016
    @terryjreedy
    Member

    I am a little puzzled as to how a file rename changed the content, but the annotation history seems to show that. Anyway, ...

    When I load the file in IDLE 2.7, I get a warning. That surprises me a bit, as this is not a proper encoding declaration. IDLE's regex must be a bit loose.

    In 3.x, the file starts with

    # -*-mode: python; fill-column: 75; tab-width: 8 -*-

    # $Id$

    This is all ancient, obsolete junk specific to some editor. (The file itself now uses 4-space indents.) I think it should be removed from all current versions. As near as I can tell, there are no non-ASCII characters in the file.
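One way to verify the "no non-ASCII characters" observation is a short byte scan. This is a minimal Python 3 sketch; the helper name `non_ascii_lines` is hypothetical and the path you pass in is up to you.

```python
def non_ascii_lines(path):
    """Return 1-based numbers of lines that contain any byte above 0x7F."""
    # Read in binary mode so no decoding (and no coding cookie) is involved.
    with open(path, "rb") as f:
        return [n for n, line in enumerate(f, 1) if any(b > 0x7F for b in line)]
```

An empty list for Tix.py would mean the encoding declaration has no effect on how the file's contents are read.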

    @native-api
    Mannequin Author

    native-api mannequin commented Dec 13, 2016

    I'm more puzzled how no one has noticed this until now if it's supposed to produce an error upon compilation. (Well, it doesn't. I couldn't quite figure out how the encoding declaration is parsed, but it's clear the line _isn't_ matched as a regex like the docs say.)

    @terryjreedy
    Member

    I reread
    https://docs.python.org/27/reference/lexical_analysis.html#encoding-declarations
    A first or second line must be a comment matching "coding[=:]\s*([-\w.]+)" (which IDLE uses) and the captured name "must be recognized by Python".
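As a minimal sketch of that check (run under Python 3; the variable names are illustrative and not taken from IDLE's source), applying the quoted pattern to the first line of Tix.py shows which name a regex-based tool would capture:

```python
import re

# Pattern as quoted from the 2.7 language reference above.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")

# First line of Lib/lib-tk/Tix.py in 2.7, as reported in this issue.
first_line = "# -*-mode: python; fill-column: 75; tab-width: 8; coding: iso-latin-1-unix -*-"

match = CODING_RE.search(first_line)
declared = match.group(1) if match else None
print(declared)  # the name a tool would then have to validate
```

The pattern does match, so any tool that follows it with a plain codec lookup ends up asking for "iso-latin-1-unix".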

    I also did some experiments. Apparently, "iso-latin-1-unix" is recognized by Python. On Windows, from an IDLE editor,
    # coding: iso-latin-1-unix
    runs, while
    # coding: xiso-latin-1-unix
    raises, during the compile(..., 'file', 'exec') call:
    SyntaxError: unknown encoding: xiso-latin-1-unix

    Since codecs.lookup() returns the same error for both lines:
    LookupError: unknown encoding: iso-latin-1-unix
    compile() must be doing something other than simply calling codecs.lookup. I suspect it somehow recognizes 'iso', 'latin-1', and 'unix' as valid chunks of an encoding name. (The last might even be an obsolete legacy item.) Whatever it is, it is not obviously available to tools written in Python.

    Note that 'recognized as a legitimate encoding name' and 'available on a particular installation' are different concepts. I believe codecs.lookup implements the latter.
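The experiment can be approximated with the sketch below. It assumes Python 3, where the source has to be passed to compile() as bytes for the coding declaration to be honored; the original test was run from an IDLE editor on 2.7. The helper names `lookup_fails` and `compiles` are hypothetical.

```python
import codecs


def lookup_fails(name):
    """Return True if codecs.lookup() does not know this encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return True
    return False


def compiles(name):
    """Return True if compile() accepts source that declares this encoding."""
    source = ("# coding: %s\nx = 1\n" % name).encode("ascii")
    try:
        compile(source, "file", "exec")
    except SyntaxError:
        return False
    return True


for name in ("iso-latin-1-unix", "xiso-latin-1-unix"):
    print(name, "| lookup fails:", lookup_fails(name), "| compiles:", compiles(name))
```

The two names behave identically under codecs.lookup() but differently under compile(), which is the discrepancy described above.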

    @terryjreedy
    Member

    Serhiy, if you agree with the proposed removal, but want me to do it, I will.

    @serhiy-storchaka
    Member

    Yes, the CPython tokenizer recognizes any encoding name starting with "iso-latin-1-" as "iso-8859-1" (see get_normal_name() in Parser/tokenizer.c:228).
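For tools written in Python 3, the standard library's tokenize.detect_encoding() appears to apply the same normalization, so it can stand in for a bare codecs.lookup() call. A minimal sketch, where the wrapper name `detected_encoding` is hypothetical:

```python
import io
import tokenize


def detected_encoding(first_line):
    """Return the encoding tokenize.detect_encoding() reports for a source line."""
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(first_line).readline)
    return encoding


print(detected_encoding(b"# coding: iso-latin-1-unix\n"))
print(detected_encoding(b"# coding: latin-1\n"))
```

A name that is not normalized and is unknown to the codec registry, such as "xiso-latin-1-unix", makes detect_encoding() raise SyntaxError instead.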

    I agree that the coding cookie, or the whole line, can be removed from Tix.py. Please do that.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 22, 2016

    New changeset ef03aff3b195 by Terry Jan Reedy in branch '2.7':
    bpo-28923: Remove editor artifacts from Tix.py,
    https://hg.python.org/cpython/rev/ef03aff3b195

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 22, 2016

    New changeset eb8667196f93 by Terry Jan Reedy in branch '3.5':
    bpo-28923: Remove editor artifacts from Tix.py.
    https://hg.python.org/cpython/rev/eb8667196f93

    New changeset 4a82412a3c51 by Terry Jan Reedy in branch '3.6':
    bpo-28923: Remove editor artifacts from Tix.py,
    https://hg.python.org/cpython/rev/4a82412a3c51

    New changeset 41031fdc924a by Terry Jan Reedy in branch 'default':
    bpo-28923: Remove editor artifacts from Tix.py,
    https://hg.python.org/cpython/rev/41031fdc924a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022