classification
Title: Nonexisting encoding specified in Tix.py
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: Ivan.Pozdeev, python-dev, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2016-12-09 17:21 by Ivan.Pozdeev, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description
105052.patch Ivan.Pozdeev, 2016-12-09 17:21
Pull Requests
URL Status Linked
PR 552 closed dstufft, 2017-03-31 16:36
Messages (8)
msg282791 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2016-12-09 17:21
$ head 'c:\Py\Lib\lib-tk\Tix.py' -n 1
# -*-mode: python; fill-column: 75; tab-width: 8; coding: iso-latin-1-unix -*-

There's no "iso-latin-1-unix" encoding in Python, so this declaration produces an error in some code analysis tools (I see it in PyScripter), as it should according to PEP 263.

In 3.x, this was fixed in changeset d63344ba187888b6792ba8362a0dd09e06ed2f9a.
msg283064 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-12-12 23:05
I am a little puzzled as to how a file rename changed the content, but the annotation history seems to show that.  Anyway, ...

When I load the file in IDLE 2.7, I get a warning.  I am a bit surprised as this is not a proper encoding declaration.  IDLE's re must be a bit loose.

In 3.x, the file starts with

# -*-mode: python; fill-column: 75; tab-width: 8 -*-
#
# $Id$
#

This is all ancient, obsolete junk specific to some editor.  (The file itself does not use 4-space indents.) I think it should be removed from all current versions.  As near as I can tell, there are no non-ASCII chars in the file.
msg283108 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2016-12-13 14:28
I'm more puzzled how no one has noticed this until now if it's supposed to produce an error upon compilation. (Well, it doesn't. I couldn't quite figure out how the encoding declaration is parsed, but it's clear the line _isn't_ matched as a regex like the docs say.)
msg283132 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-12-13 19:05
I reread
https://docs.python.org/27/reference/lexical_analysis.html#encoding-declarations
A first or second line must be a comment matching "coding[=:]\s*([-\w.]+)" (which IDLE uses) and the captured name "must be recognized by Python".
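
To make the matching rule concrete, here is a minimal sketch that applies the quoted pattern to the old and new first lines of Tix.py. The helper name declared_encoding is hypothetical and only for illustration; it approximates the documented rule and is not how the CPython tokenizer is implemented.

```python
import re

# Pattern quoted from the encoding-declarations section of the reference.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(line):
    """Return the encoding name declared on a comment line, or None.

    Hypothetical helper: it only checks that the line is a comment and
    that the documented pattern matches somewhere in it.
    """
    if not line.lstrip().startswith("#"):
        return None
    match = CODING_RE.search(line)
    return match.group(1) if match else None


# First line of Tix.py as reported in this issue (2.7).
tix_line = "# -*-mode: python; fill-column: 75; tab-width: 8; coding: iso-latin-1-unix -*-"
# First line of the 3.x file, which has no coding cookie.
fixed_line = "# -*-mode: python; fill-column: 75; tab-width: 8 -*-"

print(declared_encoding(tix_line))
print(declared_encoding(fixed_line))
```

With this pattern the 2.7 line does yield a captured name, which is consistent with IDLE warning about it.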

I also did some experiments.  Apparently, "iso-latin-1-unix" is recognized by Python.  On Windows, from an IDLE editor,
  # coding: iso-latin-1-unix
runs, while 
  # coding: xiso-latin-1-unix
raises, during the compile(..., 'file', 'exec') call:
  SyntaxError: unknown encoding: xiso-latin-1-unix

Since codecs.lookup() returns the same error for both lines:
  LookupError: unknown encoding: iso-latin-1-unix
compile() must be doing something other than simply calling codecs.lookup.  I suspect it somehow recognizes 'iso', 'latin-1', and 'unix' as valid chunks of an encoding name.  (The last might even be an obsolete legacy item.)  Whatever it is, it is not obviously available to tools written in Python.

Note that 'recognized as a legitimate encoding name' and 'available on a particular installation' are different concepts. I believe codecs.lookup implements the latter.
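
The experiment above can be reproduced roughly as follows. This is a sketch that assumes Python 3 and passes the source to compile() as bytes so that the coding cookie is honored; the original experiments were run from an IDLE editor on Windows. The helper names codec_available and compiles_with_cookie are hypothetical.

```python
import codecs


def codec_available(name):
    """True if codecs.lookup() can find a codec under this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def compiles_with_cookie(name):
    """True if a source file declaring this encoding compiles.

    Assumption: the source is passed as bytes, so compile() has to
    process the coding cookie itself before decoding the text.
    """
    source = ("# coding: %s\nx = 1\n" % name).encode("ascii")
    try:
        compile(source, "file", "exec")
        return True
    except SyntaxError:
        return False


for name in ("iso-latin-1-unix", "xiso-latin-1-unix"):
    print(name, codec_available(name), compiles_with_cookie(name))
```

The two names should behave the same under codecs.lookup() but differently under compile(), which is the discrepancy described above.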
msg283133 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-12-13 19:06
Serhiy, if you agree with the proposed removal, but want me to do it, I will.
msg283135 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-13 19:27
Yes, the CPython tokenizer treats any encoding name starting with "iso-latin-1-" as "iso-8859-1" (see get_normal_name() in Parser/tokenizer.c:228).

I agree that the coding cookie, or the whole line, can be removed from Tix.py. Please do that.
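
For tools written in Python, a similar normalization appears to be reachable through tokenize.detect_encoding(). The sketch below assumes Python 3 (this function is not in the 2.7 tokenize module) and uses a hypothetical helper name, detect.

```python
import io
import tokenize


def detect(source_bytes):
    """Return the encoding that tokenize reports for a bytes source.

    Assumption: Python 3, where tokenize.detect_encoding() takes a
    readline callable and returns (encoding, lines_read).
    """
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# The cookie from the old Tix.py header.
print(detect(b"# coding: iso-latin-1-unix\nx = 1\n"))

# A name without the recognized prefix is rejected.
try:
    detect(b"# coding: xiso-latin-1-unix\nx = 1\n")
except SyntaxError as exc:
    print("rejected:", exc)
```

A code analysis tool could call this instead of codecs.lookup() on the raw cookie, so that names the tokenizer accepts are not flagged.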
msg283809 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-22 04:44
New changeset ef03aff3b195 by Terry Jan Reedy in branch '2.7':
Issue 28923: Remove editor artifacts from Tix.py,
https://hg.python.org/cpython/rev/ef03aff3b195
msg283812 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-22 05:04
New changeset eb8667196f93 by Terry Jan Reedy in branch '3.5':
Issue 28923: Remove editor artifacts from Tix.py.
https://hg.python.org/cpython/rev/eb8667196f93

New changeset 4a82412a3c51 by Terry Jan Reedy in branch '3.6':
Issue 28923: Remove editor artifacts from Tix.py,
https://hg.python.org/cpython/rev/4a82412a3c51

New changeset 41031fdc924a by Terry Jan Reedy in branch 'default':
Issue 28923: Remove editor artifacts from Tix.py,
https://hg.python.org/cpython/rev/41031fdc924a
History
Date User Action Args
2017-03-31 16:36:26 dstufft set pull_requests: + pull_request997
2016-12-22 05:08:07 terry.reedy set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-12-22 05:04:27 python-dev set messages: + msg283812
2016-12-22 04:44:06 python-dev set nosy: + python-dev
messages: + msg283809
2016-12-13 19:27:33 serhiy.storchaka set assignee: terry.reedy
messages: + msg283135
2016-12-13 19:06:32 terry.reedy set messages: + msg283133
2016-12-13 19:05:06 terry.reedy set messages: + msg283132
2016-12-13 14:28:36 Ivan.Pozdeev set messages: + msg283108
2016-12-12 23:05:58 terry.reedy set messages: + msg283064
2016-12-12 21:27:46 berker.peksag set nosy: + terry.reedy, serhiy.storchaka
type: compile error -> behavior
stage: patch review
2016-12-09 17:21:10 Ivan.Pozdeev create