Author terry.reedy
Recipients Paul.Bonser, armicron, benjamin.peterson, georg.brandl, kbk, loewis, meador.inge, roger.serwy, serhiy.storchaka, terry.reedy
Date 2013-09-16.20:41:15
Message-id <1379364075.93.0.0869178956685.issue18873@psf.upfronthosting.co.za>
In-reply-to
Content
One of the problems with encoding recognition is that the same logic is more or less reproduced in multiple places, so any fix needs to be applied in each of them. The files touched by detect_encoding_in_comments_only.patch:
Lib/idlelib/IOBinding.py
Lib/lib2to3/pgen2/tokenize.py
Lib/tokenize.py
Tools/scripts/findnocoding.py
Any fix for issues *18960 and *18961 may also need to be applied in multiple places.

If there is not one already, it would be nice to have just one Python-coded function in Lib/tokenize.py that the other Python code could import and use. (I was going to suggest exposing the function in tokenize.c, but I believe the point of tokenize.py is to not be dependent on CPython.)
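
As a rough sketch of what such reuse could look like: Python 3's tokenize module has a detect_encoding(readline) function, and other code could call it rather than reimplement the cookie and BOM logic. The wrapper name source_encoding below is hypothetical, made up for illustration; whether each of the files listed above could actually switch to this call is an assumption, not something checked here.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Hypothetical helper: return the encoding tokenize detects for source_bytes.

    Delegates to tokenize.detect_encoding, which looks for a UTF-8 BOM and a
    coding cookie in the first two lines, and defaults to UTF-8 otherwise.
    """
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


if __name__ == "__main__":
    print(source_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
    print(source_encoding(b"x = 1\n"))
```

With a single shared entry point like this, a fix to the detection rules would only need to be made once, in Lib/tokenize.py.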

I believe the Idle support for \r became obsolete when support for MacOS9 was dropped in 2.4. I notice that it is not part of io universal newline support.
History
Date User Action Args
2013-09-16 20:41:15 terry.reedy set recipients: + terry.reedy, loewis, georg.brandl, kbk, benjamin.peterson, roger.serwy, meador.inge, serhiy.storchaka, Paul.Bonser, armicron
2013-09-16 20:41:15 terry.reedy set messageid: <1379364075.93.0.0869178956685.issue18873@psf.upfronthosting.co.za>
2013-09-16 20:41:15 terry.reedy link issue18873 messages
2013-09-16 20:41:15 terry.reedy create