
Author vstinner
Recipients loewis, vstinner
Date 2008-10-02.21:49:10
Message-id <1222984154.8.0.04402572573.issue4008@psf.upfronthosting.co.za>
In-reply-to
Content
loewis wrote:
> Notice that there is also IOBinding.coding_spec.
> Not sure whether this or the one in tokenize is more correct.

Oh! IOBinding reimplements many features that are now available in Python 
itself, such as universal newlines or a function to write Unicode strings 
to a file. But I don't want to rewrite IDLE, I just want to fix the initial 
problem: IDLE is unable to open a non-ASCII file that uses a "#coding:" header.

So IDLE implements coding detection twice: once in IOBinding and 
once in ScriptBinding. I wrote a new version of my patch that removes 
all of that code and reuses tokenize.detect_encoding() instead.
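
For illustration, a minimal sketch of what reusing tokenize.detect_encoding() looks like outside IDLE. The helper name decode_source and the sample bytes are hypothetical and not part of the patch; the sketch assumes the detect_encoding() shipped in current Python 3, which takes a bytes readline callable and normalizes "latin-1" to "iso-8859-1".

```python
import io
import tokenize


def decode_source(data):
    """Decode raw source bytes using the encoding that tokenize detects."""
    # detect_encoding() expects a readline callable that returns bytes.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding, data.decode(encoding)


# Hypothetical sample: a Latin-1 file with a "#coding:" header.
sample = b"#coding: latin-1\nname = '\xe9'\n"
encoding, text = decode_source(sample)
print(encoding, repr(text))
```

Without a BOM or a coding header, the same call reports "utf-8", which is the default that detect_encoding() falls back to.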

I changed IDLE's behaviour: IOBinding._decode() used to fall back to the 
locale encoding when it could not detect the encoding, that is, when there 
was no UTF-8 BOM and the #coding: header was missing. Since I also read the 
comment "Finally, try the locale's encoding. This is deprecated", I prefer to 
remove that fallback. If you want to keep the current behaviour, use:
-------------------------
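import tokenize

def detect_encoding(filename, default=None):
    # Open in binary mode: tokenize.detect_encoding() needs a bytes readline.
    with open(filename, 'rb') as f:
        # Returns the encoding and the raw line(s) it consumed.
        encoding, line = tokenize.detect_encoding(f.readline)
    # Fall back to the caller's default when no line was consumed.
    if (not line) and default:
        return default
    return encoding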
...
            encoding = detect_encoding(filename, locale_encoding)
-------------------------
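
A note on the fallback, as a hedged sketch rather than part of the patch: with the tokenize.detect_encoding() in current Python 3, the second return value is the list of raw lines consumed, so it is empty only for an empty file. The variant below instead checks explicitly whether a BOM or a coding header was present. The name detect_encoding_or_default and the COOKIE_RE pattern (modelled on the PEP 263 header format) are assumptions made for this example.

```python
import io
import re
import tokenize

# Assumed pattern for a PEP 263 style header, applied to raw bytes lines.
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def detect_encoding_or_default(readline, default=None):
    """Return the declared encoding, or `default` if nothing was declared."""
    encoding, lines = tokenize.detect_encoding(readline)
    # "utf-8-sig" means a UTF-8 BOM was found; otherwise look for a header
    # in the one or two lines that detect_encoding() consumed.
    declared = encoding == "utf-8-sig" or any(
        COOKIE_RE.match(line) for line in lines
    )
    if not declared and default:
        return default
    return encoding


# Usage with in-memory bytes; a real caller would pass f.readline from a
# file opened in 'rb' mode and the locale encoding as the default.
no_header = detect_encoding_or_default(io.BytesIO(b"print('hi')\n").readline, "cp1252")
with_header = detect_encoding_or_default(
    io.BytesIO(b"# -*- coding: latin-1 -*-\n").readline, "cp1252"
)
```

One way to obtain the default is locale.getpreferredencoding(False), though whether that matches IDLE's locale_encoding is not something this sketch verifies.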

Please review and test my patch (which becomes longer and longer) :-)
History
Date                 User      Action  Args
2008-10-02 21:49:15  vstinner  set     recipients: + vstinner, loewis
2008-10-02 21:49:14  vstinner  set     messageid: <1222984154.8.0.04402572573.issue4008@psf.upfronthosting.co.za>
2008-10-02 21:49:13  vstinner  link    issue4008 messages
2008-10-02 21:49:13  vstinner  create