Author vstinner
Recipients vstinner
Date 2010-11-06.10:49:34
Message-id <1289040579.7.0.473143443383.issue10335@psf.upfronthosting.co.za>
In-reply-to
Content
In Python 3, the following pattern is becoming common:

        with open(fullname, 'rb') as fp:
            coding, line = tokenize.detect_encoding(fp.readline)
        with open(fullname, 'r', encoding=coding) as fp:
            ...

The file is opened twice, which is unnecessary: it is possible to reuse the raw buffer to create a text file. I also don't like the detect_encoding() API: passing the readline function is not intuitive.

I propose to create a tokenize.open_python() function with a very simple API: just one argument, the filename. This function calls detect_encoding() and opens the file only once.
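
A minimal sketch of how such a function could work, to illustrate the idea of reusing the raw buffer. This is not the attached patch itself: the body, the error handling and the line_buffering choice are assumptions made for illustration.

```python
import io
import tokenize


def open_python(filename):
    """Open a Python source file in read-only text mode, decoded with the
    encoding declared by its BOM or coding cookie (illustrative sketch)."""
    buffer = open(filename, 'rb')
    try:
        # detect_encoding() consumes at most two lines from the binary file.
        encoding, _ = tokenize.detect_encoding(buffer.readline)
        # Rewind so the text layer starts again from the first byte.
        buffer.seek(0)
        # Wrap the same binary file object instead of opening the file again.
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except BaseException:
        # Do not leak the file descriptor if detection or wrapping fails.
        buffer.close()
        raise
```

With something like this, the two-step pattern above reduces to a single `with open_python(fullname) as fp:` block.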

The attached patch adds the function with a unit test and a documentation update. It also patches the functions that currently use detect_encoding().

open_python() only supports read mode. I suppose that is enough.
History
Date                 User      Action  Args
2010-11-06 10:49:39  vstinner  set     recipients: + vstinner
2010-11-06 10:49:39  vstinner  set     messageid: <1289040579.7.0.473143443383.issue10335@psf.upfronthosting.co.za>
2010-11-06 10:49:37  vstinner  link    issue10335 messages
2010-11-06 10:49:37  vstinner  create