Message120600
In Python 3, the following pattern has become common:
with open(fullname, 'rb') as fp:
    coding, line = tokenize.detect_encoding(fp.readline)
with open(fullname, 'r', encoding=coding) as fp:
    ...
The file is opened twice, which is unnecessary: it's possible to reuse the raw buffer to create a text file. I also don't like the detect_encoding() API: passing the readline function is not intuitive.
I propose to create a tokenize.open_python() function with a very simple API: just one argument, the filename. This function calls detect_encoding() and only opens the file once.
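To make the idea concrete, here is a minimal sketch of what such a function might look like. It is not the attached patch: the body is an assumption built only from the description above (open in binary mode, call detect_encoding() on the raw buffer, rewind, then wrap the same buffer as a text file).

```python
import io
import tokenize


def open_python(filename):
    """Open a Python source file in read-only text mode.

    Hypothetical sketch: the encoding is detected with
    tokenize.detect_encoding() and the same binary buffer is reused,
    so the file is opened only once.
    """
    buffer = open(filename, 'rb')
    try:
        # detect_encoding() reads at most two lines through readline().
        encoding, _ = tokenize.detect_encoding(buffer.readline)
        # Rewind so the text wrapper starts at the beginning of the file.
        buffer.seek(0)
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except BaseException:
        # Do not leak the file object if detection or wrapping fails.
        buffer.close()
        raise
```

With such a helper, the two-step pattern would reduce to `with open_python(fullname) as fp: ...`, and closing the returned text file also closes the underlying binary buffer.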
The attached patch adds the function with a unit test and a documentation update. It also patches the functions currently using detect_encoding().
open_python() only supports read mode. I suppose that is enough.
Date | User | Action | Args
2010-11-06 10:49:39 | vstinner | set | recipients: + vstinner
2010-11-06 10:49:39 | vstinner | set | messageid: <1289040579.7.0.473143443383.issue10335@psf.upfronthosting.co.za>
2010-11-06 10:49:37 | vstinner | link | issue10335 messages
2010-11-06 10:49:37 | vstinner | create |