Author ysj.ray
Recipients meatballhat, ysj.ray
Date 2010-05-20.09:08:50
Message-id <1274346534.56.0.254674868282.issue8774@psf.upfronthosting.co.za>
In-reply-to
Content
The problem is in the tabnanny module: it always reads a Python source file as text in the platform-dependent encoding. That is, it opens the file with the builtin function open() and passes no encoding argument, and it does not parse the encoding cookie at the beginning of the source file. So if a Python source file contains characters that cannot be decoded with that platform-dependent encoding, tabnanny fails when checking it. This affects not only heapq.py but also several other standard modules.

That platform-dependent encoding is determined in the following order:
1. os.device_encoding(fd)
2. locale.getpreferredencoding()
3. ascii
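
To make that order concrete, here is a minimal sketch of the fallback chain. The helper name guess_default_text_encoding is hypothetical and is not part of tabnanny or the io module; it only approximates what open() does when no encoding is given, and the exact behaviour may differ between Python versions.

```python
import locale
import os


def guess_default_text_encoding(fd):
    """Hypothetical helper approximating the fallback order listed above."""
    # 1. Encoding of the device behind fd; None unless fd is a terminal.
    encoding = os.device_encoding(fd)
    if encoding is None:
        try:
            # 2. The locale's preferred encoding.
            encoding = locale.getpreferredencoding(False)
        except Exception:
            encoding = None
    # 3. Last resort.
    return encoding or "ascii"
```

For a regular source file on disk, step 1 normally yields nothing, so the locale setting decides which codec is used.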

I wonder why tabnanny works this way. Is this the intended behaviour? On my platform, if I use tabnanny to check a source file that contains Chinese characters and is encoded in 'gbk', a UnicodeDecodeError is raised.
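
The failure can be illustrated without tabnanny itself. The sketch below uses 'ascii' as a stand-in for a platform default that cannot represent the file's contents (an assumption; the real default depends on the machine), and decodes_with is a hypothetical helper written for this example.

```python
def decodes_with(data, encoding):
    """Return True if the bytes can be decoded with the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# A small source file with an encoding cookie, stored as GBK bytes.
gbk_source = "# -*- coding: gbk -*-\nname = '中文'\n".encode("gbk")

print(decodes_with(gbk_source, "gbk"))    # the declared encoding
print(decodes_with(gbk_source, "ascii"))  # stand-in for a mismatched default
```

Decoding succeeds with the encoding named in the cookie and fails with the mismatched one, which is the same kind of error tabnanny hits when it ignores the cookie.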

If this is not the intended behaviour, I guess that to fix the problem we have to change the way tabnanny reads the source file, so that it works the way the Python compiler does. First, open the file in "rb" mode, then detect the encoding with the tokenize.detect_encoding() function, then use the detected encoding to open the source file again in text mode.
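
The steps above can be sketched roughly as follows. This is only an illustration of the proposal, not a patch for tabnanny; read_source is a hypothetical helper name.

```python
import tokenize


def read_source(path):
    """Hypothetical helper: read a Python source file using its declared encoding."""
    # Step 1: open in binary mode so no decoding happens yet.
    with open(path, "rb") as binary_file:
        # Step 2: detect_encoding() looks for a BOM or an encoding cookie
        # in the first two lines and falls back to UTF-8.
        encoding, _ = tokenize.detect_encoding(binary_file.readline)
    # Step 3: reopen in text mode with the detected encoding.
    with open(path, "r", encoding=encoding) as text_file:
        return text_file.read()
```

Later Python versions (3.2 and up) also provide tokenize.open(), which bundles these steps into a single call.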
History
Date                 User     Action  Args
2010-05-20 09:08:55  ysj.ray  set     recipients: + ysj.ray, meatballhat
2010-05-20 09:08:54  ysj.ray  set     messageid: <1274346534.56.0.254674868282.issue8774@psf.upfronthosting.co.za>
2010-05-20 09:08:52  ysj.ray  link    issue8774 messages
2010-05-20 09:08:50  ysj.ray  create