Author r.david.murray
Recipients Sworddragon, r.david.murray
Date 2013-06-22.18:43:25
Message-id <1371926605.49.0.401187705011.issue18282@psf.upfronthosting.co.za>
In-reply-to
Content
In Python we have a saying that we follow most of the time: if you don't know, refuse the temptation to guess.  So currently this is all working as designed: you have to know the encoding of the file you are trying to read as Unicode.
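
To make "you have to know the encoding" concrete, here is a minimal sketch. The helper name `read_text` and the temporary demo file are assumptions made up for illustration; only the built-in `open()` with its `encoding` argument is real API. It shows that a correctly stated encoding decodes cleanly, while a wrong one fails with an error rather than being silently guessed around.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a file as text, decoding with the caller-supplied encoding.

    `read_text` is a hypothetical helper for this example, not a stdlib function.
    """
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Write a small Latin-1 encoded file so the example is self-contained.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write("café".encode("latin-1"))  # bytes on disk: b'caf\xe9'

# The caller states the encoding explicitly, so decoding succeeds.
text = read_text(demo_path, "latin-1")

# Stating the wrong encoding raises instead of guessing.
try:
    read_text(demo_path, "utf-8")
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True

os.remove(demo_path)
```

The point of the sketch is the failure mode: the lone byte `0xE9` is not valid UTF-8, so Python reports the mismatch and leaves the decision to the caller.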

Adding a 'guess' function that could be called explicitly is a possibility, but if we were to go that route we'd probably want something general that guesses the encoding of strings, such as (I think) ICU has.  That larger topic is better suited to python-ideas, probably followed, if the response is positive, by a PEP.
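
For illustration only, an explicitly called guess could look roughly like the sketch below. `guess_decode` and its candidate list are hypothetical, not an existing or proposed stdlib function, and trying encodings in order is a deliberately naive stand-in, not the statistical detection that ICU performs.

```python
def guess_decode(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: a successful decode does not prove the guess is
    right, only that the bytes happen to be valid in that encoding.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# Valid UTF-8 is accepted by the first candidate.
utf8_result = guess_decode("café".encode("utf-8"))

# b'caf\xe9' is invalid UTF-8, so the guess falls through to cp1252,
# even though the bytes may well have been written as Latin-1.
fallback_result = guess_decode(b"caf\xe9")
```

The second call shows why this is a guess: the same bytes decode identically under cp1252 and Latin-1, so the reported encoding depends on candidate order, not on anything known about the file.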

So I'm closing this issue as rejected, but feel free to bring it up on python-ideas.  (Search for existing threads about it first, please.)
History
Date User Action Args
2013-06-22 18:43:25 r.david.murray set recipients: + r.david.murray, Sworddragon
2013-06-22 18:43:25 r.david.murray set messageid: <1371926605.49.0.401187705011.issue18282@psf.upfronthosting.co.za>
2013-06-22 18:43:25 r.david.murray link issue18282 messages
2013-06-22 18:43:25 r.david.murray create