
Author martin.panter
Recipients Julian, akira, cvrebert, ezio.melotti, jleedev, martin.panter, ncoghlan, pitrou, rhettinger, serhiy.storchaka, vstinner
Date 2014-10-27.01:06:14
If you adjusted the detect_encoding() logic according to Pete Cordell’s table at the bottom of <>, it might work for standalone strings.
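
For context, here is a rough sketch of the null-byte heuristic described in RFC 4627, section 3, which assumes the first two characters of the text are ASCII. It is not Pete Cordell's table and not the detect_encoding() under discussion; the name guess_json_encoding and the UTF-8 fallback for short input are assumptions made for illustration.

```python
def guess_json_encoding(data: bytes) -> str:
    # Hypothetical helper: inspect the pattern of null bytes in the
    # first four octets, as described in RFC 4627, section 3.
    if len(data) < 4:
        # Assumption: too short for the heuristic, fall back to UTF-8.
        return "utf-8"
    b0, b1, b2, b3 = data[:4]
    if b0 == 0 and b1 == 0 and b2 == 0:
        return "utf-32-be"  # 00 00 00 xx
    if b0 == 0 and b2 == 0:
        return "utf-16-be"  # 00 xx 00 xx
    if b1 == 0 and b2 == 0 and b3 == 0:
        return "utf-32-le"  # xx 00 00 00
    if b1 == 0 and b3 == 0:
        return "utf-16-le"  # xx 00 xx 00
    return "utf-8"          # xx xx xx xx
```

A standalone string can break the ASCII assumption. For example, '"\u0100"' encoded as UTF-16-LE starts with 22 00 00 01, which matches none of the patterns, so this sketch falls through to UTF-8.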

However, since the RFC encourages UTF-8 for best interoperability, I wonder if any of this autodetection is necessary. It might be simpler to just assume UTF-8, or to use the “utf-8-sig” codec. Or are there real cases where detecting UTF-16 or UTF-32 would be useful?
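
A minimal sketch of the simpler alternative, assuming the input is always UTF-8 with an optional byte order mark. The helper name loads_utf8_bytes is hypothetical and not part of the json module.

```python
import json


def loads_utf8_bytes(data: bytes):
    # Hypothetical helper: skip autodetection and assume UTF-8.
    # The "utf-8-sig" codec decodes UTF-8 and drops a leading
    # byte order mark (EF BB BF) if one is present.
    return json.loads(data.decode("utf-8-sig"))
```

With this approach, UTF-16 or UTF-32 input is not detected. It either raises UnicodeDecodeError or fails to parse as JSON, so the caller would have to decode such data explicitly.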
Date User Action Args
2014-10-27 01:06:14 martin.panter set recipients: + martin.panter, rhettinger, ncoghlan, pitrou, vstinner, ezio.melotti, cvrebert, akira, Julian, serhiy.storchaka, jleedev
2014-10-27 01:06:14 martin.panter set messageid: <>
2014-10-27 01:06:14 martin.panter link issue17909 messages
2014-10-27 01:06:14 martin.panter create