Author ezio.melotti
Recipients Anoop.Thomas.Mathew, Gallaecio, docs@python, ezio.melotti, ncoghlan, serhiy.storchaka, vajrasky
Date 2013-10-19.03:47:00
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1382154421.18.0.141729313063.issue18958@psf.upfronthosting.co.za>
In-reply-to
Content
I'm not sure this should be documented in json.load/loads, and I'm not sure people will look there once they get this exception.
The error is raised because the wrong codec is used (either by open() before passing the file object to json.load or by json.loads), so it's a user error rather than a problem with the json module. The error is particularly misleading because the decoding succeeds even though it produces a wrong result (the BOM is kept as a leading character in the decoded text), and the problem becomes apparent only once that text reaches json.
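To illustrate the point, here is a minimal sketch of how the failure arises. The sample payload and the variable names are made up for the example, and the exact wording of the exception depends on the Python version.

```python
import codecs
import json

# Hypothetical sample: a JSON document saved with a UTF-8 BOM in front.
raw = codecs.BOM_UTF8 + b'{"key": "value"}'

# Decoding with plain "utf-8" succeeds, but the BOM survives as U+FEFF.
text = raw.decode("utf-8")
bom_kept = text.startswith("\ufeff")

# The problem only shows up once the text reaches json.
failed = False
try:
    json.loads(text)
except ValueError as exc:  # JSONDecodeError is a ValueError subclass
    failed = True
    print("json.loads failed:", exc)

# Decoding with "utf-8-sig" strips the BOM, so parsing works.
data = json.loads(raw.decode("utf-8-sig"))
print(data)
```

The same applies to files: passing encoding="utf-8-sig" to open() before handing the file object to json.load avoids the error.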
ISTM that the documentation is already clear enough that json doesn't auto-detect encodings and uses UTF-8 by default, and that different encodings should be specified explicitly.
I think that a better and backward-compatible solution would be to detect the UTF-8 BOM and provide a better error message hinting at utf-8-sig.
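Something along these lines is what I have in mind, as a rough sketch only. The wrapper name loads_with_bom_hint and the message text are hypothetical, and a real fix would belong inside the json module rather than in a wrapper.

```python
import json


def loads_with_bom_hint(s):
    """Hypothetical wrapper: fail early with a hint if the text starts with a BOM."""
    # A UTF-8 BOM decoded with plain "utf-8" shows up as U+FEFF.
    if isinstance(s, str) and s.startswith("\ufeff"):
        raise ValueError(
            "input starts with a UTF-8 BOM; decode it with 'utf-8-sig' "
            "instead of 'utf-8'"
        )
    # No BOM: behave exactly like json.loads.
    return json.loads(s)


# Input without a BOM is parsed as usual.
ok = loads_with_bom_hint('{"key": "value"}')

# Input with a leading BOM gets the more helpful message.
try:
    loads_with_bom_hint('\ufeff{"key": "value"}')
    hint = None
except ValueError as exc:
    hint = str(exc)
```

Since input without a BOM still goes straight to json.loads, existing callers would not see any change in behavior.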
History
Date User Action Args
2013-10-19 03:47:01 ezio.melotti set recipients: + ezio.melotti, ncoghlan, docs@python, serhiy.storchaka, Anoop.Thomas.Mathew, vajrasky, Gallaecio
2013-10-19 03:47:01 ezio.melotti set messageid: <1382154421.18.0.141729313063.issue18958@psf.upfronthosting.co.za>
2013-10-19 03:47:01 ezio.melotti link issue18958 messages
2013-10-19 03:47:00 ezio.melotti create