Author serhiy.storchaka
Recipients bob.ippolito, ezio.melotti, pitrou, rhettinger, serhiy.storchaka
Date 2013-05-05.11:45:05
Message-id <1367754306.11.0.846039834679.issue17906@psf.upfronthosting.co.za>
In-reply-to
Content
After investigating the problem more deeply, I see that a new parameter is not needed. RFC 4627 does not make exceptions for the range 0xD800-0xDFFF, so the decoder must accept lone surrogates, both escaped and unescaped. Non-BMP characters may be represented as an escaped surrogate pair, so an escaped surrogate pair may be decoded as a non-BMP character, while an unescaped surrogate pair should not be.

Here is a patch with which the JSON decoder accepts encoded lone surrogates. It also fixes a bug where the Python implementation decodes "\\ud834\\u0079x" as "\U0001d179".
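
To illustrate the intended rule, here is a minimal sketch and not the patch itself. The helper name combine_escaped_units is hypothetical, and it assumes the \uXXXX escapes have already been parsed into a list of 16-bit integers. A high surrogate is merged with the following unit only when that unit really is a low surrogate (0xDC00-0xDFFF); otherwise it is kept as a lone surrogate.

```python
def combine_escaped_units(units):
    """Turn 16-bit values taken from \\uXXXX escapes into a str.

    Hypothetical helper for illustration only; not part of the json module.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        # Merge only a high surrogate followed by a genuine low surrogate.
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            out.append(chr(0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00)))
            i += 2
        else:
            # Lone surrogates and ordinary BMP code units pass through as-is.
            out.append(chr(u))
            i += 1
    return "".join(out)


paired = combine_escaped_units([0xD834, 0xDD79])      # valid pair
not_paired = combine_escaped_units([0xD834, 0x0079])  # 0x0079 is 'y', not a low surrogate
lone = combine_escaped_units([0xD834])                # lone high surrogate
```

Skipping the range check on the second unit is the kind of mistake that would produce the wrong "\U0001d179" result for "\\ud834\\u0079x" described above.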