Message188437
After investigating the problem more deeply, I see that a new parameter is not needed. RFC 4627 makes no exception for the range 0xD800-0xDFFF, so the decoder must accept lone surrogates, both escaped and unescaped. Non-BMP characters may be represented as an escaped surrogate pair, so an escaped surrogate pair may be decoded as a single non-BMP character, while an unescaped surrogate pair should not be.
Here is a patch that makes the JSON decoder accept encoded lone surrogates. It also fixes a bug where the Python implementation decodes "\\ud834\\u0079x" as "\U0001d179".
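To illustrate the intended behaviour, here is a minimal sketch. It assumes a CPython 3 release that already includes this change; the variable names are only illustrative and are not taken from the patch. `py_scanstring` is the pure-Python string scanner in `json.decoder`, called here directly so that the Python implementation is exercised even when the C accelerator is available.

```python
import json
from json.decoder import py_scanstring

# Escaped lone high surrogate: accepted and kept as a single code unit.
lone = json.loads('"\\ud834"')

# Escaped surrogate pair: combined into one non-BMP character.
pair = json.loads('"\\ud834\\udd79"')

# Unescaped surrogate pair (two surrogate code points in the str itself):
# passed through unchanged, not combined.
raw_pair = json.loads('"\ud834\udd79"')

# High surrogate followed by an ordinary \u escape: must NOT be combined.
# py_scanstring takes the document and the index just past the opening quote.
mixed_doc = '"\\ud834\\u0079x"'
mixed, end = py_scanstring(mixed_doc, 1)

print(ascii(lone), ascii(pair), ascii(raw_pair), ascii(mixed))
```

The last case is the bug mentioned above: `\u0079` is not a low surrogate, so the result should be the lone surrogate followed by `yx`, not `\U0001d179`.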
Date | User | Action | Args
2013-05-05 11:45:06 | serhiy.storchaka | set | recipients: + serhiy.storchaka, rhettinger, bob.ippolito, pitrou, ezio.melotti
2013-05-05 11:45:06 | serhiy.storchaka | set | messageid: <1367754306.11.0.846039834679.issue17906@psf.upfronthosting.co.za>
2013-05-05 11:45:06 | serhiy.storchaka | link | issue17906 messages
2013-05-05 11:45:05 | serhiy.storchaka | create |