Message148603
Python 3.3 has a strange behaviour:
>>> '\uDBFF\uDFFF'.encode('utf-16-le').decode('utf-16-le')
'\U0010ffff'
>>> '\U0010ffff'.encode('utf-16-le').decode('utf-16-le')
'\U0010ffff'
I would expect text.encode(encoding).decode(encoding) == text, or else an encode or decode error.
So I agree that the encoder should reject lone surrogates.
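The expectation above can be written as a small check. This is only a sketch: the helper name `check_round_trip` and its three result labels are made up for illustration and are not part of any codec API. It uses the `surrogatepass` error handler to force the surrogates through the encoder, which reproduces the mismatch shown above even on interpreters whose strict UTF-16 encoder refuses surrogates.

```python
def check_round_trip(text, encoding, errors="strict"):
    """Classify an encode/decode round trip (hypothetical helper).

    Returns 'ok' if the text survives unchanged, 'error' if the codec
    raises, and 'mismatch' if it silently comes back as something else.
    """
    try:
        decoded = text.encode(encoding, errors).decode(encoding, errors)
    except UnicodeError:
        # Covers both UnicodeEncodeError and UnicodeDecodeError.
        return "error"
    return "ok" if decoded == text else "mismatch"


# A real non-BMP character: expected to round-trip cleanly.
astral = check_round_trip("\U0010ffff", "utf-16-le")

# Two surrogate code points with the strict handler: the acceptable
# outcomes are 'ok' or 'error', never a silent change.
strict_surrogates = check_round_trip("\udbff\udfff", "utf-16-le")

# Same string, but surrogates are explicitly allowed through the encoder.
# The decoder then reads the bytes as one valid surrogate pair.
forced_surrogates = check_round_trip("\udbff\udfff", "utf-16-le", "surrogatepass")

print(astral, strict_surrogates, forced_surrogates)
```

A 'mismatch' result is the case complained about here: two code points go in and one different code point comes out, with no error raised.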
Date | User | Action | Args
2011-11-29 20:42:30 | vstinner | set | recipients: + vstinner, lemburg, gvanrossum, loewis, ezio.melotti, tchrist
2011-11-29 20:42:30 | vstinner | set | messageid: <1322599350.13.0.163750536411.issue12892@psf.upfronthosting.co.za>
2011-11-29 20:42:29 | vstinner | link | issue12892 messages
2011-11-29 20:42:29 | vstinner | create |