Message36798
user_id=89016
For encoding, the replacement is always (end-start)*u"?":
>>> u"ää".encode("ascii", "replace")
'??'
But for decoding, it is neither one nor the other:
>>> "\\Ux\\U".decode("unicode-escape", "replace")
u'\ufffd\ufffd'
i.e. a sequence of 5 illegal characters was replaced by two
replacement characters. This might mean that decoders can't
collect all the illegal characters and call the callback
once. They might have to call the callback for every single
illegal byte sequence to get the old behaviour.
(It seems that this patch would be much, much simpler if
we only changed the encoders.)
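
To see how often a codec actually calls the callback, and for which ranges, a recording handler can be registered. This is only a sketch, written for Python 3 (so the inputs above become a str and a bytes object). The handler name "recording-replace" and the function recording_replace are made up for illustration and are not part of the patch. The ranges reported for decoding depend on the interpreter version, so none are asserted here.

```python
import codecs

# (exception class name, start, end) for every callback invocation
calls = []

def recording_replace(exc):
    """Hypothetical handler: log the reported range, then replace it."""
    calls.append((type(exc).__name__, exc.start, exc.end))
    if isinstance(exc, UnicodeEncodeError):
        # Same rule as the built-in "replace" when encoding:
        # one "?" per unencodable character in the reported range.
        return ("?" * (exc.end - exc.start), exc.end)
    if isinstance(exc, UnicodeDecodeError):
        # One U+FFFD per reported range, however many bytes it spans.
        return ("\ufffd", exc.end)
    raise exc

codecs.register_error("recording-replace", recording_replace)

# "\xe4\xe4" is the string u"ää" from the session above.
encoded = "\xe4\xe4".encode("ascii", "recording-replace")
decoded = b"\\Ux\\U".decode("unicode-escape", "recording-replace")

print(encoded)
print(ascii(decoded))
for name, start, end in calls:
    print(name, start, end)
```

Because the encoding branch returns (end-start) question marks, the encoded result is the same whether the encoder reports the two characters in one call or in two. The decoding branch has no such property: the number of U+FFFD characters equals the number of callback invocations, which is exactly the difficulty described above.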
Date | User | Action | Args
2007-08-23 15:06:07 | admin | link | issue432401 messages
2007-08-23 15:06:07 | admin | create |