
Author loewis
Recipients Arfrever, Henri.Salo, Huzaifa.Sidhpurwala, asvetlov, benjamin.peterson, ezio.melotti, loewis, pitrou, serhiy.storchaka, vstinner
Date 2012-04-26.18:46:13
Message-id <4F9997F4.2090409@v.loewis.de>
In-reply-to <1335441165.3421.1.camel@localhost.localdomain>
Content
> UTF-16 units are 16-bit words, not bytes, so '\ufffd' sounds correct to
> me. You resynchronize on the word boundary: the invalid word is skipped.

I agree. The only odd case is when the number of bytes is not even
(pun intended). In that case, it is anyone's guess which of the bytes is
the extra one. The most natural assumption (IMO) is that the data was
truncated, so the extra byte would be the last one.
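
To make the two rules concrete, here is a minimal sketch of a UTF-16-LE
decoder that resynchronizes on word boundaries and treats a trailing odd
byte as truncated data. The function name decode_utf16le_replace is
hypothetical and the code is an illustration of the behaviour discussed
here, not the actual codec implementation.

```python
import struct

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_utf16le_replace(data: bytes) -> str:
    """Illustrative UTF-16-LE decoder (hypothetical helper, not CPython's).

    Invalid 16-bit words are replaced one word at a time, and an odd
    trailing byte is assumed to be the remains of a truncated word.
    """
    n_words = len(data) // 2
    # Read only the complete 16-bit little-endian words.
    words = struct.unpack("<%dH" % n_words, data[:n_words * 2])
    out = []
    i = 0
    while i < n_words:
        w = words[i]
        if 0xD800 <= w <= 0xDBFF:
            # High surrogate: valid only if a low surrogate follows.
            if i + 1 < n_words and 0xDC00 <= words[i + 1] <= 0xDFFF:
                cp = 0x10000 + ((w - 0xD800) << 10) + (words[i + 1] - 0xDC00)
                out.append(chr(cp))
                i += 2
                continue
            # Lone high surrogate: skip this word only, then resynchronize.
            out.append(REPLACEMENT)
        elif 0xDC00 <= w <= 0xDFFF:
            # Lone low surrogate: skip this word only.
            out.append(REPLACEMENT)
        else:
            out.append(chr(w))
        i += 1
    if len(data) % 2:
        # Odd byte count: assume the last byte is the extra one.
        out.append(REPLACEMENT)
    return "".join(out)
```

With this sketch, b'A\x00Z' decodes to 'A' followed by one replacement
character, and b'\x00\xd8A\x00' decodes to a replacement character
followed by 'A', since only the invalid word is skipped.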
History
Date User Action Args
2012-04-26 18:46:14 loewis set recipients: + loewis, pitrou, vstinner, benjamin.peterson, ezio.melotti, Arfrever, asvetlov, Henri.Salo, Huzaifa.Sidhpurwala, serhiy.storchaka
2012-04-26 18:46:13 loewis link issue14579 messages
2012-04-26 18:46:13 loewis create