> UTF-16 units are 16-bit words, not bytes, so '\ufffd' sounds correct to
> me. You resynchronize on the word boundary: the invalid word is skipped.

I agree. The only odd case is when the number of bytes is not even
(pun intended). In that case, it is anybody's guess which of the bytes is
the extra one. The most natural (IMO) assumption is that the data was
truncated, so the extra byte would be the last one.
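
To make that assumption concrete, here is a rough sketch in Python. The
helper names (split_trailing_byte, decode_utf16le_truncated) are made up
for illustration and are not part of any codec; I am also assuming
little-endian input without a BOM. It treats the last byte of odd-length
input as the leftover and emits a single U+FFFD for it:

```python
def split_trailing_byte(data):
    """Split data into whole 16-bit words plus a leftover byte (if any).

    Assumption: with an odd byte count, the *last* byte is the extra one,
    i.e. the input was truncated.
    """
    whole = len(data) - (len(data) % 2)
    return data[:whole], data[whole:]


def decode_utf16le_truncated(data):
    """Decode UTF-16-LE, replacing a trailing odd byte with U+FFFD."""
    words, extra = split_trailing_byte(data)
    # Invalid words inside the even-length part are replaced word by word.
    text = words.decode("utf-16-le", errors="replace")
    if extra:
        # One replacement character stands in for the dangling byte.
        text += "\ufffd"
    return text


if __name__ == "__main__":
    # 'a', 'b', then a single dangling byte 0x63
    print(repr(decode_utf16le_truncated(b"a\x00b\x00c")))
```

Any other guess about which byte is extra would shift every following
word by one byte, which is why treating the tail as truncated seems the
least surprising choice to me.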
