Author lemburg
Recipients dangra, ezio.melotti, lemburg, sjmachin
Date 2010-04-01.15:01:36
Message-id <4BB4B54E.6060408@egenix.com>
In-reply-to <1270133413.07.0.728583781422.issue8271@psf.upfronthosting.co.za>
Content
Ezio Melotti wrote:
> 
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
> 
> Even if they are not valid, they still "eat" all the 4/5/6 bytes, so they should be fixed too. I haven't seen anything about these bytes in chapter 3 so far, but there are at least two possibilities:
> 1) consider all the bytes in range F5-FD as invalid without looking for the other bytes;
> 2) try to read the next 4/5/6 bytes and fail if they are not continuation bytes.
> We can also look at what others do (e.g. browsers and other languages).

Marking those entries as 0 in the length table would make each of
them consume only one byte. Compared to the current state, however,
that would produce more replacement code points in the output (one
for the lead byte and one for every trailing byte that follows it),
so perhaps applying the same logic as for the other sequences is a
better strategy.
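
To make the trade-off concrete, here is a minimal sketch that counts how
many replacement code points each strategy would emit. It is not the
decoder's actual code: `declared_length` and `count_replacements` are
hypothetical names, the length table is an assumption modelled on the
old 4/5/6-byte lead ranges mentioned above, and checks for overlong
forms and surrogates are deliberately left out.

```python
def declared_length(lead, obsolete_as_zero):
    """Hypothetical length table: sequence length announced by a lead byte.

    0 means "invalid lead byte". When obsolete_as_zero is False, the
    F5-FD lead bytes keep their old 4/5/6-byte lengths (assumed layout).
    """
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    if not obsolete_as_zero:
        if 0xF5 <= lead <= 0xF7:
            return 4
        if 0xF8 <= lead <= 0xFB:
            return 5
        if 0xFC <= lead <= 0xFD:
            return 6
    return 0  # continuation bytes, C0, C1, FE, FF (and F5-FD if zeroed)


def count_replacements(data, obsolete_as_zero):
    """Count the U+FFFD code points a simplified decoder would emit."""
    count = 0
    i = 0
    while i < len(data):
        lead = data[i]
        n = declared_length(lead, obsolete_as_zero)
        if n == 0:
            # Invalid lead byte: one replacement, skip exactly one byte.
            count += 1
            i += 1
            continue
        # Consume the continuation bytes (80-BF) that follow, up to n - 1.
        j = i + 1
        while j < i + n and j < len(data) and 0x80 <= data[j] <= 0xBF:
            j += 1
        # Truncated sequences and F5-FD sequences are both invalid, but
        # the whole consumed run is replaced by a single U+FFFD.
        if j - i != n or lead >= 0xF5:
            count += 1
        i = j
    return count


sample = b"a\xf8\x88\x80\x80\x80b"  # an obsolete 5-byte sequence between ASCII
zeroed = count_replacements(sample, obsolete_as_zero=True)
same_logic = count_replacements(sample, obsolete_as_zero=False)
print(zeroed, same_logic)
```

Under these assumptions, zeroing the table entries turns the lead byte
and each of its four trailing bytes into separate replacements, while
the "same logic" variant replaces the whole run once.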
History
Date                 User     Action  Args
2010-04-01 15:01:39  lemburg  set     recipients: + lemburg, sjmachin, ezio.melotti, dangra
2010-04-01 15:01:37  lemburg  link    issue8271 messages
2010-04-01 15:01:36  lemburg  create