Author lemburg
Recipients dangra, ezio.melotti, lemburg, sjmachin
Date 2010-04-01.13:19:03
Message-id <4BB49D46.7010209@egenix.com>
In-reply-to <1270123022.09.0.351284182872.issue8271@psf.upfronthosting.co.za>
Content
John Machin wrote:
> 
> John Machin <sjmachin@users.sourceforge.net> added the comment:
> 
> Unicode has been frozen at 0x10FFFF. That's it. There is no such thing as a valid 5-byte or 6-byte UTF-8 string.

The UTF-8 codec was written at a time when UTF-8 still allowed
5- and 6-byte sequences:

http://www.rfc-editor.org/rfc/rfc2279.txt

Use of those encodings has always raised an error, though. For error
handling purposes the codec still has to recognize those sequences.
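
As a rough illustration of the point above, here is a minimal sketch (not the codec's actual C implementation). The helper names `rfc2279_sequence_length` and `decodes_as_utf8` are made up for this example, and the byte strings are hand-built RFC 2279 encodings of 0x200000 and 0x4000000. It shows the lead-byte layout that RFC 2279 defined, and that a current Python 3 decoder rejects the 5- and 6-byte forms while accepting the 4-byte encoding of U+10FFFF.

```python
def rfc2279_sequence_length(lead_byte: int) -> int:
    """Sequence length implied by a lead byte under RFC 2279 (hypothetical helper).

    Returns 0 for bytes that cannot start a sequence.
    """
    if lead_byte < 0x80:
        return 1                      # 0xxxxxxx
    if 0xC0 <= lead_byte <= 0xDF:
        return 2                      # 110xxxxx
    if 0xE0 <= lead_byte <= 0xEF:
        return 3                      # 1110xxxx
    if 0xF0 <= lead_byte <= 0xF7:
        return 4                      # 11110xxx
    if 0xF8 <= lead_byte <= 0xFB:
        return 5                      # 111110xx (dropped by later revisions)
    if 0xFC <= lead_byte <= 0xFD:
        return 6                      # 1111110x (dropped by later revisions)
    return 0                          # continuation byte, or 0xFE / 0xFF


def decodes_as_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hand-built RFC 2279 encodings of the smallest 5- and 6-byte values.
five_byte = b"\xf8\x88\x80\x80\x80"      # 0x200000
six_byte = b"\xfc\x84\x80\x80\x80\x80"   # 0x4000000
highest_valid = "\U0010FFFF".encode("utf-8")

if __name__ == "__main__":
    for sample in (five_byte, six_byte, highest_valid):
        print(sample.hex(), rfc2279_sequence_length(sample[0]), decodes_as_utf8(sample))
```

How many replacement characters the decoder emits for such input with `errors="replace"` depends on how much of the invalid sequence it treats as a unit, which is the kind of error-handling detail this issue is about.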
History
Date User Action Args
2010-04-01 13:19:05 lemburg set recipients: + lemburg, sjmachin, ezio.melotti, dangra
2010-04-01 13:19:04 lemburg link issue8271 messages
2010-04-01 13:19:03 lemburg create