
Author doerwalter
Recipients doerwalter, ezio.melotti, serhiy.storchaka, vstinner
Date 2018-10-08.16:48:24
Message-id <1539017304.94.0.545547206417.issue34935@psf.upfronthosting.co.za>
Content
OK, I see. Table 3-7 on page 93 of http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf states that the only valid 3-byte UTF-8 sequences starting with the byte 0xED have a second byte in the range 0x80 to 0x9F. 0xA0 is just beyond that range (allowing it would result in an encoded surrogate). Python reports every sequence that is invalid according to that table with the same error message. I think this issue can be closed.
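
For reference, the behaviour described above can be checked with a short sketch. The helper name `decode_reason` and the sample byte strings are made up for illustration and are not part of the original report. The samples are picked from the restricted second-byte rows of Table 3-7 (lead bytes 0xE0, 0xED and 0xF4), and the expectation that they share one error reason is an assumption based on current CPython behaviour.

```python
def decode_reason(data: bytes):
    """Return the UnicodeDecodeError reason for data, or None if it decodes.

    Hypothetical helper, used only for this illustration.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# Lead byte 0xED with second byte 0x9F: the top of the valid range (U+D7FF).
last_valid = b"\xed\x9f\xbf"
# Lead byte 0xED with second byte 0xA0: would encode the surrogate U+D800.
surrogate = b"\xed\xa0\x80"
# Lead byte 0xE0 requires a second byte in 0xA0..0xBF (0x80 would be overlong).
overlong = b"\xe0\x80\x80"
# Lead byte 0xF4 requires a second byte in 0x80..0x8F (0x90 exceeds U+10FFFF).
too_large = b"\xf4\x90\x80\x80"

# Show the code point of the valid sequence without printing the character.
print(hex(ord(last_valid.decode("utf-8"))))

# Print the reason reported for each invalid sequence.
for sample in (surrogate, overlong, too_large):
    print(sample, decode_reason(sample))
```

If the encoded surrogate really needs to be read, the standard `surrogatepass` error handler accepts it: `b"\xed\xa0\x80".decode("utf-8", "surrogatepass")` yields the lone surrogate U+D800.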
History
Date User Action Args
2018-10-08 16:48:24 doerwalter set recipients: + doerwalter, vstinner, ezio.melotti, serhiy.storchaka
2018-10-08 16:48:24 doerwalter set messageid: <1539017304.94.0.545547206417.issue34935@psf.upfronthosting.co.za>
2018-10-08 16:48:24 doerwalter link issue34935 messages
2018-10-08 16:48:24 doerwalter create