Message 279712 - Python tracker


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

Author	xiang.zhang
Recipients	serhiy.storchaka, vstinner, xiang.zhang
Date	2016-10-30.07:45:50
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1477813551.36.0.0620162768049.issue28561@psf.upfronthosting.co.za>
In-reply-to

Content
In utf8_encoder, when a codec error handler returns a string containing non-ASCII characters, the encoder raises UnicodeEncodeError, but the reported start and end positions are not accurate. This looks like an oversight left over from an earlier change. Previously, utf8_encoder recognized only one surrogate character at a time. Since 2b5357b38366, it tries to recognize as many consecutive surrogates as possible at once. The patch also includes some cleanup.
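
A minimal sketch for observing the behavior from Python, assuming a CPython build where the UTF-8 encoder rejects a non-ASCII replacement string. The handler name "demo_non_ascii" and the helper probe() are hypothetical and exist only for this illustration. The script does not assert specific positions, since those depend on whether the interpreter includes the patch.

```python
import codecs


def non_ascii_replacement(exc):
    # Hypothetical handler: it answers an encode error with a
    # non-ASCII replacement string, which the UTF-8 encoder refuses.
    if isinstance(exc, UnicodeEncodeError):
        return ("\xe9", exc.end)
    raise exc


codecs.register_error("demo_non_ascii", non_ascii_replacement)


def probe(text):
    # Return the (start, end, reason) reported by the encoder,
    # or None if encoding unexpectedly succeeds.
    try:
        text.encode("utf-8", "demo_non_ascii")
    except UnicodeEncodeError as err:
        return err.start, err.end, err.reason
    return None


# Two consecutive lone surrogates between ASCII characters.
result = probe("a\udc80\udc81b")
print(result)
```

Comparing the printed start and end against the indices of the surrogate run in the input shows whether the reported range covers the whole run that was passed to the handler.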

History
Date	User	Action	Args
2016-10-30 07:45:51	xiang.zhang	set	recipients: + xiang.zhang, vstinner, serhiy.storchaka
2016-10-30 07:45:51	xiang.zhang	set	messageid: <1477813551.36.0.0620162768049.issue28561@psf.upfronthosting.co.za>
2016-10-30 07:45:51	xiang.zhang	link	issue28561 messages
2016-10-30 07:45:50	xiang.zhang	create