Message 199190 - Python tracker


Author	serhiy.storchaka
Recipients	ezio.melotti, gvanrossum, kennyluck, lemburg, loewis, pitrou, serhiy.storchaka, tchrist, vstinner
Date	2013-10-08.10:28:06
Message-id	<1381228087.0.0.494864640397.issue12892@psf.upfronthosting.co.za>

Content
I repeat myself. Even with the patch, the UTF-16 codec is faster than the UTF-8 codec (except for ASCII-only data). It is the fastest Unicode codec in Python (perhaps UTF-32 can be made faster, but that is another issue).
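
A rough way to check this kind of claim on a given build is a small timeit comparison. The sketch below is only illustrative: the helper name `time_encode`, the sample text, and the repeat count are assumptions made here, not part of the patch, and the numbers will vary with the Python version and platform.

```python
import timeit


def time_encode(text, codecs=("utf-8", "utf-16", "utf-32"), number=200):
    """Return the seconds taken to encode `text` `number` times, per codec."""
    return {
        # Bind `name` as a default argument so each lambda keeps its own codec.
        name: timeit.timeit(lambda name=name: text.encode(name), number=number)
        for name in codecs
    }


if __name__ == "__main__":
    # Non-ASCII sample (hypothetical); ASCII-only data is the exception noted above.
    sample = "\u0436\u4e2d" * 50_000
    for name, seconds in time_encode(sample).items():
        print(f"{name}: {seconds:.4f} s")
```

Decoding can be timed the same way by encoding once up front and calling `bytes.decode` inside the lambda.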

> The real question is: Can the UTF-16/32 codecs be made fast
> while still detecting lone surrogates ? Not whether UTF-16
> is widely used or not.

Yes, they can. But let's defer this to other issues.
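
For readers unfamiliar with what "detecting lone surrogates" means in practice, here is a minimal sketch. It assumes a Python version in which the UTF-16 and UTF-32 encoders reject surrogate code points (3.4 and later, as far as I know); the helper name `encodes_cleanly` is made up for this example.

```python
def encodes_cleanly(text, codec):
    """Return True if `text` encodes with `codec` under the default strict handler."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# U+D800 is a high surrogate with no low surrogate following it.
lone = "a\ud800b"

if __name__ == "__main__":
    for codec in ("utf-8", "utf-16", "utf-32"):
        print(codec, encodes_cleanly(lone, codec))

    # The 'surrogatepass' error handler opts out of the check explicitly.
    print(lone.encode("utf-16-le", "surrogatepass"))
```

A string made only of valid code points, including astral characters such as "\U0001F600", still encodes with all three codecs.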

History
Date	User	Action	Args
2013-10-08 10:28:07	serhiy.storchaka	set	recipients: + serhiy.storchaka, lemburg, gvanrossum, loewis, pitrou, vstinner, ezio.melotti, tchrist, kennyluck
2013-10-08 10:28:07	serhiy.storchaka	set	messageid: <1381228087.0.0.494864640397.issue12892@psf.upfronthosting.co.za>
2013-10-08 10:28:06	serhiy.storchaka	link	issue12892 messages
2013-10-08 10:28:06	serhiy.storchaka	create