Message 174170 - Python tracker


Author	vstinner
Recipients	Arfrever, asvetlov, ezio.melotti, georg.brandl, pitrou, serhiy.storchaka, vstinner
Date	2012-10-30.00:59:28
Message-id	<1351558768.81.0.0171848183601.issue14625@psf.upfronthosting.co.za>

Content

> I suggest apply patch A to 3.3 as it fixes performance
> regression (2x) and is very simple.

ASCII and UTF-8 are the two most common codecs in the world, so it's justified to have heavily optimized encoders and decoders.

I don't know of any application that uses UTF-32-LE or UTF-32-BE, so I don't want to spend Python memory/code size on a heavily optimized decoder. Patch A looks sufficient.
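For anyone who wants to check the relative cost on their own build, a rough timing sketch follows. It is not taken from patch A; the helper names `decode_roundtrip` and `time_decode` and the sample string are made up for illustration, and the numbers will vary by Python version and platform.

```python
import timeit


def decode_roundtrip(text, codec):
    """Encode `text` with `codec`, then decode it back (sanity check)."""
    return text.encode(codec).decode(codec)


def time_decode(text, codec, number=200):
    """Return the seconds spent decoding `text` `number` times with `codec`."""
    data = text.encode(codec)  # encode once, outside the timed region
    return timeit.timeit(lambda: data.decode(codec), number=number)


if __name__ == "__main__":
    # Hypothetical sample mixing ASCII, BMP and non-BMP characters.
    sample = "abc\u20ac\U0001f600" * 10000
    for codec in ("utf-8", "utf-32-le", "utf-32-be"):
        assert decode_roundtrip(sample, codec) == sample
        print("%-10s %.4f s" % (codec, time_decode(sample, codec)))
```

Comparing the output before and after applying a patch gives a quick view of whether the 2x regression is gone, without needing a dedicated benchmark suite.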

--

32-bit units are commonly used with wchar_t, but this format already has a fast decoder, PyUnicode_FromWideChar(), which uses memcpy() or _PyUnicode_CONVERT_BYTES().
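To make the wchar_t point concrete, here is a small sketch that looks at the raw memory of a wchar_t buffer through ctypes. It only illustrates the layout (fixed-width, native-endian units) and does not call PyUnicode_FromWideChar() itself; the helpers `wchar_codec` and `wchar_bytes` are hypothetical names. Note that wchar_t is 32 bits on most Unix platforms but 16 bits on Windows, so the sketch picks the codec from the actual size.

```python
import ctypes
import sys


def wchar_codec():
    """Name the codec matching the platform's wchar_t width and byte order."""
    bits = ctypes.sizeof(ctypes.c_wchar) * 8  # 32 on most Unix, 16 on Windows
    suffix = "le" if sys.byteorder == "little" else "be"
    return "utf-%d-%s" % (bits, suffix)


def wchar_bytes(text):
    """Return the raw bytes of `text` stored as a wchar_t array, without the NUL."""
    buf = ctypes.create_unicode_buffer(text)  # wchar_t[len(text) + 1]
    raw = bytes(buf)
    return raw[:-ctypes.sizeof(ctypes.c_wchar)]  # drop the trailing NUL unit


if __name__ == "__main__":
    text = "h\u00e9llo \u20ac"  # BMP-only, so it fits either wchar_t width
    print(wchar_codec(), wchar_bytes(text))
    assert wchar_bytes(text).decode(wchar_codec()) == text
```

On a platform with a 32-bit wchar_t, the buffer contents are byte-for-byte what the native-endian UTF-32 codec would produce, which is why a plain copy or width conversion is enough there.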
