
Author serhiy.storchaka
Recipients pitrou, serhiy.storchaka, vstinner
Date 2012-04-23.21:04:06
Message-id <1335215047.86.0.0172027590975.issue14654@psf.upfronthosting.co.za>
In-reply-to
Content
The UTF-8 decoder is already well optimized. I propose a patch that speeds it up further for some of the frequent cases (by 10-30%). In particular, decoding of 2-byte non-Latin-1 characters gets about 30% faster.

This is not the final result of the optimization. It may be possible to further optimize the decoding of ASCII and mostly-ASCII text (up to the speed of memcpy) and of text with occasional errors, and to reduce code duplication. But I'm not sure this will succeed.
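
For anyone who wants to check the cases mentioned above on their own build, here is a minimal timing sketch. It is not the benchmark used for the numbers in this message; the sample strings, the `SAMPLES` table, and the `bench_decode` helper are all assumptions made up for illustration.

```python
import timeit

# Hypothetical sample inputs (assumptions, not taken from the patch).
SAMPLES = {
    "ascii (1-byte)": "A" * 10000,
    "latin1 (2-byte)": "\xe9" * 10000,        # U+00E9 encodes to 2 bytes
    "non-latin1 (2-byte)": "\u0416" * 10000,  # U+0416 encodes to 2 bytes
    "bmp (3-byte)": "\u20ac" * 10000,         # U+20AC encodes to 3 bytes
}


def bench_decode(text, number=200, repeat=5):
    """Return the best observed time, in seconds, for one UTF-8 decode of text."""
    data = text.encode("utf-8")
    # Sanity check: decoding must round-trip before we bother timing it.
    assert data.decode("utf-8") == text
    timer = timeit.Timer(lambda: data.decode("utf-8"))
    # Take the minimum over several runs to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        seconds = bench_decode(text)
        print("%-22s %8.2f us per decode" % (name, seconds * 1e6))
```

Running the script before and after applying a patch, on the same machine, should give a rough idea of the relative change for each case; absolute numbers will vary between builds and platforms.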

Related issues:
[issue4868] Faster utf-8 decoding
[issue13417] faster utf-8 decoding
[issue14419] Faster ascii decoding
[issue14624] Faster utf-16 decoder
[issue14625] Faster utf-32 decoder
History
Date User Action Args
2012-04-23 21:04:08  serhiy.storchaka  set     recipients: + serhiy.storchaka, pitrou, vstinner
2012-04-23 21:04:07  serhiy.storchaka  set     messageid: <1335215047.86.0.0172027590975.issue14654@psf.upfronthosting.co.za>
2012-04-23 21:04:06  serhiy.storchaka  link    issue14654 messages
2012-04-23 21:04:06  serhiy.storchaka  create