
Author pitrou
Recipients amaury.forgeotdarc, lemburg, loewis, pitrou
Date 2009-01-08.15:22:31
Message-id <1231428165.11860.28.camel@localhost>
In-reply-to <1231420282.49.0.531733594931.issue4868@psf.upfronthosting.co.za>
Content
> Attached patch
> (utf8decode4.patch) changes this and may enter the fast loop on the
> first character.

Thanks!

> Does this idea apply to the encode function as well?

Probably, although with less efficiency (a long can hold 1, 2 or 4
unicode characters depending on the build).
The unrolling part also applies to simple codecs such as latin1.
Unrolling PyUnicode_DecodeLatin1 a bit (4 copies per iteration) makes it
twice as fast on non-tiny strings. I'll experiment with utf16.
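
To illustrate the unrolling idea, here is a minimal sketch and not the actual patch. The names unichar_t, latin1_decode_simple and latin1_decode_unrolled are hypothetical, and unichar_t only stands in for Py_UNICODE on a narrow (UCS2) build. Latin-1 bytes map one-to-one onto the first 256 code points, so decoding is just a widening copy, and the unrolled loop does four such copies per iteration before a tail loop handles the remainder.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for Py_UNICODE on a narrow (UCS2) build. */
typedef uint16_t unichar_t;

/* Baseline: widen one Latin-1 byte per iteration. */
static void latin1_decode_simple(const unsigned char *src, size_t len,
                                 unichar_t *dst)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Unrolled: four widening copies per iteration, then a tail loop
   for the remaining 0 to 3 bytes. */
static void latin1_decode_unrolled(const unsigned char *src, size_t len,
                                   unichar_t *dst)
{
    size_t i = 0;

    for (; i + 4 <= len; i += 4) {
        dst[i]     = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }
    for (; i < len; i++)
        dst[i] = src[i];
}
```

Whether this actually helps depends on the compiler and optimization level, since an optimizing compiler may already unroll the simple loop on its own, so any gain should be measured rather than assumed.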
History
Date User Action Args
2009-01-08 15:22:32 pitrou set recipients: + pitrou, lemburg, loewis, amaury.forgeotdarc
2009-01-08 15:22:31 pitrou link issue4868 messages
2009-01-08 15:22:31 pitrou create