
Author serhiy.storchaka
Recipients Arfrever, asvetlov, ezio.melotti, pitrou, serhiy.storchaka, vstinner
Date 2012-06-07.13:57:29
Message-id <1339077451.46.0.686170991807.issue15027@psf.upfronthosting.co.za>
In-reply-to
Content
As a companion to issue14625, here is a patch that speeds up UTF-32 encoding several times over. In addition, it fixes an unsafe integer overflow check.
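For readers unfamiliar with the encoding, here is a minimal pure-Python sketch of what a UTF-32-LE encoder produces: every code point becomes one 4-byte little-endian unit, with no BOM. The function name encode_utf32le is hypothetical and exists only for illustration. It is not the patched C code, and it does not handle lone surrogates.

```python
def encode_utf32le(text):
    """Illustrative UTF-32-LE encoder: one 4-byte little-endian unit per code point."""
    out = bytearray()
    for ch in text:
        # ord() gives the code point; to_bytes packs it into 4 little-endian bytes.
        out += ord(ch).to_bytes(4, "little")
    return bytes(out)


# Sample covering the code point ranges used in the benchmark strings.
sample = "A\x80\u0100\u8000\U00010000"
matches_builtin = encode_utf32le(sample) == sample.encode("utf-32-le")
```

The built-in codec does the same work in C, which is the code path this patch optimizes.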

Benchmark results follow. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository.
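A comparable measurement can be sketched with the standard timeit module. This is not the benchmark tool from the repository above. The helper name encode_throughput and its parameters are assumptions made for illustration, and the unit it reports (millions of characters per second) is not necessarily the unit used in the table.

```python
import timeit


def encode_throughput(text, encoding="utf-32-le", repeat=5, number=100):
    """Return the best observed encoding rate in millions of characters per second."""
    timings = timeit.repeat(lambda: text.encode(encoding), repeat=repeat, number=number)
    best = min(timings)  # the fastest run is the least disturbed by system noise
    if best <= 0:
        raise RuntimeError("timer resolution too coarse; increase 'number'")
    return len(text) * number / best / 1e6


if __name__ == "__main__":
    # Two of the inputs from the table: pure ASCII and a non-BMP string.
    for label, text in [("'A'*10000", "A" * 10000),
                        ("'\\U00010000'*10000", "\U00010000" * 10000)]:
        print(f"{encode_throughput(text):10.1f}  encode  utf-32le  {label}")
```

Taking the minimum over several repeats is a common way to reduce noise. Absolute numbers depend on the machine and the Python build.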

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

Py2.7        Py3.2        Py3.3        patched

541 (+1032%) 541 (+1032%) 844 (+626%)  6125   encode  utf-32le  'A'*10000
543 (+1056%) 541 (+1060%) 844 (+643%)  6275   encode  utf-32le  '\x80'*10000
544 (+1010%) 542 (+1014%) 843 (+616%)  6037   encode  utf-32le    '\x80'+'A'*9999
541 (+799%)  542 (+797%)  764 (+537%)  4864   encode  utf-32le  '\u0100'*10000
544 (+781%)  542 (+784%)  767 (+525%)  4793   encode  utf-32le    '\u0100'+'A'*9999
544 (+789%)  542 (+792%)  766 (+531%)  4834   encode  utf-32le    '\u0100'+'\x80'*9999
542 (+799%)  541 (+801%)  764 (+538%)  4874   encode  utf-32le  '\u8000'*10000
544 (+779%)  542 (+782%)  767 (+523%)  4780   encode  utf-32le    '\u8000'+'A'*9999
544 (+793%)  542 (+796%)  766 (+534%)  4859   encode  utf-32le    '\u8000'+'\x80'*9999
544 (+819%)  542 (+823%)  766 (+553%)  5001   encode  utf-32le    '\u8000'+'\u0100'*9999
430 (+867%)  427 (+874%)  860 (+383%)  4157   encode  utf-32le  '\U00010000'*10000
543 (+655%)  543 (+655%)  861 (+376%)  4101   encode  utf-32le    '\U00010000'+'A'*9999
543 (+658%)  543 (+658%)  861 (+378%)  4116   encode  utf-32le    '\U00010000'+'\x80'*9999
543 (+670%)  543 (+670%)  859 (+387%)  4180   encode  utf-32le    '\U00010000'+'\u0100'*9999
543 (+666%)  543 (+666%)  860 (+383%)  4158   encode  utf-32le    '\U00010000'+'\u8000'*9999

541 (+880%)  543 (+876%)  844 (+528%)  5300   encode  utf-32be  'A'*10000
541 (+872%)  542 (+870%)  844 (+523%)  5256   encode  utf-32be  '\x80'*10000
544 (+843%)  542 (+846%)  843 (+509%)  5130   encode  utf-32be    '\x80'+'A'*9999
541 (+363%)  542 (+362%)  764 (+228%)  2505   encode  utf-32be  '\u0100'*10000
544 (+366%)  542 (+368%)  766 (+231%)  2534   encode  utf-32be    '\u0100'+'A'*9999
544 (+363%)  542 (+365%)  766 (+229%)  2519   encode  utf-32be    '\u0100'+'\x80'*9999
542 (+363%)  541 (+364%)  764 (+228%)  2509   encode  utf-32be  '\u8000'*10000
544 (+366%)  542 (+368%)  766 (+231%)  2534   encode  utf-32be    '\u8000'+'A'*9999
544 (+363%)  542 (+364%)  766 (+229%)  2517   encode  utf-32be    '\u8000'+'\x80'*9999
544 (+372%)  542 (+374%)  766 (+235%)  2568   encode  utf-32be    '\u8000'+'\u0100'*9999
430 (+428%)  427 (+432%)  860 (+164%)  2270   encode  utf-32be  '\U00010000'*10000
543 (+317%)  541 (+318%)  861 (+163%)  2262   encode  utf-32be    '\U00010000'+'A'*9999
543 (+320%)  541 (+321%)  861 (+165%)  2279   encode  utf-32be    '\U00010000'+'\x80'*9999
543 (+322%)  541 (+323%)  859 (+167%)  2290   encode  utf-32be    '\U00010000'+'\u0100'*9999
543 (+322%)  541 (+324%)  860 (+167%)  2292   encode  utf-32be    '\U00010000'+'\u8000'*9999
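
The parenthesized percentages appear to be the gain of the patched build over each baseline column, that is, (patched / baseline - 1) * 100. The sketch below checks that reading against a few rows of the table. The function name speedup_percent is hypothetical.

```python
def speedup_percent(baseline, patched):
    """Gain of 'patched' over 'baseline', as a whole-number percentage."""
    return round((patched / baseline - 1) * 100)


# (baseline, patched, percentage printed in the table)
rows = [
    (541, 6125, 1032),  # Py2.7, utf-32le, 'A'*10000
    (844, 6125, 626),   # Py3.3, utf-32le, 'A'*10000
    (430, 2270, 428),   # Py2.7, utf-32be, '\U00010000'*10000
    (860, 2270, 164),   # Py3.3, utf-32be, '\U00010000'*10000
]
all_match = all(speedup_percent(base, new) == shown for base, new, shown in rows)
```

The sampled rows all reproduce the printed figures, so "+1032%" corresponds to the patched build being roughly 11.3 times as fast as the baseline.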
History
Date User Action Args
2012-06-07 13:57:31  serhiy.storchaka  set  recipients: + serhiy.storchaka, pitrou, vstinner, ezio.melotti, Arfrever, asvetlov
2012-06-07 13:57:31  serhiy.storchaka  set  messageid: <1339077451.46.0.686170991807.issue15027@psf.upfronthosting.co.za>
2012-06-07 13:57:30  serhiy.storchaka  link  issue15027 messages
2012-06-07 13:57:29  serhiy.storchaka  create