
Author serhiy.storchaka
Recipients Arfrever, asvetlov, ezio.melotti, pitrou, serhiy.storchaka, vstinner
Date 2012-06-07.13:56:09
Message-id <1339077374.41.0.500256039615.issue15026@psf.upfronthosting.co.za>
In-reply-to
Content
As a companion to issue14624, here is a patch that speeds up UTF-16 encoding several times over. In addition, it fixes an unsafe integer overflow check.
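
For readers unfamiliar with what the encoder has to produce, here is a small illustration (not part of the patch) of the byte layouts for the two byte orders, including the surrogate pair needed for a non-BMP character. The sample string is arbitrary.

```python
# Illustration only: what str.encode() yields for UTF-16 in both byte orders.
text = "A\u0100\U00010000"

le = text.encode("utf-16le")
be = text.encode("utf-16be")

# BMP characters take one 16-bit code unit each; U+10000 takes two
# (the surrogate pair D800 DC00), so 3 characters become 4 code units.
assert le == b"A\x00" b"\x00\x01" b"\x00\xd8\x00\xdc"
assert be == b"\x00A" b"\x01\x00" b"\xd8\x00\xdc\x00"
assert len(le) == len(be) == 8
```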

Here are the benchmark results. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository.
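
For anyone who wants a rough reproduction without fetching the repository, the following is a minimal sketch using the standard `timeit` module. It is not the tool from the repository above. The helper name `bench`, the repeat counts and the subset of test strings are assumptions made for illustration, and it reports time per call rather than the figures shown in the table.

```python
import timeit


def bench(text, encoding, repeat=3, number=200):
    """Return the best observed time in seconds for one text.encode(encoding).

    Hypothetical helper: the repeat/number values are arbitrary choices.
    """
    timer = timeit.Timer(lambda: text.encode(encoding))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# A few of the inputs from the table: pure ASCII, Latin-1, BMP, non-BMP.
CASES = [
    "A" * 10000,
    "\x80" * 10000,
    "\u0100" + "A" * 9999,
    "\U00010000" * 10000,
]

for encoding in ("utf-16le", "utf-16be"):
    for text in CASES:
        seconds = bench(text, encoding)
        print(f"{encoding}  first char U+{ord(text[0]):04X}  "
              f"{seconds * 1e6:.1f} us/call")
```

Taking the minimum over several repeats reduces noise from other processes. Absolute numbers will differ by machine and Python build.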

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

Py2.7        Py3.2        Py3.3        patched

457 (+575%)  458 (+573%)  1077 (+186%) 3083   encode  utf-16le  'A'*10000
457 (+579%)  493 (+529%)  1084 (+186%) 3102   encode  utf-16le  '\x80'*10000
489 (+534%)  458 (+577%)  1081 (+187%) 3102   encode  utf-16le    '\x80'+'A'*9999
457 (+1261%) 493 (+1161%) 1116 (+457%) 6219   encode  utf-16le  '\u0100'*10000
489 (+1266%) 458 (+1358%) 1126 (+493%) 6678   encode  utf-16le    '\u0100'+'A'*9999
489 (+1263%) 458 (+1355%) 1129 (+490%) 6666   encode  utf-16le    '\u0100'+'\x80'*9999
457 (+1240%) 493 (+1142%) 1118 (+448%) 6125   encode  utf-16le  '\u8000'*10000
489 (+1271%) 458 (+1363%) 1127 (+495%) 6702   encode  utf-16le    '\u8000'+'A'*9999
489 (+1271%) 458 (+1364%) 1129 (+494%) 6705   encode  utf-16le    '\u8000'+'\x80'*9999
489 (+1135%) 458 (+1218%) 1136 (+432%) 6038   encode  utf-16le    '\u8000'+'\u0100'*9999
498 (+128%)  505 (+125%)  630 (+80%)   1137   encode  utf-16le  '\U00010000'*10000
489 (+35%)   458 (+44%)   360 (+83%)   659    encode  utf-16le    '\U00010000'+'A'*9999
489 (+35%)   458 (+44%)   359 (+84%)   660    encode  utf-16le    '\U00010000'+'\x80'*9999
489 (+36%)   458 (+45%)   361 (+84%)   663    encode  utf-16le    '\U00010000'+'\u0100'*9999
489 (+36%)   458 (+45%)   361 (+84%)   663    encode  utf-16le    '\U00010000'+'\u8000'*9999

447 (+507%)  493 (+450%)  1086 (+150%) 2712   encode  utf-16be  'A'*10000
447 (+513%)  493 (+456%)  1080 (+154%) 2739   encode  utf-16be  '\x80'*10000
489 (+458%)  458 (+496%)  1079 (+153%) 2729   encode  utf-16be    '\x80'+'A'*9999
447 (+498%)  494 (+441%)  1118 (+139%) 2672   encode  utf-16be  '\u0100'*10000
489 (+464%)  458 (+502%)  1128 (+144%) 2756   encode  utf-16be    '\u0100'+'A'*9999
489 (+463%)  458 (+502%)  1131 (+144%) 2755   encode  utf-16be    '\u0100'+'\x80'*9999
447 (+500%)  493 (+444%)  1119 (+139%) 2680   encode  utf-16be  '\u8000'*10000
489 (+463%)  458 (+502%)  1126 (+145%) 2755   encode  utf-16be    '\u8000'+'A'*9999
489 (+464%)  458 (+502%)  1129 (+144%) 2757   encode  utf-16be    '\u8000'+'\x80'*9999
489 (+479%)  458 (+518%)  1137 (+149%) 2829   encode  utf-16be    '\u8000'+'\u0100'*9999
499 (+102%)  506 (+99%)   630 (+60%)   1009   encode  utf-16be  '\U00010000'*10000
489 (+6%)    458 (+13%)   360 (+44%)   519    encode  utf-16be    '\U00010000'+'A'*9999
489 (+6%)    458 (+13%)   359 (+44%)   518    encode  utf-16be    '\U00010000'+'\x80'*9999
489 (+6%)    458 (+13%)   361 (+44%)   519    encode  utf-16be    '\U00010000'+'\u0100'*9999
489 (+6%)    458 (+13%)   361 (+44%)   519    encode  utf-16be    '\U00010000'+'\u8000'*9999
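
The percentages in parentheses are consistent with the relative gain of the patched build over the build in that column, with higher figures being better. The unit of the raw figures is not stated here. A short check of that reading against a few rows (the function name `speedup_percent` is made up for this sketch):

```python
def speedup_percent(baseline, patched):
    """Relative gain of `patched` over `baseline`, as a rounded percentage."""
    return round((patched / baseline - 1) * 100)


# utf-16le, 'A'*10000: Py2.7 = 457, Py3.3 = 1077, patched = 3083
assert speedup_percent(457, 3083) == 575
assert speedup_percent(1077, 3083) == 186

# utf-16le, '\U00010000'+'A'*9999: Py2.7 = 489, Py3.3 = 360, patched = 659
assert speedup_percent(489, 659) == 35
assert speedup_percent(360, 659) == 83
```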
History
Date User Action Args
2012-06-07 13:56:14  serhiy.storchaka  set     recipients: + serhiy.storchaka, pitrou, vstinner, ezio.melotti, Arfrever, asvetlov
2012-06-07 13:56:14  serhiy.storchaka  set     messageid: <1339077374.41.0.500256039615.issue15026@psf.upfronthosting.co.za>
2012-06-07 13:56:13  serhiy.storchaka  link    issue15026 messages
2012-06-07 13:56:11  serhiy.storchaka  create