classification
Title: Faster UTF-32 encoding
Type: performance
Stage: resolved
Components: Interpreter Core, Unicode
Versions: Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Arfrever, BreamoreBoy, asvetlov, ezio.melotti, gregory.p.smith, haypo, kmike, larry, neologix, pitrou, python-dev, serhiy.storchaka
Priority: normal
Keywords: needs review, patch

Created on 2012-06-07 13:57 by serhiy.storchaka, last changed 2015-05-18 19:22 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
encode_utf32_2.patch serhiy.storchaka, 2012-10-20 19:05 review
encode_utf32_3.patch serhiy.storchaka, 2013-12-11 22:17 review
Messages (21)
msg162474 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-07 13:57
As a companion to issue14625, here is a patch that speeds up UTF-32 encoding several times over. In addition, it fixes an unsafe integer overflow check.

Here are the benchmark results. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository.
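
For readers without the benchmark tools at hand, a rough stand-in can be put together with the standard `timeit` module. This is only a sketch: the helper name `encode_rate`, the sample selection, and the characters-per-second unit are assumptions made here, not the tooling or units behind the tables in this message, so the numbers it prints are not directly comparable.

```python
import timeit

# Inputs modeled on a few rows of the tables in this message.
SAMPLES = {
    "'A'*10000": 'A' * 10000,
    "'\\u0100'*10000": '\u0100' * 10000,
    "'\\U00010000'*10000": '\U00010000' * 10000,
}


def encode_rate(text, codec, repeat=5, number=200):
    """Return the best observed rate in encoded characters per second."""
    timings = timeit.repeat(lambda: text.encode(codec), repeat=repeat, number=number)
    return len(text) * number / min(timings)


if __name__ == "__main__":
    # 'utf-32-le' and 'utf-32-be' are the codecs the tables label utf-32le and utf-32be.
    for codec in ("utf-32-le", "utf-32-be"):
        for label, text in SAMPLES.items():
            print(f"{encode_rate(text, codec):14.0f} chars/s  encode  {codec}  {label}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes on the machine.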

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

Py2.7        Py3.2        Py3.3        patched

541 (+1032%) 541 (+1032%) 844 (+626%)  6125   encode  utf-32le  'A'*10000
543 (+1056%) 541 (+1060%) 844 (+643%)  6275   encode  utf-32le  '\x80'*10000
544 (+1010%) 542 (+1014%) 843 (+616%)  6037   encode  utf-32le    '\x80'+'A'*9999
541 (+799%)  542 (+797%)  764 (+537%)  4864   encode  utf-32le  '\u0100'*10000
544 (+781%)  542 (+784%)  767 (+525%)  4793   encode  utf-32le    '\u0100'+'A'*9999
544 (+789%)  542 (+792%)  766 (+531%)  4834   encode  utf-32le    '\u0100'+'\x80'*9999
542 (+799%)  541 (+801%)  764 (+538%)  4874   encode  utf-32le  '\u8000'*10000
544 (+779%)  542 (+782%)  767 (+523%)  4780   encode  utf-32le    '\u8000'+'A'*9999
544 (+793%)  542 (+796%)  766 (+534%)  4859   encode  utf-32le    '\u8000'+'\x80'*9999
544 (+819%)  542 (+823%)  766 (+553%)  5001   encode  utf-32le    '\u8000'+'\u0100'*9999
430 (+867%)  427 (+874%)  860 (+383%)  4157   encode  utf-32le  '\U00010000'*10000
543 (+655%)  543 (+655%)  861 (+376%)  4101   encode  utf-32le    '\U00010000'+'A'*9999
543 (+658%)  543 (+658%)  861 (+378%)  4116   encode  utf-32le    '\U00010000'+'\x80'*9999
543 (+670%)  543 (+670%)  859 (+387%)  4180   encode  utf-32le    '\U00010000'+'\u0100'*9999
543 (+666%)  543 (+666%)  860 (+383%)  4158   encode  utf-32le    '\U00010000'+'\u8000'*9999

541 (+880%)  543 (+876%)  844 (+528%)  5300   encode  utf-32be  'A'*10000
541 (+872%)  542 (+870%)  844 (+523%)  5256   encode  utf-32be  '\x80'*10000
544 (+843%)  542 (+846%)  843 (+509%)  5130   encode  utf-32be    '\x80'+'A'*9999
541 (+363%)  542 (+362%)  764 (+228%)  2505   encode  utf-32be  '\u0100'*10000
544 (+366%)  542 (+368%)  766 (+231%)  2534   encode  utf-32be    '\u0100'+'A'*9999
544 (+363%)  542 (+365%)  766 (+229%)  2519   encode  utf-32be    '\u0100'+'\x80'*9999
542 (+363%)  541 (+364%)  764 (+228%)  2509   encode  utf-32be  '\u8000'*10000
544 (+366%)  542 (+368%)  766 (+231%)  2534   encode  utf-32be    '\u8000'+'A'*9999
544 (+363%)  542 (+364%)  766 (+229%)  2517   encode  utf-32be    '\u8000'+'\x80'*9999
544 (+372%)  542 (+374%)  766 (+235%)  2568   encode  utf-32be    '\u8000'+'\u0100'*9999
430 (+428%)  427 (+432%)  860 (+164%)  2270   encode  utf-32be  '\U00010000'*10000
543 (+317%)  541 (+318%)  861 (+163%)  2262   encode  utf-32be    '\U00010000'+'A'*9999
543 (+320%)  541 (+321%)  861 (+165%)  2279   encode  utf-32be    '\U00010000'+'\x80'*9999
543 (+322%)  541 (+323%)  859 (+167%)  2290   encode  utf-32be    '\U00010000'+'\u0100'*9999
543 (+322%)  541 (+324%)  860 (+167%)  2292   encode  utf-32be    '\U00010000'+'\u8000'*9999
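
The percentages in parentheses are not explained in the message, but they appear to be the gain of the patched build over the version in that column, rounded to the nearest integer. The sketch below checks that reading against a few cells of the tables above; the function name `speedup_percent` is an assumption introduced here for illustration.

```python
def speedup_percent(baseline, patched):
    """Percentage by which `patched` exceeds `baseline`, rounded to the nearest integer."""
    return round((patched / baseline - 1) * 100)


# utf-32le, 'A'*10000: Py2.7 = 541, Py3.3 = 844, patched = 6125
print(speedup_percent(541, 6125))  # 1032
print(speedup_percent(844, 6125))  # 626

# utf-32be, '\U00010000'*10000: Py3.3 = 860, patched = 2270
print(speedup_percent(860, 2270))  # 164
```

Under this reading, "+626%" for Py3.3 means the patched encoder is about 7.3 times as fast as 3.3 on that input.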
msg162823 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-14 20:30
On 32-bit Linux, Intel Atom N570 @ 1.66GHz:

Py2.7        Py3.2        Py3.3        patched

214 (+718%)  215 (+714%)  363 (+382%)  1750   encode  utf-32le  'A'*10000
214 (+704%)  214 (+704%)  362 (+375%)  1720   encode  utf-32le  '\x80'*10000
214 (+712%)  215 (+708%)  363 (+379%)  1738   encode  utf-32le    '\x80'+'A'*9999
214 (+698%)  214 (+698%)  342 (+399%)  1707   encode  utf-32le  '\u0100'*10000
214 (+688%)  215 (+684%)  343 (+392%)  1686   encode  utf-32le    '\u0100'+'A'*9999
214 (+699%)  215 (+695%)  342 (+400%)  1710   encode  utf-32le    '\u0100'+'\x80'*9999
214 (+694%)  214 (+694%)  342 (+397%)  1699   encode  utf-32le  '\u8000'*10000
214 (+688%)  215 (+685%)  343 (+392%)  1687   encode  utf-32le    '\u8000'+'A'*9999
214 (+700%)  214 (+700%)  342 (+401%)  1713   encode  utf-32le    '\u8000'+'\x80'*9999
214 (+682%)  215 (+679%)  342 (+389%)  1674   encode  utf-32le    '\u8000'+'\u0100'*9999
121 (+2237%) 121 (+2237%) 333 (+749%)  2828   encode  utf-32le  '\U00010000'*10000
214 (+1108%) 214 (+1108%) 333 (+676%)  2585   encode  utf-32le    '\U00010000'+'A'*9999
214 (+1112%) 214 (+1112%) 333 (+679%)  2594   encode  utf-32le    '\U00010000'+'\x80'*9999
214 (+1208%) 214 (+1208%) 333 (+741%)  2799   encode  utf-32le    '\U00010000'+'\u0100'*9999
214 (+1214%) 215 (+1208%) 333 (+745%)  2813   encode  utf-32le    '\U00010000'+'\u8000'*9999

214 (+556%)  214 (+556%)  363 (+287%)  1404   encode  utf-32be  'A'*10000
214 (+558%)  214 (+558%)  363 (+288%)  1408   encode  utf-32be  '\x80'*10000
214 (+550%)  214 (+550%)  363 (+283%)  1390   encode  utf-32be    '\x80'+'A'*9999
214 (+224%)  214 (+224%)  342 (+103%)  693    encode  utf-32be  '\u0100'*10000
214 (+229%)  214 (+229%)  343 (+105%)  703    encode  utf-32be    '\u0100'+'A'*9999
214 (+221%)  214 (+221%)  342 (+101%)  688    encode  utf-32be    '\u0100'+'\x80'*9999
214 (+224%)  214 (+224%)  342 (+103%)  694    encode  utf-32be  '\u8000'*10000
215 (+227%)  214 (+229%)  343 (+105%)  704    encode  utf-32be    '\u8000'+'A'*9999
214 (+221%)  214 (+221%)  342 (+101%)  686    encode  utf-32be    '\u8000'+'\x80'*9999
214 (+222%)  214 (+222%)  341 (+102%)  690    encode  utf-32be    '\u8000'+'\u0100'*9999
121 (+387%)  121 (+387%)  333 (+77%)   589    encode  utf-32be  '\U00010000'*10000
214 (+174%)  215 (+173%)  333 (+76%)   587    encode  utf-32be    '\U00010000'+'A'*9999
214 (+183%)  214 (+183%)  333 (+82%)   606    encode  utf-32be    '\U00010000'+'\x80'*9999
214 (+184%)  214 (+184%)  333 (+82%)   607    encode  utf-32be    '\U00010000'+'\u0100'*9999
214 (+183%)  214 (+183%)  333 (+82%)   605    encode  utf-32be    '\U00010000'+'\u8000'*9999
msg173404 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-20 19:05
Patch updated to 3.4.

Is anyone interested in a 7x speedup of the UTF-32 encoder?
msg205912 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2013-12-11 18:28
From http://kmike.ru/python-data-structures/ under heading DATrie "Python wrapper uses utf_32_le codec internally; this codec is currently slow and it is the bottleneck for datrie. There is a ticket with a patch in the CPython bug tracker (http://bugs.python.org/issue15027) that should make this codec fast, so there is a hope datrie will become faster with future Pythons."
msg205934 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 22:17
Here is an updated patch, synchronized with trunk. The UTF-32 encoder now checks for surrogates, so the speedup is smaller (only up to 5 times). But this compensates for the regression in 3.4.

On 32-bit Linux, Intel Atom N570 @ 1.66GHz:

Py3.3        Py3.4        patched

531 (+245%)  489 (+274%)  1831   encode  utf-32le  'A'*10000
383 (+158%)  223 (+344%)  990    encode  utf-32le  '\u0100'*10000
325 (+262%)  229 (+414%)  1177   encode  utf-32le  '\U00010000'*10000

544 (+166%)  494 (+193%)  1448   encode  utf-32be  'A'*10000
384 (+67%)   223 (+188%)  642    encode  utf-32be  '\u0100'*10000
323 (+108%)  229 (+193%)  671    encode  utf-32be  '\U00010000'*10000
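
The surrogate check mentioned above is visible from Python code. The following is a small illustration of the behavior in Python 3.4 and later, not a reproduction of the patch; the helper name `encodes_strictly` is an assumption introduced here.

```python
def encodes_strictly(text, codec='utf-32-le'):
    """Return True if `text` encodes with the default 'strict' error handler."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


lone_surrogate = '\ud800'

print(encodes_strictly('A'))             # True
print(encodes_strictly('\U00010000'))    # True: a non-BMP character is not a surrogate
print(encodes_strictly(lone_surrogate))  # False on Python 3.4+

# The 'surrogatepass' error handler opts back in to encoding lone surrogates.
print(lone_surrogate.encode('utf-32-le', 'surrogatepass'))  # b'\x00\xd8\x00\x00'
```

Because every code point has to be tested against the surrogate range, the encoder can no longer be a plain widening copy, which is consistent with the smaller speedup reported here.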
msg205940 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-12-11 23:05
There is one comment to address in the review. Once that is addressed, I believe this is ready to go in for 3.4.
msg207292 - (view) Author: Roundup Robot (python-dev) Date: 2014-01-04 17:26
New changeset b72c5573c5e7 by Serhiy Storchaka in branch 'default':
Issue #15027: Rewrite the UTF-32 encoder.  It is now 1.6x to 3.5x faster.
http://hg.python.org/cpython/rev/b72c5573c5e7
msg207294 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-04 17:32
Thank you Gregory for your review.
msg207302 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-01-04 18:41
Isn't this a new feature?
msg207305 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-04 19:59
Sorry if I missed something. Should I revert changeset b72c5573c5e7?

This patch doesn't introduce new functions and doesn't change behavior. Without this patch the UTF-32 encoder is up to 2.5x slower in 3.4 than in 3.3 (due to issue12892).
msg207306 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-01-04 20:10
Would you describe it as a "bug fix" or a "security fix"?  If it's neither of those things, then you need special permission to add it during beta.  And given that this patch has the possibility of causing bugs, I'd prefer to not accept it for 3.4.

Please revert it for now.  If you think it should go in to 3.4, you may ask on python-dev that it be considered and take a poll.  (Note that the poll is not binding on me; this is still solely my decision.  However if there was an uproar of support for your patch, that would certainly cause me to reconsider.)
msg207311 - (view) Author: Roundup Robot (python-dev) Date: 2014-01-04 20:51
New changeset 1e345924f7ea by Serhiy Storchaka in branch 'default':
Reverted changeset b72c5573c5e7 (issue #15027).
http://hg.python.org/cpython/rev/1e345924f7ea
msg210147 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-02-03 16:02
BreamoreBoy: why did you remove Arfrever from this issue?
msg210148 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-03 16:28
> BreamoreBoy: why did you remove Arfrever from this issue?

Nosy list members are sorted in alphabetical order: since Arfrever comes just before BreamoreBoy, I assume his fingers tripped ;-)
msg242871 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-05-10 23:21
As this appears to be a performance improvement only, can it go into 3.5, or do we wait for 3.x?
msg242954 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-12 10:22
Can I commit the patch now, Larry?
msg242981 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-05-12 15:43
We're still in alpha, so it's fine for 3.5 right now.  The cutoff for new features for 3.5 will be May 23.
msg243005 - (view) Author: Roundup Robot (python-dev) Date: 2015-05-12 20:13
New changeset 80cf7723c4cf by Serhiy Storchaka in branch 'default':
Issue #15027: The UTF-32 encoder is now 3x to 7x faster.
https://hg.python.org/cpython/rev/80cf7723c4cf
msg243008 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-12 20:26
And that's not all...
msg243523 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * Date: 2015-05-18 19:14
In Objects/stringlib/codecs.h, in two comments, U+DC800 should be changed to U+D800 (from the definition of Py_UNICODE_IS_SURROGATE) or U+DC80 (from the result of b"\x80".decode(errors="surrogateescape")).
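
Both candidate values can be checked from Python. In the sketch below, `is_surrogate` is a Python-level stand-in written for this note (an assumption, not the C macro itself) that applies the U+D800..U+DFFF range test.

```python
def is_surrogate(ch):
    """Range test for surrogate code points, U+D800 through U+DFFF."""
    return 0xD800 <= ord(ch) <= 0xDFFF


# The surrogateescape handler maps the undecodable byte 0x80 to U+DC80.
escaped = b"\x80".decode(errors="surrogateescape")
print(hex(ord(escaped)))      # 0xdc80
print(is_surrogate(escaped))  # True

# U+DC800 is a valid code point (it is below U+10FFFF) but lies in plane 13,
# far outside the surrogate range, so it cannot be what the comments meant.
print(is_surrogate(chr(0xDC800)))  # False
```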
msg243524 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-18 19:22
Thank you, Arfrever. That was an old typo that had been copy-pasted. Fixed in 3d5bf6174c4b and bc6ed8360312.
History
Date User Action Args
2015-05-18 19:22:44 serhiy.storchaka set messages: + msg243524
2015-05-18 19:14:25 Arfrever set messages: + msg243523
2015-05-12 20:26:46 serhiy.storchaka set status: open -> closed
resolution: fixed
messages: + msg243008
stage: patch review -> resolved
2015-05-12 20:13:08 python-dev set messages: + msg243005
2015-05-12 15:43:59 larry set messages: + msg242981
2015-05-12 10:22:08 serhiy.storchaka set messages: + msg242954
2015-05-10 23:21:30 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg242871
2014-02-03 17:00:17 BreamoreBoy set nosy: - BreamoreBoy
2014-02-03 16:28:02 neologix set nosy: + Arfrever, neologix
messages: + msg210148
2014-02-03 16:02:10 larry set messages: + msg210147
2014-02-03 15:38:19 BreamoreBoy set nosy: - Arfrever
2014-01-04 20:55:35 serhiy.storchaka set status: closed -> open
stage: resolved -> patch review
resolution: fixed -> (no value)
versions: + Python 3.5, - Python 3.4
2014-01-04 20:51:11 python-dev set messages: + msg207311
2014-01-04 20:10:40 larry set messages: + msg207306
2014-01-04 19:59:30 serhiy.storchaka set messages: + msg207305
2014-01-04 18:41:17 larry set nosy: + larry
messages: + msg207302
2014-01-04 17:32:35 serhiy.storchaka set status: open -> closed
resolution: fixed
messages: + msg207294
stage: patch review -> resolved
2014-01-04 17:26:00 python-dev set nosy: + python-dev
messages: + msg207292
2013-12-11 23:05:08 gregory.p.smith set priority: low -> normal
nosy: + gregory.p.smith
messages: + msg205940
2013-12-11 22:17:02 serhiy.storchaka set files: + encode_utf32_3.patch
messages: + msg205934
2013-12-11 18:28:28 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg205912
2013-01-07 17:51:10 serhiy.storchaka set priority: normal -> low
assignee: serhiy.storchaka
2012-10-24 09:02:58 serhiy.storchaka set stage: patch review
2012-10-20 19:05:07 serhiy.storchaka set keywords: + needs review
files: + encode_utf32_2.patch
messages: + msg173404
versions: + Python 3.4, - Python 3.3
2012-10-20 19:03:19 serhiy.storchaka set files: - encode-utf32.patch
2012-07-17 20:44:37 kmike set nosy: + kmike
2012-06-14 20:30:11 serhiy.storchaka set messages: + msg162823
2012-06-07 13:57:30 serhiy.storchaka create