msg160103 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-06 18:00 |
I propose a complex patch that significantly speeds up UTF-8 decoding. The decoder is now faster even than the decoder in 3.2 (except in a few unrealistic pathological cases).
The decoder code is also reduced and simplified (formerly the decoding code was repeated in at least three places).
As a side effect, ASCII decoding is now faster on some platforms (issue14419).
Related issues:
[issue4868] Faster utf-8 decoding
[issue13417] faster utf-8 decoding
[issue14419] Faster ascii decoding
[issue14624] Faster utf-16 decoder
[issue14625] Faster utf-32 decoder
[issue14654] Faster utf-8 decoding
Here are the benchmark results (numbers are speeds in MB/s; the percentage in parentheses is the gain of the patched build over that column).
On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:
3.2 3.3(vanilla) patched
utf-8 'A'*10000 1199 (+69%) 1721 (+18%) 2032
utf-8 'A'*9999+'\x80' 1189 (+25%) 996 (+49%) 1488
utf-8 'A'*9999+'\u0100' 1192 (-25%) 887 (+1%) 894
utf-8 'A'*9999+'\u8000' 1178 (-24%) 888 (+0%) 890
utf-8 'A'*9999+'\U00010000' 1177 (-29%) 872 (-4%) 837
utf-8 '\x80'*10000 220 (+74%) 172 (+122%) 382
utf-8 '\x80'+'A'*9999 1192 (+5%) 376 (+232%) 1250
utf-8 '\x80'*9999+'\u0100' 220 (+54%) 160 (+112%) 339
utf-8 '\x80'*9999+'\u8000' 220 (+54%) 160 (+112%) 339
utf-8 '\x80'*9999+'\U00010000' 221 (+49%) 176 (+88%) 330
utf-8 '\u0100'*10000 220 (+74%) 163 (+134%) 382
utf-8 '\u0100'+'A'*9999 1177 (+4%) 382 (+219%) 1220
utf-8 '\u0100'+'\x80'*9999 220 (+74%) 163 (+134%) 382
utf-8 '\u0100'*9999+'\u8000' 220 (+74%) 163 (+134%) 382
utf-8 '\u0100'*9999+'\U00010000' 220 (+50%) 180 (+83%) 330
utf-8 '\u8000'*10000 261 (+66%) 191 (+126%) 432
utf-8 '\u8000'+'A'*9999 1197 (+1%) 384 (+216%) 1212
utf-8 '\u8000'+'\x80'*9999 216 (+77%) 163 (+134%) 382
utf-8 '\u8000'+'\u0100'*9999 215 (+77%) 164 (+132%) 381
utf-8 '\u8000'*9999+'\U00010000' 261 (+46%) 201 (+89%) 380
utf-8 '\U00010000'*10000 248 (+44%) 198 (+80%) 357
utf-8 '\U00010000'+'A'*9999 1192 (-5%) 383 (+196%) 1135
utf-8 '\U00010000'+'\x80'*9999 220 (+73%) 180 (+111%) 380
utf-8 '\U00010000'+'\u0100'*9999 220 (+73%) 180 (+111%) 380
utf-8 '\U00010000'+'\u8000'*9999 261 (+54%) 201 (+100%) 403
ascii 'A'*10000 233 (+971%) 1876 (+33%) 2496
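The percentages in parentheses appear to be the speed of the patched build relative to the build in that column. A minimal sketch of that arithmetic, checked against the first row above (the function name `gain_percent` is made up for illustration):

```python
def gain_percent(baseline_mb_s, patched_mb_s):
    """Return how much faster (+) or slower (-) the patched build is, in percent."""
    return round((patched_mb_s / baseline_mb_s - 1) * 100)

# First row of the Athlon table: 3.2 = 1199, 3.3 (vanilla) = 1721, patched = 2032.
print(gain_percent(1199, 2032))  # gain relative to 3.2
print(gain_percent(1721, 2032))  # gain relative to 3.3 (vanilla)
```

A negative result means the patched build is slower than the column it is compared with, as in the 'A'*9999+'\u0100' row.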
On 32-bit Linux, Intel Atom N570 @ 1.66GHz:
3.2 3.3(vanilla) patched
utf-8 'A'*10000 345 (+81%) 596 (+5%) 623
utf-8 'A'*9999+'\x80' 335 (+41%) 303 (+56%) 474
utf-8 'A'*9999+'\u0100' 336 (-23%) 123 (+110%) 258
utf-8 'A'*9999+'\u8000' 337 (-24%) 123 (+108%) 256
utf-8 'A'*9999+'\U00010000' 336 (-24%) 261 (-3%) 254
utf-8 '\x80'*10000 88 (+66%) 65 (+125%) 146
utf-8 '\x80'+'A'*9999 334 (+8%) 124 (+190%) 360
utf-8 '\x80'*9999+'\u0100' 88 (+43%) 65 (+94%) 126
utf-8 '\x80'*9999+'\u8000' 88 (+43%) 65 (+94%) 126
utf-8 '\x80'*9999+'\U00010000' 89 (+40%) 65 (+92%) 125
utf-8 '\u0100'*10000 88 (+85%) 65 (+151%) 163
utf-8 '\u0100'+'A'*9999 336 (+2%) 77 (+345%) 343
utf-8 '\u0100'+'\x80'*9999 88 (+86%) 65 (+152%) 164
utf-8 '\u0100'*9999+'\u8000' 88 (+86%) 65 (+152%) 164
utf-8 '\u0100'*9999+'\U00010000' 88 (+57%) 65 (+112%) 138
utf-8 '\u8000'*10000 98 (+79%) 69 (+154%) 175
utf-8 '\u8000'+'A'*9999 339 (+3%) 77 (+353%) 349
utf-8 '\u8000'+'\x80'*9999 89 (+84%) 66 (+148%) 164
utf-8 '\u8000'+'\u0100'*9999 88 (+86%) 65 (+152%) 164
utf-8 '\u8000'*9999+'\U00010000' 98 (+58%) 69 (+125%) 155
utf-8 '\U00010000'*10000 104 (+46%) 79 (+92%) 152
utf-8 '\U00010000'+'A'*9999 339 (-5%) 124 (+160%) 323
utf-8 '\U00010000'+'\x80'*9999 88 (+84%) 68 (+138%) 162
utf-8 '\U00010000'+'\u0100'*9999 88 (+83%) 68 (+137%) 161
utf-8 '\U00010000'+'\u8000'*9999 98 (+63%) 72 (+122%) 160
ascii 'A'*10000 132 (+499%) 758 (+4%) 791
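The test strings differ in how many UTF-8 bytes each character needs and in where the first non-ASCII character appears. A small sketch that lists the encoded width of each character used in the tables (the names `SAMPLES` and `utf8_width` are only for illustration):

```python
SAMPLES = ['A', '\x80', '\u0100', '\u8000', '\U00010000']

def utf8_width(ch):
    """Number of bytes the single character `ch` occupies in UTF-8."""
    return len(ch.encode('utf-8'))

for ch in SAMPLES:
    print('U+%04X -> %d byte(s)' % (ord(ch), utf8_width(ch)))

# A row such as 'A'*9999+'\x80' is 9999 one-byte characters followed by
# a single two-byte character, so the encoded input is 10001 bytes long.
data = ('A' * 9999 + '\x80').encode('utf-8')
print(len(data))
```

In 3.3 the widest character in a string also decides its internal storage width (PEP 393), which is presumably why a single trailing wide character can change the numbers so much.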
|
msg160107 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-05-06 20:01 |
64-bit Linux, Intel Core i5 2500K:
3.2 3.3 patched
utf-8 'A'*10000 2550 (+198%) 6828 (+11%) 7607
utf-8 'A'*9999+'\x80' 2501 (+118%) 2415 (+126%) 5456
utf-8 'A'*9999+'\u0100' 2501 (-20%) 2297 (-13%) 1996
utf-8 'A'*9999+'\u8000' 2494 (-14%) 2291 (-7%) 2133
utf-8 'A'*9999+'\U00010000' 2494 (-11%) 2293 (-3%) 2219
utf-8 '\x80'*10000 422 (+135%) 517 (+92%) 991
utf-8 '\x80'+'A'*9999 2513 (+12%) 860 (+228%) 2820
utf-8 '\x80'*9999+'\u0100' 426 (+102%) 525 (+64%) 862
utf-8 '\x80'*9999+'\u8000' 426 (+104%) 538 (+62%) 871
utf-8 '\x80'*9999+'\U00010000' 428 (+105%) 523 (+68%) 878
utf-8 '\u0100'*10000 425 (+140%) 517 (+97%) 1019
utf-8 '\u0100'+'A'*9999 2488 (+2%) 820 (+211%) 2549
utf-8 '\u0100'+'\x80'*9999 426 (+139%) 517 (+97%) 1019
utf-8 '\u0100'*9999+'\u8000' 426 (+139%) 529 (+93%) 1019
utf-8 '\u0100'*9999+'\U00010000' 426 (+106%) 509 (+72%) 876
utf-8 '\u8000'*10000 573 (+28%) 490 (+50%) 733
utf-8 '\u8000'+'A'*9999 2500 (+1%) 822 (+208%) 2528
utf-8 '\u8000'+'\x80'*9999 426 (+139%) 530 (+92%) 1018
utf-8 '\u8000'+'\u0100'*9999 428 (+138%) 509 (+100%) 1018
utf-8 '\u8000'*9999+'\U00010000' 573 (+17%) 447 (+51%) 673
utf-8 '\U00010000'*10000 562 (+24%) 552 (+26%) 696
utf-8 '\U00010000'+'A'*9999 2512 (+3%) 939 (+175%) 2584
utf-8 '\U00010000'+'\x80'*9999 423 (+140%) 553 (+84%) 1017
utf-8 '\U00010000'+'\u0100'*9999 426 (+139%) 549 (+85%) 1017
utf-8 '\U00010000'+'\u8000'*9999 572 (+18%) 479 (+41%) 674
|
msg160110 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-06 21:48 |
Thank you, Antoine. Finally Intel Core is defeated!
If anyone wants to repeat the tests, see the benchmark tools in issue14624.
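For readers without access to those tools, here is a rough sketch of how decoding throughput could be measured with the standard `timeit` module. It is not the benchmark script from issue14624; the helper name `decode_speed_mb_s`, the repeat counts, and the choice of 1 MB = 10**6 bytes are all assumptions.

```python
import timeit

def decode_speed_mb_s(text, encoding='utf-8', number=1000, repeat=5):
    """Estimate decoding speed in MB/s, taking the best of `repeat` runs."""
    data = text.encode(encoding)
    timer = timeit.Timer(lambda: data.decode(encoding))
    best = min(timer.repeat(repeat=repeat, number=number))
    # Bytes decoded per second, scaled to megabytes (assumed 10**6 bytes).
    return len(data) * number / best / 1e6

if __name__ == '__main__':
    cases = [
        ("'A'*10000", 'A' * 10000),
        ("'A'*9999+'\\x80'", 'A' * 9999 + '\x80'),
        ("'\\u8000'*10000", '\u8000' * 10000),
    ]
    for label, text in cases:
        print('%-20s %8.0f MB/s' % (label, decode_speed_mb_s(text)))
```

Absolute numbers from such a sketch depend heavily on the machine and build, so only comparisons made on the same machine are meaningful.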
|
msg160112 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-06 22:11 |
The patch is updated in accordance with Antoine's cosmetic comments.
|
msg160305 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-05-09 16:50 |
There's a Mac-specific portion in the patch; it would be nice if someone could check that it works.
|
msg160306 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-09 18:05 |
It would be good if someone checked on Macs that command-line arguments work, including invalid UTF-8. The difficulty is that you need to check on Macs with both 16-bit and 32-bit wchar_t.
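For context, a sketch of what invalid UTF-8 in an argument can look like at the Python level. It assumes the argument is decoded with the `surrogateescape` error handler (PEP 383); the byte string is made up for illustration, and this does not exercise the C code path touched by the patch.

```python
raw = b'caf\xc3\xa9-\xff'  # valid UTF-8 followed by a stray 0xFF byte

# Strict decoding rejects the invalid byte.
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    print('strict decoding failed:', exc.reason)

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN,
text = raw.decode('utf-8', 'surrogateescape')
print(ascii(text))

# so the original bytes can be recovered exactly when encoding back.
assert text.encode('utf-8', 'surrogateescape') == raw
```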
|
msg160307 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-09 18:32 |
Issue4388 is related to this Mac-specific portion of the patch.
|
msg160308 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-05-09 18:41 |
> It would be good if someone checked on Macs work with command line
> arguments, including non-valid utf8. The difficulty is that you need
> to check on both Macs with 16-bit and with 32-bit wchar_t.
Actually, it should be enough to run the test suite, since we should
have tests for this.
As for different wchar_t widths, that's the kind of thing we can leave
to the buildbots (assuming our OS X buildbots come back alive some
day :-)).
|
msg160309 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-09 19:29 |
I hacked the code (commented out "#if __APPLE__" in
Objects/unicodeobject.c and Modules/python.c) to enable this branch on
Linux and ran the test (test_cmd_line) with the C locale. It passed. Then I
broke the decoder and ran the test again to make sure it fails. I can now
confirm that the code works correctly on a platform with a 32-bit wchar_t.
|
msg160311 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-05-09 20:13 |
> Actually, it should be enough to run the test suite, since we should
> have tests for this.
I just ran the test suite ("python -m test") on OS X 10.6.8 with 'decode_utf8_5.patch' applied. (64-bit --with-pydebug build of Python.) No test failures.
test header:
== CPython 3.3.0a3+ (default:840cb46d0395+, May 9 2012, 20:55:18) [GCC 4.2.1 (Apple Inc. build 5664)]
== Darwin-10.8.0-i386-64bit little-endian
== /Users/mdickinson/Python/cpython/build/test_python_39794
Fragment of configure output relevant to wchar looked like this:
checking wchar.h usability... yes
checking wchar.h presence... yes
checking for wchar.h... yes
checking size of wchar_t... 4
checking for UCS-4 tcl... no
checking whether wchar_t is signed... yes
no usable wchar_t found
|
msg160312 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2012-05-09 20:18 |
> The difficulty is that you need to check on both Macs
> with 16-bit and with 32-bit wchar_t.
I don't think that the size of wchar_t is configurable: it should always be 32 bits on Mac OS X.
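The width of `wchar_t` on a given machine can be checked from Python with `ctypes`. The sketch below also shows why the width matters to a decoder: a character outside the BMP needs two 16-bit units but only one 32-bit unit. The helper name `code_units` is hypothetical.

```python
import ctypes

# Size of the platform's wchar_t in bytes
# (typically 4 on Linux and OS X, 2 on Windows).
WCHAR_BYTES = ctypes.sizeof(ctypes.c_wchar)
print('wchar_t is %d bits' % (WCHAR_BYTES * 8))

def code_units(text, unit_bytes):
    """Count the 16-bit or 32-bit code units needed to store `text`."""
    encoding = {2: 'utf-16-le', 4: 'utf-32-le'}[unit_bytes]
    return len(text.encode(encoding)) // unit_bytes

ch = '\U00010000'
print(code_units(ch, 2))  # stored as a surrogate pair
print(code_units(ch, 4))  # stored as a single unit
```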
|
msg160346 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-05-10 14:38 |
New changeset e08c3791f035 by Antoine Pitrou in branch 'default':
Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.
http://hg.python.org/cpython/rev/e08c3791f035
|
msg160347 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-05-10 14:38 |
The patch is now committed. Well done and thanks for your contribution.
|
msg160447 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-05-11 19:45 |
Thanks, Martin, for the review, which allowed me to make a quality patch, and for encouraging further research. Thanks, Antoine, for the review, benchmarks, and commit, and for the original optimization, which served as the basis for my patch.
|
msg160462 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2012-05-12 07:09 |
If the commit makes Python 3.3 faster than Python 3.2, it is an
optimisation that should be documented in the What's New in Python 3.3
document.
|
Date | User | Action | Args |
2022-04-11 14:57:29 | admin | set | github: 58943 |
2012-05-12 07:09:09 | vstinner | set | messages: + msg160462 |
2012-05-11 21:58:22 | pitrou | link | issue14419 superseder |
2012-05-11 21:58:22 | pitrou | unlink | issue14419 dependencies |
2012-05-11 21:58:14 | pitrou | link | issue14419 dependencies |
2012-05-11 19:45:44 | serhiy.storchaka | set | messages: + msg160447 |
2012-05-10 14:38:47 | pitrou | set | status: open -> closed; resolution: fixed; messages: + msg160347; stage: patch review -> resolved |
2012-05-10 14:38:11 | python-dev | set | nosy: + python-dev; messages: + msg160346 |
2012-05-09 20:18:21 | vstinner | set | messages: + msg160312 |
2012-05-09 20:13:57 | mark.dickinson | set | nosy: + mark.dickinson; messages: + msg160311 |
2012-05-09 19:29:53 | serhiy.storchaka | set | messages: + msg160309 |
2012-05-09 18:41:36 | pitrou | set | nosy: + janssen |
2012-05-09 18:41:16 | pitrou | set | messages: + msg160308 |
2012-05-09 18:32:09 | serhiy.storchaka | set | messages: + msg160307 |
2012-05-09 18:05:08 | serhiy.storchaka | set | messages: + msg160306 |
2012-05-09 16:50:50 | pitrou | set | nosy: + ronaldoussoren, ned.deily; messages: + msg160305 |
2012-05-06 22:11:07 | serhiy.storchaka | set | files: + decode_utf8_5.patch; messages: + msg160112 |
2012-05-06 21:48:10 | serhiy.storchaka | set | messages: + msg160110 |
2012-05-06 20:01:02 | pitrou | set | messages: + msg160107 |
2012-05-06 18:30:06 | ezio.melotti | set | nosy: + ezio.melotti; components: + Unicode; stage: patch review |
2012-05-06 18:00:54 | serhiy.storchaka | create | |