Message175129
Oh, I forgot my benchmark results.
decodebench.py results on Linux 32 bits:
(Linux-3.2.0-32-generic-pae-i686-with-debian-wheezy-sid)
$ ./python bench-diff.py original writer
ascii 'A'*10000 4109 (-3%) 3974
latin1 'A'*10000 3851 (-5%) 3644
latin1 '\x80'*10000 14832 (-3%) 14430
utf-8 'A'*10000 3747 (-4%) 3608
utf-8 '\x80'*10000 976 (-2%) 961
utf-8 '\u0100'*10000 974 (-2%) 959
utf-8 '\u8000'*10000 804 (-14%) 694
utf-8 '\U00010000'*10000 666 (-5%) 635
utf-16le 'A'*10000 4154 (-1%) 4117
utf-16le '\x80'*10000 4055 (-2%) 3988
utf-16le '\u0100'*10000 4047 (-2%) 3974
utf-16le '\u8000'*10000 917 (-1%) 912
utf-16le '\U00010000'*10000 872 (-0%) 870
utf-16be 'A'*10000 3218 (-1%) 3185
utf-16be '\x80'*10000 3163 (-2%) 3114
utf-16be '\u0100'*10000 2591 (-1%) 2556
utf-16be '\u8000'*10000 979 (-1%) 974
utf-16be '\U00010000'*10000 928 (-0%) 925
utf-32le 'A'*10000 1681 (+12%) 1885
utf-32le '\x80'*10000 1697 (+10%) 1865
utf-32le '\u0100'*10000 2224 (+1%) 2254
utf-32le '\u8000'*10000 2224 (+2%) 2269
utf-32le '\U00010000'*10000 2234 (+1%) 2260
utf-32be 'A'*10000 1685 (+11%) 1868
utf-32be '\x80'*10000 1684 (+10%) 1860
utf-32be '\u0100'*10000 2223 (+1%) 2253
utf-32be '\u8000'*10000 2222 (+1%) 2255
utf-32be '\U00010000'*10000 2243 (+1%) 2257
decodebench.py results on Linux 64 bits:
(Linux-3.4.9-2.fc16.x86_64-x86_64-with-fedora-16-Verne)
ascii 'A'*10000 10043 (+1%) 10144
latin1 'A'*10000 8351 (-1%) 8258
latin1 '\x80'*10000 19184 (+2%) 19560
utf-8 'A'*10000 8083 (+5%) 8461
utf-8 '\x80'*10000 982 (+1%) 993
utf-8 '\u0100'*10000 984 (+1%) 992
utf-8 '\u8000'*10000 806 (+31%) 1053
utf-8 '\U00010000'*10000 639 (+12%) 718
utf-16le 'A'*10000 5547 (-2%) 5422
utf-16le '\x80'*10000 5205 (+1%) 5271
utf-16le '\u0100'*10000 4900 (-4%) 4695
utf-16le '\u8000'*10000 1062 (+9%) 1154
utf-16le '\U00010000'*10000 1040 (+4%) 1078
utf-16be 'A'*10000 5416 (-5%) 5157
utf-16be '\x80'*10000 5077 (-1%) 5011
utf-16be '\u0100'*10000 4261 (-1%) 4218
utf-16be '\u8000'*10000 1146 (+0%) 1147
utf-16be '\U00010000'*10000 1125 (-1%) 1119
utf-32le 'A'*10000 1743 (+8%) 1880
utf-32le '\x80'*10000 1751 (+5%) 1842
utf-32le '\u0100'*10000 2114 (+29%) 2721
utf-32le '\u8000'*10000 2120 (+28%) 2718
utf-32le '\U00010000'*10000 2065 (+30%) 2690
utf-32be 'A'*10000 1761 (+6%) 1860
utf-32be '\x80'*10000 1749 (+6%) 1856
utf-32be '\u0100'*10000 2101 (+29%) 2715
utf-32be '\u8000'*10000 2083 (+30%) 2715
utf-32be '\U00010000'*10000 2058 (+31%) 2689
Most significant changes:
* -14% to decode '\u8000'*10000 from UTF-8 on Linux 32 bits
* +31% to decode '\u8000'*10000 from UTF-8 on Linux 64 bits
* +28% to +31% to decode UCS-2 and UCS-4 characters from UTF-32 on Linux 64 bits
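
The percentage column in the tables above appears to be the relative change from the first number (original) to the second (writer), rounded to a whole percent. A minimal sketch under that assumption; rel_change is a hypothetical helper for illustration, not part of bench-diff.py or decodebench.py:

```python
def rel_change(original, writer):
    """Relative change from original to writer as a signed whole percent.

    Assumption: this mirrors how the percentage column is derived;
    the real bench-diff.py may compute or round it differently.
    """
    return "%+.0f%%" % ((writer - original) * 100.0 / original)


# Rows copied from the tables above: (label, original, writer)
rows = [
    ("32 bits, utf-8 '\\u8000'*10000", 804, 694),
    ("64 bits, utf-8 '\\u8000'*10000", 806, 1053),
    ("64 bits, utf-32le '\\u8000'*10000", 2120, 2718),
    ("64 bits, utf-32be '\\U00010000'*10000", 2058, 2689),
]

for label, original, writer in rows:
    # Print in the same "original (change) writer" layout as the tables
    print(label, original, "(%s)" % rel_change(original, writer), writer)
```

Under this formula, small negative differences such as 872 to 870 format as "-0%", which matches the rows in the tables.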
@Serhiy Storchaka: If you feel able to tune _PyUnicodeWriter to
improve its performance, please open a new issue.
I consider the performance changes acceptable and I don't plan to work
on this topic.

Date | User | Action | Args
2012-11-07 22:53:54 | vstinner | set | recipients: + vstinner, loewis, python-dev, serhiy.storchaka
2012-11-07 22:53:54 | vstinner | link | issue16311 messages
2012-11-07 22:53:52 | vstinner | create |