Message279419
Currently the utf7 encoder uses an aggressive memory allocation strategy: it always allocates for the worst case of 8 bytes per character. We can tighten the worst case.
For 1-byte and 2-byte unicode strings, the worst case is 3*n + 2 bytes, where n is the string length. For 4-byte unicode strings, the worst case is 6*n + 2.
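There are 2 cases. First, all characters need to be encoded: each character contributes 16 bits, base64 emits one byte per 6 bits, and the run is wrapped in '+' and '-', so the result length is ceil(2.67*n) + 2 <= 3*n + 2. Second, characters that need encoding alternate one by one with characters that don't. For even length, it's 3*n < 3*n + 2. For odd length (starting and ending with an encoded character), it's exactly 3*n + 2.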
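The two cases can be checked empirically from Python. This is only a sketch, not code from the patch: `utf7_upper_bound` is a hypothetical helper that restates the proposed bound, and the sample strings are arbitrary choices ("\u00e9" needs encoding, "a" does not).

```python
def utf7_upper_bound(s: str) -> int:
    """Hypothetical helper: the proposed worst-case size of s.encode('utf-7')."""
    n = len(s)
    # Characters above U+FFFF are encoded as a surrogate pair (two 16-bit
    # units), so they get the 6*n + 2 bound; everything else gets 3*n + 2.
    factor = 6 if any(ord(c) > 0xFFFF for c in s) else 3
    return factor * n + 2


samples = {
    "all encoded": "\u00e9" * 7,
    "alternating, even length": "\u00e9a" * 4,
    "alternating, odd length": "\u00e9a" * 3 + "\u00e9",
    "all non-BMP": "\U0001F600" * 5,
}

for name, s in samples.items():
    encoded = s.encode("utf-7")
    # Print input length, actual encoded length, and the proposed bound.
    print(f"{name}: n={len(s)} encoded={len(encoded)} bound={utf7_upper_bound(s)}")
```

The odd-length alternating sample is the one that reaches the 3*n + 2 bound exactly; the others stay below it.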
This won't benefit much when the string is short. But when the string is long, encoding gets noticeably faster (178 us down to 102 us for "abc"*10000 in the timings below).
Without patch:
[bin]$ ./python3 -m perf timeit -s 's = "abc"*10' 's.encode("utf7")'
....................
Median +- std dev: 2.79 us +- 0.09 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*100' 's.encode("utf7")'
....................
Median +- std dev: 4.55 us +- 0.13 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*1000' 's.encode("utf7")'
....................
Median +- std dev: 14.0 us +- 0.4 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*10000' 's.encode("utf7")'
....................
Median +- std dev: 178 us +- 1 us
With patch:
[bin]$ ./python3 -m perf timeit -s 's = "abc"*10' 's.encode("utf7")'
....................
Median +- std dev: 2.87 us +- 0.09 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*100' 's.encode("utf7")'
....................
Median +- std dev: 4.50 us +- 0.23 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*1000' 's.encode("utf7")'
....................
Median +- std dev: 13.3 us +- 0.4 us
[bin]$ ./python3 -m perf timeit -s 's = "abc"*10000' 's.encode("utf7")'
....................
Median +- std dev: 102 us +- 1 us
The patch also removes a redundant check: base64bits can only be nonzero when inShift is nonzero.
Date | User | Action | Args
2016-10-25 16:16:46 | xiang.zhang | set | recipients: + xiang.zhang, vstinner, serhiy.storchaka
2016-10-25 16:16:46 | xiang.zhang | set | messageid: <1477412206.88.0.583380682603.issue28531@psf.upfronthosting.co.za>
2016-10-25 16:16:46 | xiang.zhang | link | issue28531 messages
2016-10-25 16:16:46 | xiang.zhang | create |