classification
Title: improve performance of binascii.unhexlify() by using conversion table
Type: performance
Stage: resolved
Components: Library (Lib)
Versions: Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: cheryl.sabella, meador.inge, pitrou, serhiy.storchaka, sir-sigurd
Priority: normal
Keywords: patch

Created on 2017-11-27 12:14 by sir-sigurd, last changed 2019-03-20 04:02 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked
PR 4586 merged sir-sigurd, 2017-11-27 12:16
Messages (11)
msg307053 - Author: Sergey Fedoseev (sir-sigurd) * Date: 2017-11-27 12:14
Before:

$ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
50 loops, best of 5: 5.68 msec per loop

After:

$ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
100 loops, best of 5: 2.06 msec per loop
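
The patch itself is C code in the binascii module and is not reproduced in this thread. As a rough illustration of what "using a conversion table" means, here is a minimal pure-Python sketch. The names `HEX_TABLE` and `unhexlify_with_table` are hypothetical and chosen for this example only; the sketch shows the lookup idea (one table index per input byte instead of a chain of range comparisons), not the merged implementation, and it will not be fast in pure Python.

```python
import binascii

# Hypothetical 256-entry table: index is a byte value, entry is the hex
# digit's numeric value, or INVALID for bytes that are not hex digits.
INVALID = 255
HEX_TABLE = [INVALID] * 256
for value, ch in enumerate(b"0123456789"):
    HEX_TABLE[ch] = value
for value, ch in enumerate(b"abcdef", start=10):
    HEX_TABLE[ch] = value
for value, ch in enumerate(b"ABCDEF", start=10):
    HEX_TABLE[ch] = value


def unhexlify_with_table(data: bytes) -> bytes:
    """Decode hex digits to bytes with one table lookup per input byte."""
    if len(data) % 2:
        raise binascii.Error("Odd-length string")
    out = bytearray(len(data) // 2)
    for i in range(0, len(data), 2):
        high = HEX_TABLE[data[i]]
        low = HEX_TABLE[data[i + 1]]
        if high == INVALID or low == INVALID:
            raise binascii.Error("Non-hexadecimal digit found")
        out[i // 2] = (high << 4) | low
    return bytes(out)
```

The result can be compared against `binascii.unhexlify()` on the same input to check that the two agree.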
msg307368 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-01 07:21
I can't reproduce the performance difference.
msg307378 - Author: Sergey Fedoseev (sir-sigurd) * Date: 2017-12-01 10:06
Serhiy, did you use the same benchmark as mentioned here?
msg307379 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-01 10:13
Yes. And I can't reproduce a slowdown with a simplified a2b_qp(). Maybe this depends on the compiler and on the CPU. What are your OS, compiler and CPU? Do you build 32- or 64-bit Python? Do you build in a debug or release mode?
msg307381 - Author: Sergey Fedoseev (sir-sigurd) * Date: 2017-12-01 10:37
> OS
x86_64 GNU/Linux

> compiler
gcc version 7.2.0 (Debian 7.2.0-16)

> CPU
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               58
Model name:          Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
Stepping:            9
CPU MHz:             2494.521
CPU max MHz:         3100,0000
CPU min MHz:         1200,0000
BogoMIPS:            4989.04
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            3072K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm cpuid_fault epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

> Do you build 32- or 64-bit Python?
I'm not sure about that; I assume 64-bit is the default on a 64-bit OS?

> Do you build in a debug or release mode?
I tried with --enable-optimizations, with --with-pydebug, and without any flags. The numbers differ, but the magnitude of the change is the same.
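
The build questions above (32- or 64-bit, compiler, debug or release) can usually be answered from the interpreter itself. The following is a small sketch using only standard-library calls; the helper name `describe_build` is hypothetical, and the debug check relies on the assumption that `sys.gettotalrefcount` is present only in debug builds.

```python
import platform
import struct
import sys
import sysconfig


def describe_build() -> dict:
    """Collect basic facts about how the running interpreter was built."""
    return {
        # Size of a C pointer in bits: 32 or 64 on common platforms.
        "pointer_bits": struct.calcsize("P") * 8,
        # Compiler identification string recorded at build time.
        "compiler": platform.python_compiler(),
        "machine": platform.machine(),
        # Assumption: sys.gettotalrefcount exists only in debug builds.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # ./configure arguments; may be None where configure is not used.
        "configure_args": sysconfig.get_config_var("CONFIG_ARGS"),
    }


if __name__ == "__main__":
    for key, value in describe_build().items():
        print(f"{key}: {value}")
```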
msg307504 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-03 11:04
Here are the results on my machine:

- Before:
$ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
50 loops, best of 5: 4.37 msec per loop

- After:
$ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
200 loops, best of 5: 1.16 msec per loop
msg307505 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-03 11:05
(platform is Ubuntu 16.04, 64-bit, on a Core i5-2500K CPU)
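
For anyone who wants to repeat the measurement from a script rather than the command line, here is a sketch built on the `timeit` module. The helper name `best_msec_per_loop` and its default loop counts are assumptions made for this example. Because the statement is wrapped in a lambda, each call carries a little extra overhead, so the absolute numbers may differ slightly from `python -m timeit`.

```python
import timeit
from binascii import unhexlify


def best_msec_per_loop(payload: bytes, number: int = 20, repeat: int = 5) -> float:
    """Return the best per-call time in milliseconds over `repeat` runs."""
    timings = timeit.repeat(lambda: unhexlify(payload), number=number, repeat=repeat)
    # Each timing covers `number` calls; take the fastest run.
    return min(timings) / number * 1e3


if __name__ == "__main__":
    # Same input as in the thread: 2**20 repetitions of b'aa'.
    payload = b"aa" * 2**20
    print(f"best of 5: {best_msec_per_loop(payload):.2f} msec per loop")
```

Running the script once on a build without the patch and once on a build with it gives the two numbers to compare.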
msg308491 - Author: Sergey Fedoseev (sir-sigurd) * Date: 2017-12-17 18:14
Is there anything I can do to push this forward?

BTW, Serhiy, what are your OS, compiler and CPU?
msg308498 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2017-12-17 20:19
FWIW, I see a win on OS X 10.12.6:

λ:[master !?](~/Code/src/python/cpython)=> cc --version
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
λ:[master !?](~/Code/src/python/cpython)=> uname -a
Darwin ripley.attlocal.net 16.7.0 Darwin Kernel Version 16.7.0: Wed Oct  4 00:17:00 PDT 2017; root:xnu-3789.71.6~1/RELEASE_X86_64 x86_64

- Before:
λ:[master ?](~/Code/src/python/cpython)=> ./python.exe -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
20 loops, best of 5: 11.3 msec per loop

- After:
λ:[master !?](~/Code/src/python/cpython)=> ./python.exe -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
50 loops, best of 5: 4.15 msec per loop
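
To put the three sets of timings reported in this thread side by side, the speedup is simply the "before" time divided by the "after" time. The sketch below does that arithmetic on the numbers quoted above; the dictionary labels are informal descriptions added for this example. The ratios come out at roughly 2.7x to 3.8x.

```python
# (before_msec, after_msec) pairs as reported in the messages above.
reports = {
    "sir-sigurd (Linux, gcc 7.2.0, i5-3210M)": (5.68, 2.06),
    "pitrou (Ubuntu 16.04, i5-2500K)": (4.37, 1.16),
    "meador.inge (OS X 10.12.6, Apple LLVM 8.1.0)": (11.3, 4.15),
}


def speedup(before_msec: float, after_msec: float) -> float:
    """Ratio of the old time to the new time; above 1.0 means faster."""
    return before_msec / after_msec


if __name__ == "__main__":
    for who, (before, after) in reports.items():
        print(f"{who}: {speedup(before, after):.2f}x")
```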
msg312951 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-26 20:35
New changeset 6b5df906afe113dbe421d044322254cfd4747c9c by Serhiy Storchaka (Sergey Fedoseev) in branch 'master':
bpo-32147: Improved performance of binascii.unhexlify(). (GH-4586)
https://github.com/python/cpython/commit/6b5df906afe113dbe421d044322254cfd4747c9c
msg338413 - Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2019-03-19 22:44
Since this PR was merged, can the issue be closed?
History
Date User Action Args
2019-03-20 04:02:39 serhiy.storchaka set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-03-19 22:44:58 cheryl.sabella set nosy: + cheryl.sabella
messages: + msg338413
2018-02-26 20:35:47 serhiy.storchaka set messages: + msg312951
2018-02-26 20:19:06 ned.deily set versions: + Python 3.8, - Python 3.7
2017-12-17 20:19:31 meador.inge set nosy: + meador.inge
messages: + msg308498
2017-12-17 18:14:27 sir-sigurd set messages: + msg308491
2017-12-03 11:05:09 pitrou set messages: + msg307505
2017-12-03 11:04:35 pitrou set nosy: + pitrou
messages: + msg307504
2017-12-01 10:37:12 sir-sigurd set messages: + msg307381
2017-12-01 10:13:02 serhiy.storchaka set messages: + msg307379
2017-12-01 10:06:23 sir-sigurd set messages: + msg307378
2017-12-01 07:21:00 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg307368
versions: + Python 3.7
2017-11-27 12:21:52 sir-sigurd set components: + Library (Lib)
2017-11-27 12:16:56 sir-sigurd set keywords: + patch
stage: patch review
pull_requests: + pull_request4508
2017-11-27 12:14:29 sir-sigurd create