Message349285
Currently, PyLong_FromSize_t() calls PyLong_FromLong() for values < PyLong_BASE. This is suboptimal because PyLong_FromLong() has to handle the sign, which an unsigned size_t value never needs. Removing the PyLong_FromLong() call and handling small ints directly in PyLong_FromSize_t() makes it faster:
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 18.7 ns +- 0.3 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 16.7 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 18.7 ns +- 0.3 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 16.7 ns +- 0.1 ns: 1.12x faster (-10%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**10).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 26.2 ns +- 0.0 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.0 ns +- 0.7 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 26.2 ns +- 0.0 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.0 ns +- 0.7 ns: 1.05x faster (-5%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**30).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 25.6 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.6 ns +- 0.0 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 25.6 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.6 ns +- 0.0 ns: 1.00x faster (-0%)
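The three sizes benchmarked above (2, 2**10 and 2**30) plausibly exercise different paths. Here is a rough Python model of that, not the C code itself. The helper name classify() is hypothetical, and the small-int cache range of -5..256 is an assumption about CPython internals rather than something stated in this message. PyLong_BASE is modelled as 1 << sys.int_info.bits_per_digit.

```python
import sys

# Assumption: CPython's small-int cache covers -5..256. This is an
# implementation detail, not a language guarantee.
SMALL_INT_MIN, SMALL_INT_MAX = -5, 256


def classify(value, bits_per_digit=sys.int_info.bits_per_digit):
    """Name the path a non-negative value would presumably take."""
    base = 1 << bits_per_digit  # models PyLong_BASE
    if SMALL_INT_MIN <= value <= SMALL_INT_MAX:
        return "small-int cache"
    if value < base:
        return "single digit"
    return "multiple digits"


if __name__ == "__main__":
    for n in (2, 2**10, 2**30):
        print(n, classify(n))
```

On a build with 30-bit digits, 2**30 is not below PyLong_BASE, so it would not reach the code path this change touches. That is consistent with the 1.00x result for that size.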
This change makes PyLong_FromSize_t() consistently faster than PyLong_FromSsize_t(), so it might make sense to replace PyLong_FromSsize_t() with PyLong_FromSize_t() in __length_hint__() implementations and in similar cases where the value is known to be non-negative. For example:
$ python -m perf timeit -s "_len = iter(bytes(2)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 19.4 ns +- 0.3 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 17.3 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 19.4 ns +- 0.3 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 17.3 ns +- 0.1 ns: 1.12x faster (-11%)
$ python -m perf timeit -s "_len = iter(bytes(2**10)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 26.3 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.3 ns +- 0.2 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 26.3 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.3 ns +- 0.2 ns: 1.04x faster (-4%)
$ python -m perf timeit -s "_len = iter(bytes(2**30)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 27.6 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 26.0 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 27.6 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 26.0 ns +- 0.1 ns: 1.06x faster (-6%)
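For a quick sanity check without the perf module, the same calls can be timed with the standard library's timeit. This is only a rough sketch: time_length_hint() is a hypothetical helper, and because timeit invokes the bound method through a Python-level callable, the absolute numbers will be higher and noisier than the perf results above. Use it only to compare two interpreters against each other.

```python
import timeit
from itertools import repeat


def time_length_hint(make_iter, size, number=100_000, repeats=5):
    """Return the best observed time per __length_hint__() call, in ns."""
    _len = make_iter(size).__length_hint__
    best = min(timeit.repeat(_len, number=number, repeat=repeats))
    return best / number * 1e9


if __name__ == "__main__":
    makers = {
        "repeat": lambda n: repeat(None, n),
        "bytes": lambda n: iter(bytes(n)),
    }
    for name, make_iter in makers.items():
        # 2**30 is left out here because bytes(2**30) allocates 1 GiB.
        for size in (2, 2**10):
            ns = time_length_hint(make_iter, size)
            print(f"{name} size={size}: {ns:.1f} ns")
```

Run the script once under each interpreter being compared and look at the difference between the two sets of figures rather than the absolute values.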
Date | User | Action | Args
2019-08-09 13:27:40 | sir-sigurd | set | recipients: + sir-sigurd
2019-08-09 13:27:40 | sir-sigurd | set | messageid: <1565357260.71.0.37857302813.issue37802@roundup.psfhosted.org>
2019-08-09 13:27:40 | sir-sigurd | link | issue37802 messages
2019-08-09 13:27:40 | sir-sigurd | create |