Title: add internal _PyLong_FromUnsignedChar() function
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.9
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: jdemeyer, sir-sigurd
Priority: normal Keywords: patch

Created on 2019-08-13 10:29 by sir-sigurd, last changed 2019-08-13 12:51 by jdemeyer.

Pull Requests
URL Status Linked
PR 15251 open sir-sigurd, 2019-08-13 10:34
Messages (2)
msg349540 - Author: Sergey Fedoseev (sir-sigurd) Date: 2019-08-13 10:29
When compiled with default NSMALLPOSINTS, _PyLong_FromUnsignedChar() is significantly faster than other PyLong_From*():

$ python -m perf timeit -s "from collections import deque; consume = deque(maxlen=0).extend; b = bytes(2**20)" "consume(b)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 7.10 ms +- 0.02 ms
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 4.29 ms +- 0.03 ms

Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 7.10 ms +- 0.02 ms -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 4.29 ms +- 0.03 ms: 1.66x faster (-40%)

It's mostly useful for bytes/bytearray, but it can also be used in several other places.
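
To illustrate why a constructor specialized for unsigned char can skip work when NSMALLPOSINTS has its default value, here is a minimal toy model in C. It is a sketch under stated assumptions, not the code from PR 15251: toy_int, toy_small_ints, toy_init, toy_from_long and toy_from_unsigned_char are hypothetical names, and only the two NSMALL* values mirror CPython's defaults. The point is that every unsigned char value (0..255) falls inside the small-int cache, so the range check and the allocation path are not needed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Same values as CPython's defaults; everything else is a toy model. */
#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

/* Hypothetical stand-in for an int object. */
typedef struct { long value; } toy_int;

/* Preallocated objects for every value in [-NSMALLNEGINTS, NSMALLPOSINTS). */
toy_int toy_small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
int toy_ready = 0;

/* Fill the cache; must be called once before the constructors below. */
void toy_init(void)
{
    for (int i = 0; i < NSMALLNEGINTS + NSMALLPOSINTS; i++) {
        toy_small_ints[i].value = i - NSMALLNEGINTS;
    }
    toy_ready = 1;
}

/* Generic constructor: a range check on every call, then cache or heap.
   In this toy, heap results are owned by the caller and cached ones are not. */
toy_int *toy_from_long(long v)
{
    assert(toy_ready);
    if (-NSMALLNEGINTS <= v && v < NSMALLPOSINTS) {
        return &toy_small_ints[v + NSMALLNEGINTS];
    }
    toy_int *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->value = v;
    }
    return obj;
}

/* Specialized constructor: when the cache covers 0..UCHAR_MAX, the lookup
   needs no range check and can never reach the allocation path. */
toy_int *toy_from_unsigned_char(unsigned char c)
{
    assert(toy_ready);
#if NSMALLPOSINTS > UCHAR_MAX
    return &toy_small_ints[c + NSMALLNEGINTS];
#else
    return toy_from_long(c);
#endif
}
```

With a smaller NSMALLPOSINTS the #else branch falls back to the generic path, which matches the "when compiled with default NSMALLPOSINTS" caveat above.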
msg349551 - Author: Jeroen Demeyer (jdemeyer) Date: 2019-08-13 12:51
Maybe an even better idea would be to partially inline PyLong_FromLong(). If the check for small ints in PyLong_FromLong() were inlined, the compiler could optimize that check at each call site. This would benefit all users of PyLong_FromLong() without code changes.
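
A rough sketch of what "partially inlining" could look like, again as a toy model in C rather than CPython's actual code: the small-int check sits in a static inline wrapper, and only the allocation path stays out of line. All mini_* names are hypothetical. When the argument comes from an unsigned char, as in mini_byte_item below, an optimizing compiler can prove that the check always succeeds and may drop both the check and the slow-path call; whether it does so depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same values as CPython's defaults; everything else is a toy model. */
#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

/* Hypothetical stand-in for an int object. */
typedef struct { long value; } mini_int;

/* Preallocated objects for every value in [-NSMALLNEGINTS, NSMALLPOSINTS). */
mini_int mini_small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

/* Fill the cache; must be called once before any lookup. */
void mini_init(void)
{
    for (int i = 0; i < NSMALLNEGINTS + NSMALLPOSINTS; i++) {
        mini_small_ints[i].value = i - NSMALLNEGINTS;
    }
}

/* Out-of-line slow path: allocate a new object for values outside the cache. */
mini_int *mini_from_long_slow(long v)
{
    mini_int *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->value = v;
    }
    return obj;
}

/* Inlined fast path: the range check is visible at every call site, so the
   compiler can fold it when it knows the range of v. */
static inline mini_int *mini_from_long(long v)
{
    if (-NSMALLNEGINTS <= v && v < NSMALLPOSINTS) {
        return &mini_small_ints[v + NSMALLNEGINTS];
    }
    return mini_from_long_slow(v);
}

/* Caller with a narrow argument type: buf[i] is always in 0..255, so the
   check above is always true here and the call site needs no changes. */
mini_int *mini_byte_item(const unsigned char *buf, size_t i)
{
    assert(buf != NULL);
    return mini_from_long(buf[i]);
}
```

The trade-off in this design is code size: each call site carries a copy of the check unless the compiler can eliminate it.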
Date User Action Args
2019-08-13 12:51:22 jdemeyer set nosy: + jdemeyer
messages: + msg349551
2019-08-13 10:34:33 sir-sigurd set keywords: + patch
stage: patch review
pull_requests: + pull_request14971
2019-08-13 10:29:55 sir-sigurd create