Title: Use PEP 590 vectorcall to speed up calls to range(), list() and dict()
Type: enhancement Stage: patch review
Components: Interpreter Core Versions:
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, inada.naoki, jdemeyer, miss-islington
Priority: normal Keywords: patch

Created on 2019-06-09 09:23 by Mark.Shannon, last changed 2020-02-11 16:37 by petr.viktorin.

Pull Requests
URL Status Linked Edit
PR 13930 open Mark.Shannon, 2019-06-09 09:40
PR 14588 merged jdemeyer, 2019-07-04 13:44
PR 18464 open petr.viktorin, 2020-02-11 16:37
Messages (5)
msg345077 - Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2019-06-09 09:23
PEP 590 allows us to short-circuit the __new__, __init__ slow path for commonly created builtin types.
As an initial step, we can speed up calls to range, list and dict by about 30%.
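A minimal sketch of how the per-call cost of these constructors could be measured with the standard library alone. The helper name `time_constructor_calls` and the chosen iteration counts are assumptions for illustration, not part of the patch; absolute numbers depend on the build and the machine.

```python
import timeit


def time_constructor_calls(number=200_000, repeat=5):
    """Return the best observed per-call time in nanoseconds for each constructor."""
    statements = {
        "range": "range(10)",
        "list": "list()",
        "dict": "dict()",
    }
    results = {}
    for name, stmt in statements.items():
        # Take the minimum over several repeats to reduce scheduling noise.
        best = min(timeit.repeat(stmt, number=number, repeat=repeat))
        results[name] = best / number * 1e9
    return results


if __name__ == "__main__":
    for name, ns in time_constructor_calls().items():
        print(f"{name}: {ns:.1f} ns per call")
```

Running the same script under two interpreter builds gives a rough before/after comparison.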
msg347272 - Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-07-04 11:11
Can we call tp_call instead of vectorcall when kwargs is not empty?

For example, dict_init may be faster than dict_vectorcall when `d2 = dict(**d1)`.
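For reference, a small sketch of the call shape in question. The names `d1`, `d2`, `d3` and the helper `copy_via_kwargs` are illustrative only. It shows that `dict(**d1)` passes every key as a keyword argument, which is the case where the keyword-handling path matters, while `dict(d1)` passes one positional mapping.

```python
d1 = {"alpha": 1, "beta": 2}

# Keyword unpacking: each key of d1 becomes a keyword argument of the call.
d2 = dict(**d1)

# Positional form: d1 is passed as a single mapping argument.
d3 = dict(d1)

# Both forms produce a new dict with the same contents.
assert d2 == d1 and d2 is not d1
assert d3 == d1 and d3 is not d1


def copy_via_kwargs(mapping):
    """Copy a mapping through keyword unpacking; return None if the keys are not strings."""
    try:
        return dict(**mapping)
    except TypeError:
        # Keyword names must be strings, so non-string keys are rejected.
        return None


assert copy_via_kwargs({1: "one"}) is None
```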
msg347336 - Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-05 12:31
One thing that keeps bothering me when using vectorcall for type.__call__ is that we would have two completely independent code paths for constructing an object: the new one using vectorcall and the old one using tp_call, which in turn calls tp_new and tp_init.

In typical vectorcall usages, there is no need to support the old way any longer: we can set tp_call = PyVectorcall_Call and that's it. But for "type", we still need to support tp_new and tp_init because there may be C code out there that calls tp_new/tp_init directly. To give one concrete example: collections.defaultdict calls PyDict_Type.tp_init.

One solution is to keep the old code for tp_new/tp_init. This is what Mark did in PR 13930. But this leads to duplication of functionality and is therefore error-prone (different code paths may have subtly different behaviour).

Since we don't want to break Python code calling dict.__new__ or dict.__init__, not implementing those is not an option. But to be compatible with the vectorcall signature, ideally we want to implement __init__ using METH_FASTCALL, so __init__ would need to be a normal method instead of a slot wrapper of tp_init (similar to Python classes). This would work, but it needs some support in typeobject.c.
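The two construction paths can be modelled roughly in Python. This is a simplified sketch, not the C implementation: `construct_slow_path` and `TracingDict` are hypothetical names, and the function only approximates what the tp_call path does (call `__new__`, then `__init__` if the result is an instance of the type). Whatever a vectorcall shortcut does, it has to stay observably equivalent to this.

```python
from collections import defaultdict


def construct_slow_path(cls, *args, **kwargs):
    """Rough Python model of the tp_call path: tp_new followed by tp_init."""
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj, cls):
        obj.__init__(*args, **kwargs)
    return obj


class TracingDict(dict):
    """dict subclass that calls dict.__init__ directly, as existing code may do."""

    def __init__(self, *args, **kwargs):
        self.init_calls = 1
        dict.__init__(self, *args, **kwargs)


# Calling the type directly and going through __new__/__init__ must agree.
direct = dict([("a", 1)], b=2)
modelled = construct_slow_path(dict, [("a", 1)], b=2)
assert direct == modelled == {"a": 1, "b": 2}

# Code that relies on dict.__init__ being callable keeps working.
traced = TracingDict({"k": "v"})
assert traced == {"k": "v"} and traced.init_calls == 1

# defaultdict fills itself from the arguments after the default factory.
dd = defaultdict(list, {"x": [1]})
dd["y"].append(2)
assert dict(dd) == {"x": [1], "y": [2]}
```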
msg349809 - Author: miss-islington (miss-islington) Date: 2019-08-15 15:49
New changeset 37806f404f57b234902f0c8de9a04647ad01b7f1 by Miss Islington (bot) (Jeroen Demeyer) in branch 'master':
bpo-37207: enable vectorcall for type.__call__ (GH-14588)
msg352133 - Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-09-12 11:48
$ ./python -m pyperf timeit --compare-to ./python-master 'dict()'
python-master: ..................... 89.9 ns +- 1.2 ns
python: ..................... 72.5 ns +- 1.6 ns

Mean +- std dev: [python-master] 89.9 ns +- 1.2 ns -> [python] 72.5 ns +- 1.6 ns: 1.24x faster (-19%)

$ ./python -m pyperf timeit --compare-to ./python-master -s 'import string; a=dict.fromkeys(string.ascii_lowercase); b=dict.fromkeys(string.ascii_uppercase)' -- 'dict(a, **b)'
python-master: ..................... 1.41 us +- 0.04 us
python: ..................... 1.53 us +- 0.04 us

Mean +- std dev: [python-master] 1.41 us +- 0.04 us -> [python] 1.53 us +- 0.04 us: 1.09x slower (+9%)


There is some overhead in the old dict-merging idiom, but it seems reasonable compared to the benefit. LGTM.
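For readers unfamiliar with the idiom measured in the second benchmark, here is a short sketch using the same setup as the pyperf command. The variable names `merged_old`, `merged_display` and `merged_update` are illustrative. It only demonstrates that the spellings produce the same result; it makes no claim about their relative speed.

```python
import string

a = dict.fromkeys(string.ascii_lowercase)
b = dict.fromkeys(string.ascii_uppercase)

# Old merging idiom: positional mapping plus keyword unpacking.
# It requires every key of b to be a string.
merged_old = dict(a, **b)

# Dict display with unpacking (Python 3.5+), which has no string-key restriction.
merged_display = {**a, **b}

# Explicit copy followed by update.
merged_update = dict(a)
merged_update.update(b)

assert merged_old == merged_display == merged_update
assert len(merged_old) == 52
```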
Date                 User            Action  Args
2020-02-11 16:37:44  petr.viktorin   set     pull_requests: + pull_request17837
2019-09-12 11:48:13  inada.naoki     set     messages: + msg352133
2019-08-15 15:49:52  miss-islington  set     nosy: + miss-islington; messages: + msg349809
2019-07-05 12:31:43  jdemeyer        set     nosy: + jdemeyer; messages: + msg347336
2019-07-04 13:44:15  jdemeyer        set     pull_requests: + pull_request14406
2019-07-04 11:11:10  inada.naoki     set     nosy: + inada.naoki; messages: + msg347272
2019-06-09 09:40:27  Mark.Shannon    set     keywords: + patch; stage: patch review; pull_requests: + pull_request13796
2019-06-09 09:23:47  Mark.Shannon    create