URL | Status | Linked |
PR 13930 | closed | Mark.Shannon, 2019-06-09 09:40 |
PR 14588 | merged | jdemeyer, 2019-07-04 13:44 |
PR 18464 | merged | petr.viktorin, 2020-02-11 16:37 |
PR 18928 | merged | petr.viktorin, 2020-03-11 15:11 |
PR 18936 | merged | corona10, 2020-03-11 17:42 |
PR 18980 | merged | corona10, 2020-03-13 16:56 |
PR 18986 | merged | corona10, 2020-03-14 08:26 |
PR 19019 | merged | corona10, 2020-03-15 14:27 |
PR 19053 | merged | corona10, 2020-03-18 01:34 |
PR 19280 | merged | corona10, 2020-04-01 15:57 |
PR 21337 | merged | corona10, 2020-07-05 15:20 |
PR 21347 | closed | miss-islington, 2020-07-06 11:22 |
PR 21350 | merged | corona10, 2020-07-06 12:59 |
msg345077 - (view) |
Author: Mark Shannon (Mark.Shannon) * |
Date: 2019-06-09 09:23 |
PEP 590 allows us to short-circuit the __new__, __init__ slow path for commonly created builtin types.
As an initial step, we can speed up calls to range, list and dict by about 30%.
See https://gist.github.com/markshannon/5cef3a74369391f6ef937d52cca9bfc8
|
msg347272 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-07-04 11:11 |
Can we call tp_call instead of vectorcall when kwargs is not empty?
https://github.com/python/cpython/blob/7f41c8e0dd237d1f3f0a1d2ba2f3ee4e4bd400a7/Objects/call.c#L209-L219
For example, dict_init may be faster than dict_vectorcall when `d2 = dict(**d1)`.
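To make the concern concrete, here is a pure-Python model of the PEP 590 argument layout: positional values and keyword values travel in one flat sequence, and the keyword names travel separately as a tuple. It is only a sketch; `pack_vectorcall` and `dict_from_vectorcall` are hypothetical names invented for illustration, and the real logic is C code in CPython. The point is that `dict(**d1)` starts with a kwargs dict, flattens it, and then rebuilds a dict from the flattened form, whereas the tp_call path could pass the kwargs dict along as-is.

```python
def pack_vectorcall(args, kwargs):
    """Flatten (args, kwargs) into a PEP 590 style layout (toy model).

    Returns (flat, nargs, kwnames): positional values first, then keyword
    values, with the keyword names carried separately as a tuple.
    """
    kwnames = tuple(kwargs)
    flat = tuple(args) + tuple(kwargs[name] for name in kwnames)
    return flat, len(args), kwnames


def dict_from_vectorcall(flat, nargs, kwnames):
    """Hypothetical stand-in for a vectorcall-style dict constructor."""
    if nargs > 1:
        raise TypeError(f"dict expected at most 1 argument, got {nargs}")
    result = {}
    if nargs == 1:
        # The single positional argument may be a mapping or pairs.
        result.update(flat[0])
    # Keyword values sit after the positional ones, in kwnames order.
    for name, value in zip(kwnames, flat[nargs:]):
        result[name] = value
    return result


d1 = {"a": 1, "b": 2}
# Models dict(**d1): the kwargs dict is flattened, then rebuilt.
flat, nargs, kwnames = pack_vectorcall((), d1)
d2 = dict_from_vectorcall(flat, nargs, kwnames)
print(flat, nargs, kwnames, d2)
```

In this model the work of flattening and rebuilding grows with the number of keywords, which is why a large `**d1` is the case worth measuring.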
|
msg347336 - (view) |
Author: Jeroen Demeyer (jdemeyer) * |
Date: 2019-07-05 12:31 |
One thing that keeps bothering me when using vectorcall for type.__call__ is that we would have two completely independent code paths for constructing an object: the new one using vectorcall and the old one using tp_call, which in turn calls tp_new and tp_init.
In typical vectorcall usages, there is no need to support the old way any longer: we can set tp_call = PyVectorcall_Call and that's it. But for "type", we still need to support tp_new and tp_init because there may be C code out there that calls tp_new/tp_init directly. To give one concrete example: collections.defaultdict calls PyDict_Type.tp_init.
One solution is to keep the old code for tp_new/tp_init. This is what Mark did in PR 13930. But this leads to duplication of functionality and is therefore error-prone (different code paths may have subtly different behaviour).
Since we don't want to break Python code calling dict.__new__ or dict.__init__, not implementing those is not an option. But to be compatible with the vectorcall signature, ideally we want to implement __init__ using METH_FASTCALL, so __init__ would need to be a normal method instead of a slot wrapper of tp_init (similar to Python classes). This would work, but it needs some support in typeobject.c.
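The compatibility constraint can be seen from Python itself. The sketch below builds the same dict through an ordinary call, through an explicit `dict.__new__` plus `dict.__init__`, through a subclass that delegates to `dict.__init__`, and through `collections.defaultdict`. The `Tagged` class is a hypothetical example, not code from any PR; whichever internal path the constructor takes, all four results are expected to agree.

```python
from collections import defaultdict

# Path 1: an ordinary call, dispatched through type.__call__.
via_call = dict([("a", 1)], b=2)

# Path 2: the two-step protocol, invoked explicitly by user code.
via_slots = dict.__new__(dict)
dict.__init__(via_slots, [("a", 1)], b=2)


class Tagged(dict):
    """Hypothetical subclass that delegates to dict.__init__."""

    def __init__(self, tag, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.tag = tag


tagged = Tagged("demo", [("a", 1)], b=2)

# defaultdict forwards its remaining arguments to the dict initializer.
dd = defaultdict(list, [("a", 1)], b=2)

print(via_call, via_slots, dict(tagged), dict(dd))
```

A check like this is a cheap way to catch the "subtly different behaviour" risk when two construction paths coexist.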
|
msg349809 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-08-15 15:49 |
New changeset 37806f404f57b234902f0c8de9a04647ad01b7f1 by Miss Islington (bot) (Jeroen Demeyer) in branch 'master':
bpo-37207: enable vectorcall for type.__call__ (GH-14588)
https://github.com/python/cpython/commit/37806f404f57b234902f0c8de9a04647ad01b7f1
|
msg352133 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-09-12 11:48 |
$ ./python -m pyperf timeit --compare-to ./python-master 'dict()'
python-master: ..................... 89.9 ns +- 1.2 ns
python: ..................... 72.5 ns +- 1.6 ns
Mean +- std dev: [python-master] 89.9 ns +- 1.2 ns -> [python] 72.5 ns +- 1.6 ns: 1.24x faster (-19%)
$ ./python -m pyperf timeit --compare-to ./python-master -s 'import string; a=dict.fromkeys(string.ascii_lowercase); b=dict.fromkeys(string.ascii_uppercase)' -- 'dict(a, **b)'
python-master: ..................... 1.41 us +- 0.04 us
python: ..................... 1.53 us +- 0.04 us
Mean +- std dev: [python-master] 1.41 us +- 0.04 us -> [python] 1.53 us +- 0.04 us: 1.09x slower (+9%)
---
There is some overhead in the old dict-merging idiom, but it seems reasonable compared to the benefit. LGTM.
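For readers without pyperf installed, the same two statements can be timed roughly with the standard-library `timeit` module. This is only a sketch: `best_ns_per_call` is a hypothetical helper, the loop counts are arbitrary, and the numbers are far less rigorous than the pyperf runs above, so only the relative difference between two interpreter builds is meaningful.

```python
import string
import timeit

# Same setup as the pyperf command above.
a = dict.fromkeys(string.ascii_lowercase)
b = dict.fromkeys(string.ascii_uppercase)


def best_ns_per_call(stmt, number=50_000, repeat=5):
    """Return the best-of-`repeat` time per execution of `stmt`, in ns."""
    timings = timeit.repeat(
        stmt, globals={"a": a, "b": b}, number=number, repeat=repeat
    )
    return min(timings) / number * 1e9


results = {stmt: best_ns_per_call(stmt) for stmt in ("dict()", "dict(a, **b)")}
for stmt, ns in results.items():
    print(f"{stmt:<14} {ns:10.1f} ns per call")
```

Running the script once under each interpreter build and comparing the printed values gives a rough version of the comparison shown above.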
|
msg362219 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-02-18 15:13 |
New changeset 6e35da976370e7c2e028165c65d7d7d42772a71f by Petr Viktorin in branch 'master':
bpo-37207: Use vectorcall for range() (GH-18464)
https://github.com/python/cpython/commit/6e35da976370e7c2e028165c65d7d7d42772a71f
|
msg364095 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-13 13:57 |
New changeset 9ee88cde1abf7f274cc55a0571b1c2cdb1263743 by Dong-hee Na in branch 'master':
bpo-37207: Use PEP 590 vectorcall to speed up tuple() (GH-18936)
https://github.com/python/cpython/commit/9ee88cde1abf7f274cc55a0571b1c2cdb1263743
|
msg364322 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-16 14:04 |
New changeset c98f87fc330eb40fbcff627dfc50958785a44f35 by Dong-hee Na in branch 'master':
bpo-37207: Use _PyArg_CheckPositional() for tuple vectorcall (GH-18986)
https://github.com/python/cpython/commit/c98f87fc330eb40fbcff627dfc50958785a44f35
|
msg364324 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-16 14:06 |
New changeset 87ec86c425a5cd3ad41b831b54c0ce1a0c363f4b by Dong-hee Na in branch 'master':
bpo-37207: Add _PyArg_NoKwnames() helper function (GH-18980)
https://github.com/python/cpython/commit/87ec86c425a5cd3ad41b831b54c0ce1a0c363f4b
|
msg364340 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-16 17:17 |
New changeset 6ff79f65820031b219622faea8425edaec9a43f3 by Dong-hee Na in branch 'master':
bpo-37207: Use PEP 590 vectorcall to speed up set() constructor (GH-19019)
https://github.com/python/cpython/commit/6ff79f65820031b219622faea8425edaec9a43f3
|
msg364428 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-03-17 13:58 |
Victor,
frozenset is the last basic builtin collection to which this improvement has not yet been applied.
frozenset also shows a similar performance improvement when using vectorcall:
pyperf compare_to master.json bpo-37207.json
Mean +- std dev: [master] 2.26 us +- 0.06 us -> [bpo-37207] 2.06 us +- 0.05 us: 1.09x faster (-9%)
> What I mean is that vectorcall should not be used for everything
I definitely agree with this opinion, so I am asking for your opinion before submitting the patch.
frozenset is used less frequently than list/set/dict,
but it is also a basic builtin collection, so IMHO it is okay to apply vectorcall.
What do you think?
|
msg364447 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-17 16:55 |
> What do you think?
I would prefer to see a PR to give my opinion :)
|
msg364538 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-18 17:30 |
New changeset 1c60567b9a4c8f77e730de9d22690d8e68d7e5f6 by Dong-hee Na in branch 'master':
bpo-37207: Use PEP 590 vectorcall to speed up frozenset() (GH-19053)
https://github.com/python/cpython/commit/1c60567b9a4c8f77e730de9d22690d8e68d7e5f6
|
msg364808 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-22 16:03 |
Remaining issue: optimize list(iterable), PR 18928. I reviewed the PR and I'm waiting for Petr.
|
msg365307 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-30 12:16 |
New changeset ce105541f8ebcf2dffcadedfdeffdb698a0edb44 by Petr Viktorin in branch 'master':
bpo-37207: Use vectorcall for list() (GH-18928)
https://github.com/python/cpython/commit/ce105541f8ebcf2dffcadedfdeffdb698a0edb44
|
msg365309 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-30 12:18 |
All PRs are now merged. Thanks to everybody who was involved in this issue. It's a nice speedup which is always good to take ;-)
|
msg365385 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2020-03-31 12:43 |
The change to dict() was not covered by the smaller PRs.
That one will need more thought, but AFAIK it wasn't yet rejected.
|
msg365387 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-03-31 12:44 |
Oh sorry, I missed the dict.
|
msg365448 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-04-01 03:24 |
@vstinner @petr.viktorin
I'd like to experiment with dict vectorcall and finalize the work.
May I proceed with it?
|
msg365452 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2020-04-01 08:24 |
Definitely!
|
msg365488 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-04-01 15:39 |
+------------------+-------------------+-----------------------------+
| Benchmark | master-dict-empty | bpo-37207-dict-empty |
+==================+===================+=============================+
| bench dict empty | 502 ns | 443 ns: 1.13x faster (-12%) |
+------------------+-------------------+-----------------------------+
+------------------+--------------------+-----------------------------+
| Benchmark | master-dict-update | bpo-37207-dict-update |
+==================+====================+=============================+
| bench dict empty | 497 ns | 425 ns: 1.17x faster (-15%) |
+------------------+--------------------+-----------------------------+
+--------------------+---------------------+-----------------------------+
| Benchmark | master-dict-kwnames | bpo-37207-dict-kwnames |
+====================+=====================+=============================+
| bench dict kwnames | 1.38 us | 917 ns: 1.51x faster (-34%) |
+--------------------+---------------------+-----------------------------+
|
msg365489 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-04-01 15:40 |
@vstinner @petr.viktorin
It looks like the benchmarks show very impressive results.
Can I submit the patch?
|
msg365490 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2020-04-01 15:45 |
> Can I submit the patch?
Yes!
If you think a patch is ready for review, just submit it. There's not much we can comment on before we see the code :)
(I hope that doesn't contradict what your mentor says...)
|
msg365491 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-04-01 15:48 |
When I designed the FASTCALL calling convention, I experimented with adding a new tp_fastcall slot to PyTypeObject to optimize the __call__() method: bpo-29259.
Results on the pyperformance benchmark suite were not really convincing and I had technical issues (decide if tp_call or tp_fastcall should be called, handle ABI compatibility and backward compatibility, etc.). I decided to give up on this idea.
I'm happy to see that PEP 590 managed to find its way into Python internals and actually make Python faster ;-)
|
msg365545 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-04-02 00:55 |
New changeset e27916b1fc0364e3627438df48550c16f0b80b82 by Dong-hee Na in branch 'master':
bpo-37207: Use PEP 590 vectorcall to speed up dict() (GH-19280)
https://github.com/python/cpython/commit/e27916b1fc0364e3627438df48550c16f0b80b82
|
msg365546 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-04-02 00:56 |
Can we now close this issue? Or does someone plan to push further optimizations? Maybe new issues can be opened for the next optimizations.
|
msg365553 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-04-02 01:31 |
> When I designed the FASTCALL calling convention, I experimented with adding a new tp_fastcall slot to PyTypeObject to optimize the __call__() method: bpo-29259.
Ah, by the way, I also made an attempt to use the FASTCALL calling convention for tp_new and tp_init: bpo-29358. Again, the speedup wasn't obvious and the implementation was quite complicated with many corner cases. So I gave up on this one. It didn't seem to be really worth it.
|
msg365811 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-04-05 05:15 |
IMHO, we can close this issue.
Summary:
PEP 590 vectorcall is now applied to list, tuple, dict, set, frozenset and range.
If someone wants to apply PEP 590 to other cases, please open a new issue for it!
Thank you, Mark, Jeroen, Petr and everyone who worked on this issue.
|
msg365907 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2020-04-07 14:44 |
As discussed briefly in Mark's PR, benchmarks like this are now slower:
ret = dict(**{'a': 2, 'b': 4, 'c': 6, 'd': 8})
Python 3.8: Mean +- std dev: 281 ns +- 9 ns
master: Mean +- std dev: 456 ns +- 14 ns
|
msg373095 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2020-07-06 11:22 |
New changeset b4a9263708cc67c98c4d53b16933f6e5dd07990f by Dong-hee Na in branch 'master':
bpo-37207: Update whatsnews for 3.9 (GH-21337)
https://github.com/python/cpython/commit/b4a9263708cc67c98c4d53b16933f6e5dd07990f
|
msg373116 - (view) |
Author: Dong-hee Na (corona10) * |
Date: 2020-07-06 13:32 |
New changeset 97558d6b08a656eae209d49b206f703cee0359a2 by Dong-hee Na in branch '3.9':
[3.9] bpo-37207: Update whatsnews for 3.9 (GH-21337)
https://github.com/python/cpython/commit/97558d6b08a656eae209d49b206f703cee0359a2
|
Date | User | Action | Args |
2022-04-11 14:59:16 | admin | set | github: 81388 |
2020-07-06 13:32:13 | corona10 | set | messages: + msg373116 |
2020-07-06 12:59:40 | corona10 | set | pull_requests: + pull_request20496 |
2020-07-06 11:22:31 | miss-islington | set | pull_requests: + pull_request20494 |
2020-07-06 11:22:11 | lukasz.langa | set | nosy: + lukasz.langa; messages: + msg373095 |
2020-07-05 15:20:26 | corona10 | set | pull_requests: + pull_request20485 |
2020-04-07 14:44:27 | petr.viktorin | set | messages: + msg365907 |
2020-04-05 05:15:46 | corona10 | set | status: open -> closed; resolution: fixed; messages: + msg365811; stage: patch review -> resolved |
2020-04-02 01:31:24 | vstinner | set | messages: + msg365553 |
2020-04-02 00:56:53 | vstinner | set | messages: + msg365546 |
2020-04-02 00:55:47 | vstinner | set | messages: + msg365545 |
2020-04-01 15:57:53 | corona10 | set | stage: resolved -> patch review; pull_requests: + pull_request18637 |
2020-04-01 15:48:51 | vstinner | set | messages: + msg365491 |
2020-04-01 15:45:57 | petr.viktorin | set | messages: + msg365490 |
2020-04-01 15:40:56 | corona10 | set | messages: + msg365489 |
2020-04-01 15:39:44 | corona10 | set | files: + bench_dict_update.py |
2020-04-01 15:39:37 | corona10 | set | files: + bench_dict_kwnames.py |
2020-04-01 15:39:29 | corona10 | set | files: + bench_dict_empty.py |
2020-04-01 15:39:16 | corona10 | set | messages: + msg365488 |
2020-04-01 08:24:07 | petr.viktorin | set | messages: + msg365452 |
2020-04-01 03:24:32 | corona10 | set | messages: + msg365448 |
2020-03-31 12:44:38 | vstinner | set | resolution: fixed -> (no value); messages: + msg365387 |
2020-03-31 12:43:46 | petr.viktorin | set | status: closed -> open; messages: + msg365385 |
2020-03-30 12:18:57 | vstinner | set | status: open -> closed; versions: + Python 3.9; type: enhancement -> performance; messages: + msg365309; resolution: fixed; stage: patch review -> resolved |
2020-03-30 12:16:25 | vstinner | set | messages: + msg365307 |
2020-03-22 16:03:11 | vstinner | set | messages: + msg364808 |
2020-03-22 04:57:09 | phsilva | set | nosy: + phsilva |
2020-03-18 17:30:53 | vstinner | set | messages: + msg364538 |
2020-03-18 01:34:59 | corona10 | set | pull_requests: + pull_request18404 |
2020-03-17 16:55:43 | vstinner | set | messages: + msg364447 |
2020-03-17 13:58:06 | corona10 | set | messages: + msg364428 |
2020-03-16 17:17:41 | vstinner | set | messages: + msg364340 |
2020-03-16 14:06:32 | vstinner | set | messages: + msg364324 |
2020-03-16 14:04:24 | vstinner | set | messages: + msg364322 |
2020-03-15 14:27:55 | corona10 | set | pull_requests: + pull_request18368 |
2020-03-14 08:26:35 | corona10 | set | pull_requests: + pull_request18334 |
2020-03-13 16:56:57 | corona10 | set | pull_requests: + pull_request18328 |
2020-03-13 13:57:15 | vstinner | set | nosy: + vstinner; messages: + msg364095 |
2020-03-11 17:42:50 | corona10 | set | nosy: + corona10; pull_requests: + pull_request18288 |
2020-03-11 15:11:38 | petr.viktorin | set | nosy: + petr.viktorin; pull_requests: + pull_request18280 |
2020-02-18 15:13:24 | miss-islington | set | messages: + msg362219 |
2020-02-11 16:37:44 | petr.viktorin | set | pull_requests: + pull_request17837 |
2019-09-12 11:48:13 | methane | set | messages: + msg352133 |
2019-08-15 15:49:52 | miss-islington | set | nosy: + miss-islington; messages: + msg349809 |
2019-07-05 12:31:43 | jdemeyer | set | nosy: + jdemeyer; messages: + msg347336 |
2019-07-04 13:44:15 | jdemeyer | set | pull_requests: + pull_request14406 |
2019-07-04 11:11:10 | methane | set | nosy: + methane; messages: + msg347272 |
2019-06-09 09:40:27 | Mark.Shannon | set | keywords: + patch; stage: patch review; pull_requests: + pull_request13796 |
2019-06-09 09:23:47 | Mark.Shannon | create | |