classification
Title: Optimize dict_merge for copy
Type: Stage: patch review
Components: Interpreter Core Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: inada.naoki, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2020-07-29 00:29 by inada.naoki, last changed 2020-07-29 04:50 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 21669 closed inada.naoki, 2020-07-29 00:41
PR 21674 open inada.naoki, 2020-07-29 03:18
Messages (4)
msg374550 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2020-07-29 00:29
Although there are dict.copy() and PyDict_Copy(), dict_merge can be used for copying dict.

* d={}; d.update(orig)
* d=dict(orig)
* d=orig.copy() # orig has many dummy keys.
msg374551 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2020-07-29 00:50
Microbenchmark for commit cf4f61ce50e07f7ccd3aef991647050c8da058f9.

# timeit -s 'd=dict.fromkeys(range(8))' -- 'dict(d)'
Mean +- std dev: [master] 311 ns +- 2 ns -> [patched] 144 ns +- 1 ns: 2.16x faster (-54%)

# timeit -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
Mean +- std dev: [master] 21.6 us +- 0.2 us -> [patched] 7.67 us +- 0.09 us: 2.81x faster (-64%)

# timeit -s 'd=dict.fromkeys(range(8))' -- '{}.update(d)'
Mean +- std dev: [master] 301 ns +- 5 ns -> [patched] 149 ns +- 1 ns: 2.01x faster (-50%)

# timeit -s 'd=dict.fromkeys(range(1000))' -- '{}.update(d)'
Mean +- std dev: [master] 21.4 us +- 0.2 us -> [patched] 7.64 us +- 0.07 us: 2.80x faster (-64%)
msg374555 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2020-07-29 02:05
To reduce code size, I am considering to remove clone_combined_dict. I will check how PyDict_Copy() is performance critical.

This is microbenchmark result of d.copy() and dict(d).

$ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'd.copy()'
python-master: ..................... 4.36 us +- 0.07 us
python: ..................... 5.96 us +- 0.10 us

Mean +- std dev: [python-master] 4.36 us +- 0.07 us -> [python] 5.96 us +- 0.10 us: 1.37x slower (+37%)

$ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
python-master: ..................... 21.6 us +- 0.2 us
python: ..................... 6.01 us +- 0.09 us

Mean +- std dev: [python-master] 21.6 us +- 0.2 us -> [python] 6.01 us +- 0.09 us: 3.59x faster (-72%)
msg374556 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2020-07-29 03:10
PyDict_Copy() is not used in eval loop or calling functions. So removing clone_combined_dict() is a considerable option.

Another option is to use clone_combined_dict() in dict_merge, instead of adding dict_copy2().

Pros: No performance regression. PyDict_Copy() is as fast as before.
Cons: Can not "fast copy" split dict and dirty dict.

I suppose most dict used by `dict(d)` or `dict.update(d)` is clean and combined. So I will implement the second option.
History
Date User Action Args
2020-07-29 04:50:10serhiy.storchakasetnosy: + serhiy.storchaka, yselivanov
2020-07-29 03:18:29inada.naokisetstage: patch review
pull_requests: + pull_request20819
2020-07-29 03:10:35inada.naokisetmessages: + msg374556
2020-07-29 02:05:35inada.naokisetmessages: + msg374555
2020-07-29 00:50:51inada.naokisetmessages: + msg374551
stage: patch review -> (no value)
2020-07-29 00:41:12inada.naokisetkeywords: + patch
stage: patch review
pull_requests: + pull_request20813
2020-07-29 00:29:38inada.naokicreate