Issue 42141: Speedup various dict inits


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/86307

classification

Title:	Speedup various dict inits
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.10

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	Marco Sulla, methane
Priority:	normal	Keywords:

Created on 2020-10-24 21:17 by Marco Sulla, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 22948	closed	Marco Sulla, 2020-10-24 21:17

Messages (7)
msg379540 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2020-10-24 21:17
PR #22948 is an augmented version of #22346. It also speeds up the creation of:

1. dicts from other dicts that are not "perfect" (combined and without holes)
2. fromkeys
3. copies of dicts with many holes
4. dict from keywords, as in #22346

A sample bench:

    python -m pyperf timeit --rigorous "dict(o)" -s """
    from uuid import uuid4

    def getUuid():
        return str(uuid4())

    o = {getUuid():getUuid() for i in range(1000)}
    delkey = getUuid()
    o[delkey] = getUuid()
    del o[delkey]
    """
    .........................................

Before #22948:
Mean +- std dev: 35.9 us +- 0.6 us

After:
Mean +- std dev: 26.4 us +- 0.4 us
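For readers who want to try the same kind of measurement without pyperf, here is a minimal sketch using only the standard library. It builds a dict the same way as the setup code above (1000 random string pairs, plus one key that is inserted and then deleted) and times three of the constructions mentioned in the list. The helper names `make_dict_with_hole` and `time_call` are made up for this illustration, and the assumption that the deletion leaves a hole in the dict's internal entries is a CPython implementation detail. `timeit` is much less rigorous than `pyperf timeit --rigorous`, so the absolute numbers will not match the ones quoted above.

```python
import timeit
from uuid import uuid4


def make_dict_with_hole(n=1000):
    """Build a dict of n random string pairs, then insert and delete one extra key.

    In CPython the deletion is expected to leave an unused slot (a "hole")
    in the internal entries; this is an implementation detail, not a
    language guarantee.
    """
    d = {str(uuid4()): str(uuid4()) for _ in range(n)}
    extra = str(uuid4())
    d[extra] = str(uuid4())
    del d[extra]
    return d


def time_call(func, number=1000, repeat=3):
    """Return the best per-call time in microseconds over several repeats."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e6


o = make_dict_with_hole()

# Three of the constructions discussed in this issue.
cases = {
    "dict(o)": lambda: dict(o),
    "o.copy()": lambda: o.copy(),
    "dict.fromkeys(o)": lambda: dict.fromkeys(o),
}
for label, func in cases.items():
    print(f"{label:>18}: {time_call(func):.1f} us per call")
```

Running the script under two different interpreter builds gives a rough before/after comparison; for anything that should be reported, pyperf remains the better tool.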
msg379550 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-25 00:38
> 1. dicts from other dicts that are not "perfect" (combined and without holes)
> 3. copies of dicts with many holes

Note that I have already optimized this and rejected it myself. See https://github.com/python/cpython/pull/21669 and https://bugs.python.org/issue41431#msg374556

The code duplication is too large compared to the performance gain.
msg379573 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2020-10-25 11:39
I'm quite sure I didn't invent the wheel :) but I think it's a good improvement:

| pathlib              | 35.8 ms | 35.1 ms | 1.02x faster | Significant (t=13.21) |
| scimark_monte_carlo  | 176 ms  | 172 ms  | 1.02x faster | Significant (t=9.48)  |
| scimark_sor          | 332 ms  | 325 ms  | 1.02x faster | Significant (t=11.96) |
| telco                | 11.0 ms | 10.8 ms | 1.03x faster | Significant (t=8.52)  |
| unpickle_pure_python | 525 us  | 514 us  | 1.02x faster | Significant (t=19.97) |
| xml_etree_process    | 132 ms  | 129 ms  | 1.02x faster | Significant (t=17.59) |
msg379574 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2020-10-25 11:43
Note that this time I see no slowdown in the macro benchmark, since I used normal builds, not optimized ones. I suppose an optimized build will show a slowdown because the new functions are not in the test battery.
msg379576 - (view)	Author: Inada Naoki (methane) *	Date: 2020-10-25 12:59
I ran pyperformance 1.0.0 for your speedup_dictinit branch (7df3b9c) and the master branch, with a PGO+LTO build.

```
$ ./python -m pyperf compare_to master-opt.json dictinit.json -G --min-speed=1
Slower (22):
- unpack_sequence: 62.7 ns +- 0.7 ns -> 70.3 ns +- 0.5 ns: 1.12x slower (+12%)
- pickle_dict: 28.6 us +- 0.1 us -> 30.7 us +- 0.1 us: 1.07x slower (+7%)
- regex_dna: 233 ms +- 1 ms -> 245 ms +- 0 ms: 1.05x slower (+5%)
- unpickle_list: 5.22 us +- 0.22 us -> 5.46 us +- 0.10 us: 1.05x slower (+5%)
- sqlite_synth: 3.29 us +- 0.05 us -> 3.43 us +- 0.05 us: 1.04x slower (+4%)
- regex_v8: 26.1 ms +- 0.1 ms -> 27.1 ms +- 0.4 ms: 1.04x slower (+4%)
- spectral_norm: 147 ms +- 1 ms -> 153 ms +- 2 ms: 1.04x slower (+4%)
- scimark_sparse_mat_mult: 5.28 ms +- 0.03 ms -> 5.48 ms +- 0.01 ms: 1.04x slower (+4%)
- unpickle_pure_python: 326 us +- 4 us -> 338 us +- 3 us: 1.04x slower (+4%)
- nbody: 143 ms +- 3 ms -> 148 ms +- 2 ms: 1.03x slower (+3%)
- regex_effbot: 3.43 ms +- 0.01 ms -> 3.53 ms +- 0.02 ms: 1.03x slower (+3%)
- django_template: 52.7 ms +- 0.7 ms -> 54.1 ms +- 0.9 ms: 1.03x slower (+3%)
- scimark_fft: 405 ms +- 4 ms -> 415 ms +- 9 ms: 1.03x slower (+3%)
- fannkuch: 523 ms +- 2 ms -> 535 ms +- 2 ms: 1.02x slower (+2%)
- xml_etree_process: 80.2 ms +- 0.8 ms -> 82.0 ms +- 0.8 ms: 1.02x slower (+2%)
- json_dumps: 14.6 ms +- 0.1 ms -> 14.9 ms +- 0.1 ms: 1.02x slower (+2%)
- scimark_sor: 209 ms +- 2 ms -> 213 ms +- 2 ms: 1.02x slower (+2%)
- genshi_text: 32.1 ms +- 0.5 ms -> 32.7 ms +- 0.4 ms: 1.02x slower (+2%)
- logging_format: 10.9 us +- 0.2 us -> 11.0 us +- 0.2 us: 1.02x slower (+2%)
- pyflate: 734 ms +- 6 ms -> 745 ms +- 8 ms: 1.01x slower (+1%)
- float: 134 ms +- 1 ms -> 135 ms +- 2 ms: 1.01x slower (+1%)
- raytrace: 511 ms +- 5 ms -> 517 ms +- 5 ms: 1.01x slower (+1%)

Faster (2):
- logging_silent: 209 ns +- 5 ns -> 205 ns +- 6 ns: 1.02x faster (-2%)
- pickle_list: 5.04 us +- 0.03 us -> 4.96 us +- 0.04 us: 1.02x faster (-2%)

Benchmark hidden because not significant (36):
```

I suppose all the deltas are noise. But it is possible that the larger binary causes a small slowdown.
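As a reading aid for the comparison above, the following sketch shows one way to derive figures such as "1.12x slower (+12%)" from a pair of means. It is an illustration that reproduces the numbers in the listing, not pyperf's actual code; the function name `describe_change` is an assumption made for this example.

```python
def describe_change(before, after):
    """Summarize a timing change in the style 'N.NNx slower (+P%)'.

    `before` and `after` are mean timings in the same unit.
    The ratio is always expressed as >= 1, and the percentage is the
    relative change of `after` with respect to `before`.
    """
    if before <= 0 or after <= 0:
        raise ValueError("timings must be positive")
    percent = (after - before) / before * 100
    if after > before:
        return f"{after / before:.2f}x slower ({percent:+.0f}%)"
    if after < before:
        return f"{before / after:.2f}x faster ({percent:+.0f}%)"
    return "no change"


# Means taken from the listing above.
print(describe_change(62.7, 70.3))  # unpack_sequence, ns
print(describe_change(28.6, 30.7))  # pickle_dict, us
print(describe_change(209, 205))    # logging_silent, ns
```

Whether a change of this size is meaningful depends on the standard deviations shown next to each mean, which this sketch ignores.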
msg379581 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2020-10-25 15:14
The fact is that, IMHO, PGO will skew the results, since it's quite improbable that the test battery includes a test that creates a dict from another dict with a hole. It seems to me that the comparison between the normal builds is more significant.
msg379610 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2020-10-25 21:23
Well, on second thought, I think you're right: there's no significant advantage and too much duplicated code.

History
Date	User	Action	Args
2022-04-11 14:59:37	admin	set	github: 86307
2020-11-13 03:59:37	methane	set	resolution: rejected
2020-10-25 21:23:38	Marco Sulla	set	status: open -> closed messages: + msg379610 stage: resolved
2020-10-25 15:14:57	Marco Sulla	set	messages: + msg379581
2020-10-25 12:59:14	methane	set	messages: + msg379576
2020-10-25 11:43:23	Marco Sulla	set	messages: + msg379574
2020-10-25 11:39:25	Marco Sulla	set	type: performance messages: + msg379573 components: + Interpreter Core versions: + Python 3.10
2020-10-25 00:38:56	methane	set	nosy: + methane messages: + msg379550
2020-10-24 21:17:49	Marco Sulla	create