classification
Title: Speed-up dict.copy() up to 5.5 times.
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: haypo, inada.naoki, rhettinger, serhiy.storchaka, yselivanov
Priority: normal Keywords:

Created on 2017-08-10 21:31 by yselivanov, last changed 2017-08-17 13:55 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 3067 open yselivanov, 2017-08-10 21:35
Messages (9)
msg300117 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-08-10 21:31
It's possible to significantly improve performance of shallow dict copy.  Currently, PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

My idea is to simply memcpy the whole keys/items region and do the necessary increfs after it.  This works just fine for non-key-sharing dicts.
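(As a Python-level reminder of what "shallow" means here — an illustration, not part of the patch: only references are copied, so the copy shares value objects with the original while the tables themselves are independent.)

```python
# Illustration only: dict.copy() is shallow, so values are shared by
# reference (the C implementation just increfs them), while the
# key/value tables themselves are independent.
d = {"a": [1, 2]}
c = d.copy()

assert c["a"] is d["a"]   # same list object; only the reference was copied
c["b"] = 3
assert "b" not in d       # mutating the copy's table doesn't touch the original
```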

With the following simple microbenchmark:

    import time

    N = 1000000

    for size in [0, 1, 10, 20, 50, 100, 500, 1000]:
        d = dict([(str(i), i) for i in range(size)])

        t = time.monotonic()
        for i in range(N):
            d.copy()
        e = time.monotonic() - t

        print(f'dict(size={size}).copy() x {N} times:\t {e:.4f}')


Output for 3.7 master:

    dict(size=0).copy() x 1000000 times:     0.1299
    dict(size=1).copy() x 1000000 times:     0.1499
    dict(size=10).copy() x 1000000 times:    0.3758
    dict(size=20).copy() x 1000000 times:    0.7722
    dict(size=50).copy() x 1000000 times:    1.2784
    dict(size=100).copy() x 1000000 times:   2.5128
    dict(size=500).copy() x 1000000 times:   12.8968
    dict(size=1000).copy() x 1000000 times:  25.4276


Output for patched 3.7:

    dict(size=0).copy() x 1000000 times:     0.1352
    dict(size=1).copy() x 1000000 times:     0.1285
    dict(size=10).copy() x 1000000 times:    0.1632
    dict(size=20).copy() x 1000000 times:    0.3076
    dict(size=50).copy() x 1000000 times:    0.3663
    dict(size=100).copy() x 1000000 times:   0.5140
    dict(size=500).copy() x 1000000 times:   2.3419
    dict(size=1000).copy() x 1000000 times:  4.6176
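(For reference, the per-size speedups implied by the two runs above — computed from the reported numbers, not re-measured:)

```python
# Speedup ratios derived from the timings reported above.
baseline = {0: 0.1299, 1: 0.1499, 10: 0.3758, 20: 0.7722,
            50: 1.2784, 100: 2.5128, 500: 12.8968, 1000: 25.4276}
patched = {0: 0.1352, 1: 0.1285, 10: 0.1632, 20: 0.3076,
           50: 0.3663, 100: 0.5140, 500: 2.3419, 1000: 4.6176}

for size in baseline:
    print(f'size={size}: {baseline[size] / patched[size]:.2f}x')
```

The size=1000 case is where the "up to 5.5 times" figure in the title comes from.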
msg300120 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-08-10 21:43
> PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

Why not create a "preallocated" dict in that case, with _PyDict_NewPresized()?
msg300123 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-08-10 21:46
>> PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

> Why not create a "preallocated" dict in that case, with _PyDict_NewPresized()?

I don't think it's related to the proposed patch.  Please take a look at the PR.  `_PyDict_NewPresized` and inserting entries one by one will not be faster than a `memcpy`.
msg300153 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-11 06:06
I like the idea.
One worrying point is how to deal with a dirty dict.
How about doing it only when ma_used == keys->dk_nentries?
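("Dirty" here means a dict whose entry table still contains holes left by deletions, i.e. ma_used < dk_nentries. This is only indirectly observable from Python, e.g. via sys.getsizeof — a sketch relying on the CPython implementation detail that dicts don't shrink on deletion:)

```python
import sys

# CPython implementation detail: deleting keys leaves dummy holes in the
# entry table rather than shrinking it, so the allocated size is unchanged.
d = dict.fromkeys(range(100))
size_before = sys.getsizeof(d)
for i in range(90):
    del d[i]

assert sys.getsizeof(d) == size_before  # still the full-size table
assert len(d) == 10                     # but only 10 live entries (ma_used)
```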

Slightly off topic: copy-on-write could be implemented via dk_refcnt. Functions that just pass `**kwargs` along create a temporary copy of the dict, and CoW would reduce the cost of those temporary dicts.
msg300155 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-11 07:10
The PR adds over 50 lines of code for optimising a not-very-often-used feature. There are two obvious ways of copying, dict(d) and d.copy(); the PR optimises only the latter, and I'm not sure it is the more common one. The PR also duplicates the low-level code, which increases the maintenance cost.

The PR changes the behavior. Currently the effect of copying is compacting the dict.

>>> import sys
>>> d = dict.fromkeys(range(2000))
>>> sys.getsizeof(d)
41020
>>> sys.getsizeof(d.copy())
41020
>>> sys.getsizeof(dict(d))
41020
>>> for i in range(1000): del d[i]
... 
>>> sys.getsizeof(d)
41020
>>> sys.getsizeof(dict(d))
20544
>>> sys.getsizeof(d.copy())
20544
>>> d = dict.fromkeys(range(2000))
>>> for i in range(1999): del d[i]
... 
>>> sys.getsizeof(d)
41020
>>> sys.getsizeof(d.copy())
136

The PR preserves the non-compact layout in the copy.
msg300159 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-08-11 09:24
>>>> d = dict.fromkeys(range(2000))
>>>> for i in range(1999): del d[i]
> ...
>>>> sys.getsizeof(d)
> 41020
>>>> sys.getsizeof(d.copy())
> 136

Why doesn't "del" compact the dict?
msg300163 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-08-11 14:48
> I like the idea.
> One worrying point is how to deal with a dirty dict.
> How about doing it only when ma_used == keys->dk_nentries?

I've added this check.  See the updated PR.

> The PR changes the behavior. Currently the effect of copying is compacting the dict.

The check that INADA suggested enables compacting on copy, if it is needed.

> The PR adds over 50 lines of code for optimising a not-very-often-used feature.

I started to look into this because I need it for my upcoming PEP, so please don't dismiss the idea right away.

I also think that copying a dict isn't a "not very often used feature"; it depends on your frame of reference. In some applications you copy dicts a lot. 50 lines of code speeding up one of the core methods 5.5x is a fair price to pay.

> There are two obvious ways of copying, dict(d) and d.copy()

That can also be easily optimized, btw.  I'll see if I can do that without impacting the performance of creating new dicts.

> The PR duplicates the low-level code, which increases the maintenance cost.

FWIW, the PR doesn't duplicate any code.  It provides a new implementation that is more efficient than the old approach.
msg300164 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-08-11 14:49
> Why doesn't "del" compact the dict?

This is a good question, btw.
msg300423 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-17 13:55
A side effect of this patch is making dict.copy() atomic. This would be a worthy feature if extended to the dict constructor. For now the only way of making an atomic (or almost atomic) copy of a dict is dict(list(d.items())), and that isn't very time- or memory-efficient.
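(The non-atomicity being worked around is visible from Python: iterating a dict fails if it is resized mid-iteration, which is what a concurrent writer can cause during a plain dict(d), whereas list(d.items()) materializes a snapshot first. A single-threaded illustration, forcing the mutation by hand:)

```python
d = dict.fromkeys(range(8))

# Mutating a dict while iterating over it is detected and rejected,
# which is why dict(d) is not atomic in the presence of concurrent writers.
try:
    for k in d:
        d[k + 100] = None  # grows the dict mid-iteration
    raised = False
except RuntimeError:
    raised = True
assert raised

# list(d.items()) builds the item list first; dict() then works
# from that stable snapshot.
snapshot = dict(list(d.items()))
d[999] = None
assert 999 not in snapshot
```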

If you make dict copying remove holes and extend your patch to the dict constructor, it could be more useful.

Look at the set implementation. It doesn't just use memcpy; it contains a specialized insertion implementation for the case when all items are unique. Fast copying is even more important for sets, since copying is more common there: it is a part of set operations, and it is common to convert a set to a frozenset.
History
Date User Action Args
2017-08-17 13:55:20serhiy.storchakasetmessages: + msg300423
2017-08-11 14:49:02yselivanovsetmessages: + msg300164
2017-08-11 14:48:01yselivanovsetmessages: + msg300163
2017-08-11 09:24:27hayposetmessages: + msg300159
2017-08-11 07:10:53serhiy.storchakasetnosy: + rhettinger
2017-08-11 07:10:08serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg300155
2017-08-11 06:06:47inada.naokisetmessages: + msg300153
2017-08-10 21:46:08yselivanovsetmessages: + msg300123
2017-08-10 21:43:02hayposetmessages: + msg300120
2017-08-10 21:35:14yselivanovsetpull_requests: + pull_request3104
2017-08-10 21:31:33yselivanovcreate