This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Optimize some set operations in dictkeys object
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: methane, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2019-10-28 11:55 by methane, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 16961 merged methane, 2019-10-28 11:57
Messages (7)
msg355536 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-10-28 11:55
-, |, and ^ of dictkeys are implemented as:

PyObject *result = PySet_New(self);
// Call set.difference_update, set.update, set.symmetric_difference_update with other.

PySet_New(iterable) has optimized step for iterable is dict.
But since iterable is dictkeys, PyIter_Next() is called for all elements in the dict.

We can pass dict instead of dictkey object to PySet_New.

  $ ./python -m pyperf timeit -o patched.json -s 'k = dict.fromkeys("abcdefghijklmnopqrstuvwxyz").keys(); s={1,2,3}' -- 'k | {1,2,3}'
  $ ./python -m pyperf compare_to master.json patched.json
  Mean +- std dev: [master] 778 ns +- 17 ns -> [patched] 550 ns +- 24 ns: 1.42x faster (-29%)
msg355541 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-10-28 13:33
How does it work with dict subclasses?
msg355618 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-10-29 06:20
> How does it work with dict subclasses?

PySet_New(iterable) uses fast path only when `PyDict_CheckExact(iterable)` is true.

So there is no change for dict subclasses.
msg355619 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-10-29 06:45
But what if a dict subclass overrides __iter__? You can get different result if set(d) != set(d.keys()).
msg355662 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-10-29 16:29
Consider testing PyDict_CheckExact.
msg355695 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-10-30 07:39
done.
msg356195 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-11-07 15:59
New changeset 6cbc84fb99acb33dd659d7adb29a20adbe62b74a by Inada Naoki in branch 'master':
bpo-38613: Optimize set operations of dict keys. (GH-16961)
https://github.com/python/cpython/commit/6cbc84fb99acb33dd659d7adb29a20adbe62b74a
History
Date User Action Args
2022-04-11 14:59:22adminsetgithub: 82794
2019-11-07 15:59:26methanesetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-11-07 15:59:10methanesetmessages: + msg356195
2019-10-30 07:39:49methanesetmessages: + msg355695
2019-10-29 16:29:45rhettingersetnosy: + rhettinger
messages: + msg355662
2019-10-29 06:45:21serhiy.storchakasetmessages: + msg355619
2019-10-29 06:20:01methanesetmessages: + msg355618
2019-10-28 13:33:42serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg355541
2019-10-28 11:57:25methanesetkeywords: + patch
stage: patch review
pull_requests: + pull_request16489
2019-10-28 11:55:45methanecreate