classification
Title: Optimize iterating split table values
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: inada.naoki, python-dev, serhiy.storchaka, vstinner, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-11-01 17:25 by xiang.zhang, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description Edit
_PyDict_Next.patch xiang.zhang, 2016-11-01 17:25 review
iterate_splittable.patch xiang.zhang, 2016-11-02 10:00 review
iterate_splittable_v2.patch xiang.zhang, 2016-11-02 15:17 review
iterate_splittable_v3.patch xiang.zhang, 2016-11-03 04:53 eliminate undefined behaviour review
iterate_splittable_v4.patch xiang.zhang, 2016-11-03 08:40 review
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017-03-31 16:36
Messages (26)
msg279884 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-01 17:25
Since values of split table is always dense, we can optimise the current implementation of _PyDict_Next. I think this could hardly bring much performance enhancement. More importantly, this emphasizes the invariant and make bugs easy to find and test.
msg279908 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 09:09
I agree that this unlikely could cause visible effect, but this patch simplifies the code. LGTM.

Similar changes can be applied to other iteration code. dictiter_iternextkey, dict_keys, etc.
msg279909 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 09:10
I think this is too late for 3.6.
msg279910 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2016-11-02 09:19
LGTM, too.

> Similar changes can be applied to other iteration code. dictiter_iternextkey, dict_keys, etc.

I agree. Xiang, would you update patch?
msg279912 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-02 09:28
> Xiang, would you update patch?

Working on it.
msg279916 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-02 10:00
Here is the new patch to apply the optimization to more places.
msg279919 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 10:56
LGTM.
msg279928 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-02 15:17
Update the patch to use more obvious comparison way. This uses INADA's suggestion and make the code more like other places.

I think the performance issue is better to be discussed in #28397. It doesn't have a significant impact on this patch. Hope we can move forward and not block on it.
msg279930 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 15:22
Could you please check how this change affects the performance?
msg279931 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-02 16:30
python -> unpatched, python3 -> patched

iterkeys:

(split)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i in range(10000)]' 'list(iter(ns.__dict__))'
python: ..................... 112 us +- 1 us
python3: ..................... 109 us +- 1 us

Median +- std dev: [python] 112 us +- 1 us -> [python3] 109 us +- 1 us: 1.03x faster

(combined)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x for x in range(10000)}' 'list(iter(d))'python: ..................... 84.3 us +- 2.4 us
python3: ..................... 86.0 us +- 3.5 us

Median +- std dev: [python] 84.3 us +- 2.4 us -> [python3] 86.0 us +- 3.5 us: 1.02x slower

pydict_next:

(split)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i in range(10000)]' 'repr(ns.__dict__)'
python: ..................... 1.85 ms +- 0.01 ms
python3: ..................... 1.85 ms +- 0.11 ms

Median +- std dev: [python] 1.85 ms +- 0.01 ms -> [python3] 1.85 ms +- 0.11 ms: 1.00x faster

(combined)

./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x for x in range(10000)}' 'repr(d)'
python: ..................... 1.99 ms +- 0.01 ms
python3: ..................... 1.87 ms +- 0.01 ms

Median +- std dev: [python] 1.99 ms +- 0.01 ms -> [python3] 1.87 ms +- 0.01 ms: 1.06x faster
msg279933 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 17:21
There is not easy to benchmark PyDict_Next. dict.__repr__ does much hard work (calling repr() for keys and values, formatting strings). Using _testcapi.test_dict_iteration perhaps is the easiest way to measure the performance of pure PyDict_Next, but it includes creating and deallocating of dicts.
msg279934 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-02 17:30
Yes, that's the point. I thought to expose an API in testcapimodule for test, but actually I am not willing to do that since I don't believe this patch could bring any visible performance change.
msg279935 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2016-11-02 18:18
> Xiang Zhang added the comment:
>
> Yes, that's the point. I thought to expose an API in testcapimodule for test, but actually I am not willing to do that since I don't believe this patch could bring any visible performance change.
>

I agree with you.
I feel this patch is cleanup rather than optimization.

* Removing redundant loop which always break at first
* Adding very useful assertion to find potential bugs
msg279936 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2016-11-02 18:26
LGTM again for iterate_splittable_v2.patch.

How about issue title (and commit message) to
"Cleanup iterating split table"?
msg279938 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-02 18:37
I believe the patch doesn't makes iterating split table slower and likely makes it a little faster. The patch adds one additional check, but it removes other checks. Thus this still is an optimization.

But the last patch has an undefined behavior.
msg279958 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2016-11-03 01:49
(Off topic note)

For readability, I prefer this style:

  PyDictKeyEntry *entries = DK_ENTRIES(mp->ma_keys);
  while (i < n && entries[i].me_value == NULL)
      i++;

But because sizeof(PyDictKeyEntry) is not 2^n, entries[i].me_value, i++
may be slower than entry_ptr->me_value, i++, entry_ptr++.

So we should prefer i++, entry_ptr++ style in case of iterating dict entries.
msg279966 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-03 06:50
Following example causes a crash in debug build:

d = dict.fromkeys(range(100))
for k in range(99):
    del d[k]

it = iter(d)
assert next(it) == 99
d.clear()
d[0] = None
next(it)
msg279968 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-03 07:26
Hmm, the resolution could be simple. But how about

>>> d = dict.fromkeys(range(100))
>>> for k in range(98):
...     del d[k]
... 
>>> it = iter(d)
>>> next(it)
98
>>> d.clear()
>>> d[0] = 1
>>> d[1] = 2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Actually we haven't exhaust the dict yet. Is it reasonable for me to expect the second next returning 0? I actually mean is it reasonable for me to expect len(dict) elements returned before StopIteration raised even if the dict is changed?
msg279970 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-03 08:08
It is appropriate if iterating modifying dict raises RuntimeError, produces less items or even produce some items multiple times. What is not appropriate -- crash, hang and infinite iteration. There was a proposition about making this behavior more consistent (issue19332), but it was rejected.

I would suggest just remove assert() from your patch and address undefined behavior in other issue.
msg279971 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-03 08:08
Currently dict iterator does not allow size changed during iteration. This is more strict than list iterator but still allow modification during iteration. Maybe we could deny all modification by checking dict->ma_version_tag. But that's irrelevant to this issue.

Serhiy, this means the iternext* functions all get UB. These codes existed even in 3.3. Do we really need to eliminate them now?
msg279973 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-03 08:14
> I would suggest just remove assert() from your patch and address undefined behavior in other issue.

That's what v2 does. If there is another issue, let's also leave _PyDict_Next to it.
msg279975 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-03 08:27
In current code there is no UB in _PyDict_Next().
msg279977 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-03 08:49
v4 LGTM.
msg280039 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2016-11-04 07:50
LGTM. I'll commit.
msg280040 - (view) Author: Roundup Robot (python-dev) Date: 2016-11-04 07:59
New changeset 3904194d06e6 by INADA Naoki in branch 'default':
Issue #28580: Optimize iterating split table values.
https://hg.python.org/cpython/rev/3904194d06e6
msg280041 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-11-04 08:02
Thanks!
History
Date User Action Args
2017-03-31 16:36:28dstufftsetpull_requests: + pull_request1015
2016-11-04 08:02:59xiang.zhangsetmessages: + msg280041
2016-11-04 08:00:15inada.naokisetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2016-11-04 07:59:26python-devsetnosy: + python-dev
messages: + msg280040
2016-11-04 07:50:04inada.naokisetmessages: + msg280039
2016-11-03 08:49:19serhiy.storchakasetmessages: + msg279977
2016-11-03 08:40:24xiang.zhangsetfiles: + iterate_splittable_v4.patch
2016-11-03 08:27:17serhiy.storchakasetmessages: + msg279975
2016-11-03 08:14:31xiang.zhangsetmessages: + msg279973
2016-11-03 08:08:39xiang.zhangsetmessages: + msg279971
2016-11-03 08:08:28serhiy.storchakasetmessages: + msg279970
2016-11-03 07:26:58xiang.zhangsetmessages: + msg279968
2016-11-03 06:50:12serhiy.storchakasetmessages: + msg279966
2016-11-03 04:53:56xiang.zhangsetfiles: + iterate_splittable_v3.patch
2016-11-03 01:49:04inada.naokisetmessages: + msg279958
2016-11-02 18:37:14serhiy.storchakasetmessages: + msg279938
2016-11-02 18:26:56inada.naokisetmessages: + msg279936
2016-11-02 18:18:12inada.naokisetmessages: + msg279935
2016-11-02 17:30:28xiang.zhangsetmessages: + msg279934
2016-11-02 17:21:41serhiy.storchakasetmessages: + msg279933
2016-11-02 16:31:00xiang.zhangsetmessages: + msg279931
2016-11-02 15:22:37serhiy.storchakasetmessages: + msg279930
2016-11-02 15:17:29xiang.zhangsetfiles: + iterate_splittable_v2.patch
nosy: + vstinner
messages: + msg279928

2016-11-02 10:56:17serhiy.storchakasetmessages: + msg279919
stage: patch review -> commit review
2016-11-02 10:00:03xiang.zhangsetfiles: + iterate_splittable.patch

messages: + msg279916
2016-11-02 09:28:19xiang.zhangsetmessages: + msg279912
title: Optimize _PyDict_Next for split table -> Optimize iterating split table values
2016-11-02 09:19:36inada.naokisetmessages: + msg279910
2016-11-02 09:10:10serhiy.storchakasettype: behavior -> enhancement
messages: + msg279909
versions: - Python 3.6
2016-11-02 09:09:13serhiy.storchakasetmessages: + msg279908
2016-11-01 18:37:41xiang.zhanglinkissue28199 dependencies
2016-11-01 17:26:19xiang.zhangsettitle: Optimise _PyDict_Next for split table -> Optimize _PyDict_Next for split table
components: + Interpreter Core
versions: + Python 3.6, Python 3.7
2016-11-01 17:25:08xiang.zhangcreate