Optimize iterating split table values #72766
Comments
Since the values of a split table are always dense, we can optimize the current implementation of _PyDict_Next. I doubt this brings much of a performance enhancement; more importantly, it emphasizes the invariant and makes bugs easier to find and test.
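As a rough Python model (not CPython code) of why the dense-values invariant helps: a split-table scan can stop at the first empty slot, while a combined-table scan must examine every slot because the table may contain holes. The names below are illustrative, not CPython's.

```python
# Toy model: `values` plays the role of a split table's values array,
# with None standing in for a NULL C pointer.

def iter_split_values(values):
    # Split tables are dense: once an empty slot appears, no later
    # slot holds a value, so the scan can stop immediately.
    for v in values:
        if v is None:
            break
        yield v

def iter_combined_values(values):
    # A combined table may contain holes left by deletions,
    # so every slot must be examined and empties skipped.
    for v in values:
        if v is not None:
            yield v
```

For a dense array both produce the same items; the split version simply never scans past the first empty slot.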
I agree that this is unlikely to have a visible effect, but the patch simplifies the code. LGTM. Similar changes can be applied to other iteration code: dictiter_iternextkey, dict_keys, etc.
I think this is too late for 3.6.
LGTM, too.
I agree. Xiang, would you update the patch?
Working on it.
Here is a new patch that applies the optimization to more places.
LGTM.
Updated the patch to use a more obvious way of comparing. This follows INADA's suggestion and makes the code more like other places. I think the performance question is better discussed in bpo-28397; it doesn't have a significant impact on this patch. I hope we can move forward and not block on it.
Could you please check how this change affects performance?
python -> unpatched, python3 -> patched

iterkeys (split):
    ./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i in range(10000)]' 'list(iter(ns.__dict__))'
    Median +- std dev: [python] 112 us +- 1 us -> [python3] 109 us +- 1 us: 1.03x faster

iterkeys (combined):
    ./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x for x in range(10000)}' 'list(iter(d))'
    Median +- std dev: [python] 84.3 us +- 2.4 us -> [python3] 86.0 us +- 3.5 us: 1.02x slower

pydict_next (split):
    ./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'from argparse import Namespace; ns = Namespace(); [setattr(ns, str(i), str(i)) for i in range(10000)]' 'repr(ns.__dict__)'
    Median +- std dev: [python] 1.85 ms +- 0.01 ms -> [python3] 1.85 ms +- 0.11 ms: 1.00x faster

pydict_next (combined):
    ./python3 -m perf timeit --compare-to /home/angwer/cpython/python -s 'd = {x:x for x in range(10000)}' 'repr(d)'
    Median +- std dev: [python] 1.99 ms +- 0.01 ms -> [python3] 1.87 ms +- 0.01 ms: 1.06x faster
It is not easy to benchmark PyDict_Next. dict.__repr__ does much harder work (calling repr() on keys and values, formatting strings). Using _testcapi.test_dict_iteration is perhaps the easiest way to measure the performance of pure PyDict_Next, but it includes creating and deallocating the dicts.
Yes, that's the point. I thought about exposing an API in _testcapimodule for the test, but I'm not really willing to do that, since I don't believe this patch brings any visible performance change.
I agree with you.
LGTM again for iterate_splittable_v2.patch. How about changing the issue title (and commit message)?
I believe the patch doesn't make iterating a split table slower, and likely makes it a little faster. The patch adds one additional check, but it removes other checks, so this is still an optimization. However, the last patch has undefined behavior.
(Off-topic note) For readability, I prefer this style:

    PyDictKeyEntry *entries = DK_ENTRIES(mp->ma_keys);
    while (i < n && entries[i].me_value == NULL)
        i++;

But because sizeof(PyDictKeyEntry) is not 2^n, indexing entries[i].me_value costs a multiplication on each iteration, while entry_ptr++ advances by a constant offset. So we should prefer the i++, entry_ptr++ style when iterating dict entries.
The following example causes a crash in a debug build:

    d = dict.fromkeys(range(100))
    for k in range(99):
        del d[k]
    it = iter(d)
    assert next(it) == 99
    d.clear()
    d[0] = None
    next(it)
Hmm, the resolution could be simple. But how about:

    >>> d = dict.fromkeys(range(100))
    >>> for k in range(98):
    ...     del d[k]
    ...
    >>> it = iter(d)
    >>> next(it)
    98
    >>> d.clear()
    >>> d[0] = 1
    >>> d[1] = 2
    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration

Actually we haven't exhausted the dict yet. Is it reasonable to expect the second next() to return 0? What I actually mean is: is it reasonable to expect len(dict) elements to be returned before StopIteration is raised, even if the dict is changed?
It is appropriate for iteration over a dict being modified to raise RuntimeError, produce fewer items, or even produce some items multiple times. What is not appropriate: a crash, a hang, or infinite iteration. There was a proposal to make this behavior more consistent (bpo-19332), but it was rejected. I would suggest just removing the assert() from your patch and addressing the undefined behavior in another issue.
Currently the dict iterator does not allow the size to change during iteration. This is stricter than the list iterator, but it still allows modification during iteration. Maybe we could deny all modification by checking dict->ma_version_tag, but that's irrelevant to this issue. Serhiy, this means the iternext* functions all have UB. This code has existed since 3.3. Do we really need to eliminate it now?
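The rules described here can be checked directly from Python: replacing a value mid-iteration is fine, while inserting a new key (changing the size) makes the next call to next() raise RuntimeError.

```python
# Replacing a value during iteration is allowed; changing the
# dict's size during iteration raises RuntimeError.
d = {1: 'a', 2: 'b', 3: 'c'}
it = iter(d)
first = next(it)

d[first] = 'replaced'   # same size: iteration continues fine
second = next(it)

d[99] = 'new'           # size changed under the iterator
try:
    next(it)
    raised = False
except RuntimeError:    # "dictionary changed size during iteration"
    raised = True
assert raised
```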
That's what v2 does. If there is another issue, let's leave _PyDict_Next to it as well.
In the current code there is no UB in _PyDict_Next().
v4 LGTM.
LGTM. I'll commit.
New changeset 3904194d06e6 by INADA Naoki in branch 'default': |
Thanks! |