dict: Optimize PyDict_GetItemString() #73481
PyDict_GetItemString() is heavily used, especially from keyword argument parsing. This benchmark is based on a8563ef0eb8a, so PyDict_GetItemString() calls for
$ ../python -m perf compare_to -G --min-speed 2 default.json patched.json
Slower (1):
- scimark_lu: 430 ms +- 21 ms -> 446 ms +- 23 ms: 1.04x slower (+4%)
Faster (11):
Benchmark hidden because not significant (52): ...
The performance difference of telco is a bit surprising. Until most common builtin functions are converted to FASTCALL, this patch has significant
It looks to me that PyDict_GetItemString(), PyObject_GetAttrString(), etc. are mainly for backward compatibility and for use in code that is not performance-critical. Performance-critical code caches string objects.

The only code that heavily used PyDict_GetItemString() was keyword argument parsing in PyArg_ParseTupleAndKeywords(). But this API was replaced with the more efficient _PyArg_ParseTupleAndKeywordsFast() and _PyArg_ParseStackAndKeywords() for internal use. I think something similar will be exposed as public API when it becomes mature enough. bpo-29029 made PyArg_ParseTupleAndKeywords() use PyDict_GetItemString() much less.

PyDict_GetItemString() can be used with non-ASCII C strings. They are decoded as UTF-8. The patch works incorrectly in this case.

I am afraid that adding more and more specialized code in Objects/dictobject.c can slow down other functions in this file. It also makes maintenance harder.
This patch checks whether the passed C string is ASCII or not. But I don't want to make dict too complex either. telco is even faster with bpo-29296.
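For readers unfamiliar with the ASCII check mentioned here, a minimal sketch follows. It is an illustration only, not the code from the patch: the helper name `is_ascii_cstring` is hypothetical, and the sketch assumes the key is a NUL-terminated C string. A string that passes the check needs no UTF-8 decoding; anything else has to go through the regular UTF-8 path.

```c
#include <stdbool.h>

/* Hypothetical helper (not taken from the patch): returns true when every
   byte of the NUL-terminated string is in the 7-bit ASCII range. */
static bool
is_ascii_cstring(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    for (; *p != '\0'; p++) {
        if (*p >= 0x80) {
            /* High bit set: this byte belongs to a multi-byte UTF-8
               sequence, so the string is not pure ASCII. */
            return false;
        }
    }
    return true;
}
```

Under these assumptions, `is_ascii_cstring("keyword")` is true, while a UTF-8 encoded key such as `"caf\xc3\xa9"` is rejected and would have to take the slower decoding path.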
Closing this issue for now, until a profiler shows me PyDict_XxxxxString.
I like Serhiy's rationale. We should try to avoid PyDict_GetItemString() whenever possible. If PyDict_GetItemString() becomes a clear bottleneck, we can discuss optimizing it again. But in the meantime, I prefer to use another rule: an optimization should not modify the behaviour of a function. That's why I didn't optimize OrderedDict.pop() yet (it changed the docstring; AC should be enhanced for this case). So yeah, let's close this one.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.