classification
Title: dict: Optimize PyDict_GetItemString()
Type: performance Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: inada.naoki, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2017-01-17 10:47 by inada.naoki, last changed 2017-01-17 13:27 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
dict_getitemascii.patch inada.naoki, 2017-01-17 10:47 review
Messages (5)
msg285631 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2017-01-17 10:47
PyDict_GetItemString() is heavily used, especially in keyword argument parsing.
The current implementation creates a temporary string object for the key.
This patch avoids the temporary key string when the passed C string is ASCII.

This benchmark is based on a8563ef0eb8a, so the PyDict_GetItemString() calls made while
parsing positional arguments have already been reduced.


$ ../python -m perf compare_to -G --min-speed 2 default.json patched.json
Slower (1):
- scimark_lu: 430 ms +- 21 ms -> 446 ms +- 23 ms: 1.04x slower (+4%)

Faster (11):
- telco: 24.2 ms +- 0.4 ms -> 21.8 ms +- 0.7 ms: 1.11x faster (-10%)
- xml_etree_parse: 315 ms +- 17 ms -> 302 ms +- 14 ms: 1.04x faster (-4%)
- logging_simple: 31.6 us +- 0.3 us -> 30.4 us +- 0.3 us: 1.04x faster (-4%)
- mako: 41.6 ms +- 0.7 ms -> 40.3 ms +- 0.4 ms: 1.03x faster (-3%)
- logging_format: 36.5 us +- 0.3 us -> 35.5 us +- 0.4 us: 1.03x faster (-3%)
- float: 297 ms +- 4 ms -> 289 ms +- 4 ms: 1.03x faster (-3%)
- scimark_monte_carlo: 276 ms +- 10 ms -> 269 ms +- 7 ms: 1.02x faster (-2%)
- regex_effbot: 5.31 ms +- 0.37 ms -> 5.19 ms +- 0.06 ms: 1.02x faster (-2%)
- pickle_pure_python: 1.32 ms +- 0.02 ms -> 1.29 ms +- 0.02 ms: 1.02x faster (-2%)
- scimark_sor: 525 ms +- 9 ms -> 514 ms +- 8 ms: 1.02x faster (-2%)
- richards: 180 ms +- 3 ms -> 176 ms +- 2 ms: 1.02x faster (-2%)

Benchmark hidden because not significant (52): ...


The performance difference for telco is a bit surprising.
The profiler shows the difference comes from `print(t, file=outfil)` (here: https://github.com/python/performance/blob/master/performance/benchmarks/bm_telco.py#L79 )

Until most common builtin functions are converted to FASTCALL, this patch gives a significant
performance gain.
msg285633 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-17 11:54
It looks to me that PyDict_GetItemString(), PyObject_GetAttrString(), etc. are mainly for backward compatibility and for use in code that is not performance-critical. Performance-critical code caches string objects.

The only code that heavily used PyDict_GetItemString() was the keyword argument parsing in PyArg_ParseTupleAndKeywords(). But this API was replaced with the more efficient _PyArg_ParseTupleAndKeywordsFast() and _PyArg_ParseStackAndKeywords() for internal use. I think something similar will be exposed as public API when it becomes mature enough. Issue29029 made PyArg_ParseTupleAndKeywords() use PyDict_GetItemString() much less.

PyDict_GetItemString() can be used with non-ASCII C strings. They are decoded with UTF-8. The patch works incorrectly in this case.
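
To make the non-ASCII concern concrete, here is a minimal Python model of the lookup semantics. It is only a sketch and does not reproduce dict_getitemascii.patch. All function names are hypothetical, and `bytes` stands in for the C string. It assumes a byte-wise shortcut that treats each byte as one character, which matches UTF-8 decoding only for pure-ASCII input.

```python
def is_ascii(raw: bytes) -> bool:
    """Return True when every byte is 7-bit ASCII."""
    return all(b < 0x80 for b in raw)


def get_item_utf8(mapping: dict, raw: bytes):
    """Model of the existing behaviour: decode the C string as UTF-8,
    which builds a temporary str key, then look it up."""
    return mapping.get(raw.decode("utf-8"))


def get_item_bytewise(mapping: dict, raw: bytes):
    """Hypothetical shortcut: treat each byte as one character.
    This agrees with UTF-8 decoding only for pure-ASCII input."""
    return mapping.get(raw.decode("latin-1"))


def get_item_guarded(mapping: dict, raw: bytes):
    """Take the shortcut only when it is safe, else fall back to UTF-8."""
    if is_ascii(raw):
        return get_item_bytewise(mapping, raw)
    return get_item_utf8(mapping, raw)


table = {"file": 1, "caf\u00e9": 2}
ascii_key = b"file"
utf8_key = "caf\u00e9".encode("utf-8")  # two bytes encode the last character
```

In this model an unguarded byte-wise lookup misses the non-ASCII key, while the guarded version returns the same result as UTF-8 decoding for both keys.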

I am afraid that adding more and more specialized code to Objects/dictobject.c can slow down other functions in this file. It also makes maintenance harder.
msg285635 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2017-01-17 12:07
This patch checks whether the passed C string is ASCII or not.

But I don't want to make dict too complex either. telco gets even faster with issue29296.
Once it is merged, most common builtin functions are no longer METH_KEYWORDS.
msg285636 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2017-01-17 12:11
Close this issue for now, until profiler shows me PyDict_XxxxxString.
msg285639 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-17 13:27
> Close this issue for now, until profiler shows me PyDict_XxxxxString.

I like Serhiy's rationale. We should try to avoid PyDict_GetItemString() whenever possible. If PyDict_GetItemString() becomes a clear bottleneck, we can discuss optimizing it again. But in the meantime, I prefer to use another rule: an optimization should not modify the behaviour of a function. That's why I didn't optimize OrderedDict.pop() yet (it changed the docstring, AC should be enhanced for this case).

So yeah, let's close this one.
History
Date User Action Args
2017-01-17 13:27:12 vstinner set messages: + msg285639
2017-01-17 12:11:06 inada.naoki set status: open -> closed
resolution: rejected
messages: + msg285636
2017-01-17 12:07:43 inada.naoki set messages: + msg285635
2017-01-17 11:54:56 serhiy.storchaka set nosy: + rhettinger, vstinner, serhiy.storchaka
messages: + msg285633
2017-01-17 10:47:42 inada.naoki create