classification
Title: Faster parsing keyword arguments
Type: performance
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: brett.cannon, gregory.p.smith, haypo, larry, pitrou, python-dev, rhettinger, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2016-07-19 16:03 by serhiy.storchaka, last changed 2016-12-17 20:41 by serhiy.storchaka. This issue is now closed.

Files
File name (uploaded by, date)
faster_keyword_args_parse.patch (serhiy.storchaka, 2016-07-19 16:03)
faster_keyword_args_parse_2.patch (serhiy.storchaka, 2016-08-03 20:39)
faster_keyword_args_parse_alt.patch (serhiy.storchaka, 2016-12-17 20:40)
faster_keyword_args_parse_alt2.patch (serhiy.storchaka, 2016-12-17 20:41)
Messages (13)
msg270832 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-07-19 16:03
Parsing keyword arguments is much slower than parsing positional arguments. The parsing time can be larger than the useful execution time.

$ ./python -m timeit "b'a:b:c'.split(b':', 1)"
1000000 loops, best of 3: 0.638 usec per loop
$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 1.64 usec per loop

The main culprit is that Python strings are created for every keyword name on every call.
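
The comparison above can also be reproduced from inside Python. The following is a minimal sketch using the standard timeit module; the helper name time_call and the loop counts are illustrative assumptions, and the absolute numbers will differ between builds and machines.

```python
import timeit


def time_call(stmt, number=100_000, repeat=3):
    """Return the best observed time per call, in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(stmt, number=number, repeat=repeat))
    return best / number * 1e6


# Same call, once with a positional and once with a keyword argument.
positional = time_call("b'a:b:c'.split(b':', 1)")
keyword = time_call("b'a:b:c'.split(b':', maxsplit=1)")

print(f"positional: {positional:.3f} usec per call")
print(f"keyword:    {keyword:.3f} usec per call")
print(f"ratio:      {keyword / positional:.2f}x")
```

On interpreters that already include this optimization the ratio is expected to be much closer to 1 than in the measurements quoted above.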

The proposed patch adds an alternative API that caches keyword names as Python strings in a special object. Argument Clinic is changed to use this API in generated files. The effect of the optimization:

$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 0.826 usec per loop
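
The caching idea can be illustrated with a toy model in pure Python. This is only a sketch under stated assumptions: the class KeywordParser and its methods are hypothetical and are not the C API added by the patch. Byte strings stand in for the C keyword names, and a counter records how many Python strings are created.

```python
class KeywordParser:
    """Toy per-call-site parser object (hypothetical, not the patch's C API)."""

    def __init__(self, raw_names):
        self.raw_names = tuple(raw_names)  # stand-ins for C `char *` keyword names
        self.cached_names = None           # filled in on first cached use
        self.conversions = 0               # number of str objects created so far

    def _convert(self):
        # Simulates creating a Python str for every keyword name.
        self.conversions += len(self.raw_names)
        return tuple(raw.decode("ascii") for raw in self.raw_names)

    def parse_uncached(self, args, kwargs):
        # Old behaviour in this model: names are converted on every call.
        return self._bind(self._convert(), args, kwargs)

    def parse_cached(self, args, kwargs):
        # New behaviour in this model: names are converted once and reused.
        if self.cached_names is None:
            self.cached_names = self._convert()
        return self._bind(self.cached_names, args, kwargs)

    @staticmethod
    def _bind(names, args, kwargs):
        if len(args) > len(names):
            raise TypeError("too many positional arguments")
        values = dict(zip(names, args))
        for key, value in kwargs.items():
            if key not in names:
                raise TypeError(f"unexpected keyword argument {key!r}")
            if key in values:
                raise TypeError(f"multiple values for argument {key!r}")
            values[key] = value
        return values


parser = KeywordParser([b"sep", b"maxsplit"])
for _ in range(1000):
    parser.parse_uncached((b":",), {"maxsplit": 1})
uncached = parser.conversions

parser.conversions = 0
for _ in range(1000):
    parser.parse_cached((b":",), {"maxsplit": 1})
cached = parser.conversions

print(uncached, cached)  # prints "2000 2"
```

In this model the uncached path creates two strings per call, while the cached path creates two strings in total, which is the saving the patch aims for.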

Invocations of PyArg_ParseTupleAndKeywords() in non-generated code are kept, since the new API is not stable yet. Later I'm going to cache parsed format strings and speed up parsing of positional arguments too.
msg270889 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2016-07-20 18:31
I haven't reviewed the patch, but the idea is great as I know one of Larry's hopes of using Argument Clinic was to allow for this kind of speed-up.
msg271834 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-08-02 18:05
Ping.
msg271928 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-08-03 20:39
The updated patch addresses Antoine's comments. All checks of the format string are moved into parser_init.

I experimented with Antoine's idea of making vgetargskeywords a simple wrapper around vgetargskeywordsfast with a one-shot parser, but this slows down parsing of positional arguments too much (due to creating Python strings for unused keyword names).
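
The cost described above can be sketched with a toy model in Python. The functions parse_direct and parse_via_one_shot are hypothetical and only count simulated string creations; they assume, for illustration, that the direct path converts a name only when keyword arguments were actually passed.

```python
stats = {"conversions": 0}


def to_str(raw_name):
    """Simulate creating a Python str from a C keyword name."""
    stats["conversions"] += 1
    return raw_name.decode("ascii")


def parse_direct(names, args, kwargs):
    """Convert a name only when a keyword lookup is actually needed."""
    values = list(args)
    for raw in names[len(args):]:
        if kwargs:
            key = to_str(raw)
            if key in kwargs:
                values.append(kwargs[key])
    return values


def parse_via_one_shot(names, args, kwargs):
    """Build a throwaway parser first: every name is converted up front."""
    converted = [to_str(raw) for raw in names]
    values = list(args)
    for key in converted[len(args):]:
        if key in kwargs:
            values.append(kwargs[key])
    return values


NAMES = (b"sep", b"maxsplit")

# A purely positional call: no keyword name is needed at all.
stats["conversions"] = 0
direct_result = parse_direct(NAMES, (b":", 1), {})
direct_cost = stats["conversions"]

stats["conversions"] = 0
one_shot_result = parse_via_one_shot(NAMES, (b":", 1), {})
one_shot_cost = stats["conversions"]

print(direct_cost, one_shot_cost)  # prints "0 2"
```

Both paths return the same values, but in this model the one-shot wrapper pays for every keyword name even on calls that pass none.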
msg272201 - Author: STINNER Victor (haypo) (Python committer) Date: 2016-08-09 00:02
See also the old issue #17170 "string method lookup is too slow".
msg272231 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-08-09 10:37
Indeed, this issue was first discussed in issue17170.
msg272411 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2016-08-11 07:50
Normally, LGTM is an almost useless comment, but the patch does in fact look good to me. I like how compact and straightforward the changes are to the individual parsing calls.
msg272656 - Author: Roundup Robot (python-dev) Date: 2016-08-14 07:53
New changeset e527715bd0b3 by Serhiy Storchaka in branch 'default':
Issue #27574: Decreased an overhead of parsing keyword arguments in functions
https://hg.python.org/cpython/rev/e527715bd0b3
msg272930 - Author: STINNER Victor (haypo) (Python committer) Date: 2016-08-17 12:34
The issue can now be closed, no?
msg272979 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-08-17 18:18
I left this issue open for three reasons.

1. I had ideas and an almost finished patch for a different optimization. Unfortunately my hopes were not justified: the new implementation is slower. If I fail to fix it in a few days, I'll close the issue.

2. For bikeshedding, in case somebody wants to suggest different names or a different interface.

3. I was going to convert most occurrences of PyArg_ParseTupleAndKeywords() to Argument Clinic to achieve a larger effect from this optimization. But that patch turned out larger than I expected.
msg272981 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2016-08-17 18:30
I think for converting uses to Argument Clinic it can be done in a more iterative process on a per-module basis. How many modules do we have left to convert? If it isn't ridiculously huge we could open individual issues to convert them each.
msg272983 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-08-17 18:47
Yes, I came to the conclusion that we need to push the existing issues for separate files. I'm sure there are ready patches waiting for review. Now there is an additional reason for converting to Argument Clinic. But some files contain only one PyArg_ParseTupleAndKeywords() call; I think we can convert those in one patch.
msg283516 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2016-12-17 20:40
Just for the record, there are two alternative patches. They unpack keyword arguments into a linear array. I expected this approach to allow more optimization, but the benefit is actually too small or negative.
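
The general idea of unpacking into a linear array can be sketched in Python as follows. This is an illustration only, not the code of the alternative patches: the function unpack_to_array and the MISSING sentinel are hypothetical names, and each parameter simply gets one slot indexed by its position in the keyword list.

```python
MISSING = object()  # marks a parameter that was not supplied (hypothetical sentinel)


def unpack_to_array(names, args, kwargs):
    """Place positional and keyword arguments into one slot per parameter."""
    if len(args) > len(names):
        raise TypeError("too many positional arguments")
    slots = [MISSING] * len(names)
    slots[:len(args)] = args  # positional arguments fill the leading slots
    index = {name: i for i, name in enumerate(names)}
    for key, value in kwargs.items():
        i = index.get(key)
        if i is None:
            raise TypeError(f"unexpected keyword argument {key!r}")
        if i < len(args):
            raise TypeError(f"argument {key!r} given by name and position")
        slots[i] = value
    return slots


NAMES = ("sep", "maxsplit")
mixed = unpack_to_array(NAMES, (b":",), {"maxsplit": 1})
keyword_only = unpack_to_array(NAMES, (), {"maxsplit": 1})

print(mixed)                       # prints "[b':', 1]"
print(keyword_only[0] is MISSING)  # prints "True"
```

After this step a converter can walk the slots in order without further dictionary lookups, which is where the hoped-for gain would have come from.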
History
Date, User, Action, Args
2016-12-17 20:41:04, serhiy.storchaka, set, status: open -> closed; assignee: serhiy.storchaka; resolution: fixed; files: + faster_keyword_args_parse_alt2.patch; stage: resolved
2016-12-17 20:40:32, serhiy.storchaka, set, files: + faster_keyword_args_parse_alt.patch; messages: + msg283516
2016-08-17 18:47:08, serhiy.storchaka, set, messages: + msg272983
2016-08-17 18:30:03, brett.cannon, set, messages: + msg272981
2016-08-17 18:18:16, serhiy.storchaka, set, messages: + msg272979
2016-08-17 12:34:59, haypo, set, messages: + msg272930
2016-08-14 07:53:05, python-dev, set, nosy: + python-dev; messages: + msg272656
2016-08-12 06:20:03, gregory.p.smith, set, nosy: + gregory.p.smith
2016-08-11 07:50:30, rhettinger, set, nosy: + rhettinger; messages: + msg272411
2016-08-09 10:37:38, serhiy.storchaka, set, messages: + msg272231
2016-08-09 00:02:07, haypo, set, nosy: + haypo; messages: + msg272201
2016-08-03 20:40:05, serhiy.storchaka, set, files: + faster_keyword_args_parse_2.patch; messages: + msg271928
2016-08-03 07:52:14, pitrou, set, nosy: + pitrou
2016-08-02 18:05:02, serhiy.storchaka, set, messages: + msg271834
2016-07-20 18:31:08, brett.cannon, set, nosy: + brett.cannon, larry; messages: + msg270889
2016-07-19 16:03:55, serhiy.storchaka, create