Title: Faster parsing keyword arguments
msg270832 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-19 16:03
Parsing keyword arguments is much slower than parsing positional arguments. Parsing time can be larger than the useful execution time.

$ ./python -m timeit "b'a:b:c'.split(b':', 1)"
1000000 loops, best of 3: 0.638 usec per loop
$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 1.64 usec per loop
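
The same comparison can be run from a script instead of the command line. The following is only a minimal sketch: the helper names `best_usec` and `compare` and the loop counts are made up for illustration, and the absolute numbers depend on the build and the machine.

```python
import timeit

# Same call as in the report, once positional and once with a keyword.
POSITIONAL = "b'a:b:c'.split(b':', 1)"
KEYWORD = "b'a:b:c'.split(b':', maxsplit=1)"


def best_usec(stmt, number=200_000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number * 1e6


def compare():
    """Time both spellings of the call and return (positional, keyword)."""
    return best_usec(POSITIONAL), best_usec(KEYWORD)


if __name__ == "__main__":
    pos, kw = compare()
    print(f"positional: {pos:.3f} usec per call")
    print(f"keyword:    {kw:.3f} usec per call")
    print(f"ratio:      {kw / pos:.2f}x")
```

Taking the minimum over several repeats mirrors the "best of 3" that the timeit command line reports.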

The main culprit is that Python strings are created for every keyword name on every call.

The proposed patch adds an alternative API that caches keyword names as Python strings in a special object. Argument Clinic is changed to use this API in generated files. The effect of the optimization:

$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 0.826 usec per loop

Invocations of PyArg_ParseTupleAndKeywords() in non-generated code are kept, since the new API is not stable yet. Later I'm going to cache parsed format strings and speed up parsing of positional arguments too.
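
To illustrate the caching idea only (this is a toy model in Python, not the C code from the patch), the sketch below keeps keyword names as raw bytes, standing in for the C strings, and converts them to Python strings once, on the first call. The class name `CachedKeywordParser` and its attributes are hypothetical.

```python
import sys


class CachedKeywordParser:
    """Toy model: keyword names become Python strings once, then are reused."""

    def __init__(self, names):
        self._raw_names = tuple(names)  # stands in for the C char* array
        self._kwtuple = None            # cached tuple of str, built lazily
        self.init_count = 0             # how many times the names were built

    def _init(self):
        # Build the string objects only on first use; later calls reuse them.
        if self._kwtuple is None:
            self._kwtuple = tuple(
                sys.intern(name.decode("ascii")) for name in self._raw_names
            )
            self.init_count += 1
        return self._kwtuple

    def parse(self, args, kwargs):
        """Bind positional and keyword arguments to the declared names."""
        names = self._init()
        if len(args) > len(names):
            raise TypeError("too many positional arguments")
        values = dict(zip(names, args))
        for key, value in kwargs.items():
            if key not in names:
                raise TypeError(f"unexpected keyword argument {key!r}")
            if key in values:
                raise TypeError(f"argument {key!r} given twice")
            values[key] = value
        return values


parser = CachedKeywordParser([b"sep", b"maxsplit"])
first = parser.parse((b":",), {"maxsplit": 1})
second = parser.parse((b":", 1), {})
```

After both calls `parser.init_count` is still 1: the cost of creating the name strings is paid once per parser object rather than once per call.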
msg270889 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-07-20 18:31
I haven't reviewed the patch, but the idea is great as I know one of Larry's hopes of using Argument Clinic was to allow for this kind of speed-up.
msg271834 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-02 18:05
msg271928 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-03 20:39
The updated patch addresses Antoine's comments. All checks of the format string are moved into parser_init.

I experimented with Antoine's idea of making vgetargskeywords a simple wrapper around vgetargskeywordsfast with a one-shot parser, but this slows down parsing of positional arguments too much (due to creating Python strings for unused keyword names).
msg272201 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-09 00:02
See also the old issue #17170 "string method lookup is too slow".
msg272231 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-09 10:37
Indeed, this issue was first discussed in issue17170.
msg272411 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-08-11 07:50
Normally, LGTM is an almost useless comment, but the patch does in fact look good to me.  I like how compact and straight-forward the changes are to the individual parsing calls.
msg272656 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-14 07:53
New changeset e527715bd0b3 by Serhiy Storchaka in branch 'default':
Issue #27574: Decreased an overhead of parsing keyword arguments in functions
msg272930 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-17 12:34
The issue can now be closed, no?
msg272979 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-17 18:18
I left this issue open for three reasons.

1. I had ideas and an almost finished patch for a different optimization. Unfortunately my hope was not justified: the new implementation is slower. If I fail to fix it in a few days, I'll close the issue.

2. For bikeshedding, in case somebody wants to suggest different names or a different interface.

3. I was going to convert most occurrences of PyArg_ParseTupleAndKeywords() to Argument Clinic to achieve a larger effect from this optimization. But that patch turned out larger than I expected.
msg272981 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-08-17 18:30
I think converting uses to Argument Clinic can be done more iteratively, on a per-module basis. How many modules do we have left to convert? If it isn't ridiculously huge we could open individual issues to convert each of them.
msg272983 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-17 18:47
Yes, I came to the conclusion that we need to push the existing issues for separate files. I'm sure there are ready patches waiting for review. Now there is an additional reason for converting to Argument Clinic. But some files contain only one PyArg_ParseTupleAndKeywords() call; I think we can convert them in one patch.
msg283516 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-17 20:40
Just for the record, there are two alternative patches. They unpack keyword arguments into a linear array. I expected this approach to allow more optimization, but in practice the benefit is too small or negative.
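
The alternative patches are not shown in this thread. Purely as an illustration of what "unpacking keyword arguments into a linear array" could mean, here is a hypothetical Python sketch: each declared parameter gets one slot, positional values fill the leading slots, and keyword values are placed by name. The function name `unpack_to_array` and the `MISSING` sentinel are assumptions, not names from the patches.

```python
MISSING = object()  # sentinel for a slot that received no value


def unpack_to_array(names, args, kwargs):
    """Return a flat list with one slot per declared parameter name."""
    if len(args) > len(names):
        raise TypeError("too many positional arguments")
    slots = [MISSING] * len(names)
    # Positional values occupy the first len(args) slots, in order.
    slots[:len(args)] = args
    for key, value in kwargs.items():
        try:
            index = names.index(key)
        except ValueError:
            raise TypeError(f"unexpected keyword argument {key!r}") from None
        if slots[index] is not MISSING:
            raise TypeError(f"argument {key!r} given twice")
        slots[index] = value
    return slots


NAMES = ("sep", "maxsplit")
mixed = unpack_to_array(NAMES, (b":",), {"maxsplit": 1})
keyword_only = unpack_to_array(NAMES, (), {"maxsplit": 1})
```

With such a layout, later conversion code could index into the array instead of doing a dictionary lookup per parameter; whether that pays off is exactly what the measurements above call into question.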
