Message 270832 - Python tracker


Author	serhiy.storchaka
Recipients	serhiy.storchaka
Date	2016-07-19.16:03:48
Message-id	<1468944235.85.0.107735525995.issue27574@psf.upfronthosting.co.za>
In-reply-to

Content
Parsing keyword arguments is much slower than parsing positional arguments. The parsing time can be larger than the useful execution time.

$ ./python -m timeit "b'a:b:c'.split(b':', 1)"
1000000 loops, best of 3: 0.638 usec per loop
$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 1.64 usec per loop
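The same comparison can be reproduced from within Python using the standard `timeit` module. This is only a rough sketch: the loop count and the variable names are arbitrary choices for illustration, and absolute numbers will differ between builds and machines.

```python
import timeit

# Arbitrary loop count chosen for this sketch; the command-line runs above
# used 1000000 loops.
LOOPS = 200_000

# Same two statements as in the command-line measurements above.
positional_stmt = "b'a:b:c'.split(b':', 1)"
keyword_stmt = "b'a:b:c'.split(b':', maxsplit=1)"

positional_total = timeit.timeit(positional_stmt, number=LOOPS)
keyword_total = timeit.timeit(keyword_stmt, number=LOOPS)

# Convert total seconds into microseconds per call.
positional_usec = positional_total / LOOPS * 1e6
keyword_usec = keyword_total / LOOPS * 1e6

print(f"positional: {positional_usec:.3f} usec per loop")
print(f"keyword:    {keyword_usec:.3f} usec per loop")
```

Both statements return the same list, so any difference in timing comes from how the arguments are passed and parsed.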

The main culprit is that Python strings are created for every keyword name on every call.

The proposed patch adds an alternative API that caches keyword names as Python strings in a special object. Argument Clinic is changed to use this API in the files it generates. The effect of the optimization:

$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 0.826 usec per loop
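The caching idea described above can be modeled in pure Python. This is a conceptual sketch only, not the C API from the patch: the class `KeywordNameCache` and the functions `parse_uncached` and `parse_cached` are hypothetical names invented for illustration. Byte strings stand in for the C keyword names, and decoding them stands in for creating Python strings.

```python
class KeywordNameCache:
    """Hypothetical stand-in for the 'special object' holding keyword names."""

    def __init__(self, raw_names):
        self.raw_names = raw_names   # bytes model the C-level keyword names
        self._names = None           # Python strings, created lazily
        self.conversions = 0         # how many times the strings were built

    def names(self):
        # Build the Python strings on first use only, then reuse them.
        if self._names is None:
            self._names = tuple(n.decode("ascii") for n in self.raw_names)
            self.conversions += 1
        return self._names


def parse_uncached(raw_names, kwargs):
    # Models the slow path: a new str per keyword name on every call.
    return [kwargs.get(n.decode("ascii")) for n in raw_names]


def parse_cached(cache, kwargs):
    # Models the patched path: the strings come from the cache.
    return [kwargs.get(name) for name in cache.names()]


raw = (b"sep", b"maxsplit")
cache = KeywordNameCache(raw)
results = [parse_cached(cache, {"maxsplit": 1}) for _ in range(3)]
```

After three calls through `parse_cached`, `cache.conversions` is still 1, whereas `parse_uncached` would decode every name on each call.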

Invocations of PyArg_ParseTupleAndKeywords() in non-generated code are kept, since the new API is not stable yet. Later I'm going to cache parsed format strings and speed up parsing of positional arguments too.

History
Date	User	Action	Args
2016-07-19 16:03:58	serhiy.storchaka	set	recipients: + serhiy.storchaka
2016-07-19 16:03:55	serhiy.storchaka	set	messageid: <1468944235.85.0.107735525995.issue27574@psf.upfronthosting.co.za>
2016-07-19 16:03:55	serhiy.storchaka	link	issue27574 messages
2016-07-19 16:03:55	serhiy.storchaka	create