This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Speed up PyArg_ParseTupleAndKeywords() and improve error msg
Type: Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: christian.heimes, nnorwitz, paulhankin, rupole
Priority: normal Keywords: patch

Created on 2007-03-30 06:45 by rupole, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
getargs.c.patch rupole, 2007-03-30 06:45 Patch for getargs.c
getargs_1.patch rupole, 2007-04-04 05:26 Revised patch with suggested changes
_testcapimodule.patch rupole, 2007-04-04 05:39 Add a function to test PyArg_ParseTupleAndKeywords
test_getargs2.patch rupole, 2007-04-04 09:40 Unit tests for keyword args
trunk_getargs_speedup.patch christian.heimes, 2008-02-25 14:42
trunk_getargs_speedup2.patch christian.heimes, 2008-02-26 08:51
Messages (13)
msg52342 - (view) Author: Roger Upole (rupole) Date: 2007-03-30 06:45
This is a fix for [ 1283289 ] PyArg_ParseTupleAndKeywords gives misleading error message.  It also yields a 10-15% decrease in cpu usage, depending on the number of arguments.
msg52343 - (view) Author: Paul Hankin (paulhankin) Date: 2007-04-01 18:00
Patched code compiles and passes test suite, and looks cleaner than the
code it replaces. It works on the cases the bug report.

There should be unit tests added to Lib/test/test_getargs2; I've skimmed
the tests in there and can't see any that test combinations of kw, tuple
and positional arguments. The patch replaces significant chunks of arg
parsing code, so I'd be more convinced of its correctness if there were
some unit tests. Certainly there should be at least one test for the
bug the patch fixes.

The code is 99% written in the prevailing style, but:
Source contains several over-long lines, and non-standard comments, ie
you should use
 /* XXX something .. */
and not
 /* ??? something .. ??? */
Also there are a few minor spacing inconsistencies which would be nice
to tidy up.

The diff is one level too high up - it contains the directory Python2.5/
It applies cleanly to the trunk though, it just makes it slightly more difficult
to apply the patch.

The original bug report requests a post to python-dev describing the new
error messages. I suggest this post includes all cases where
the new code produces a different message than the old.

I suggest to proceed:
1. Add unit tests, both for the bug-fix and for existing functionality
   which has been modified
2. Fix up comments and long lines
3. Check with python-dev about new error messages
msg52344 - (view) Author: Roger Upole (rupole) Date: 2007-04-04 05:26
File Added: getargs_1.patch
msg52345 - (view) Author: Roger Upole (rupole) Date: 2007-04-04 05:39
File Added: _testcapimodule.patch
msg52346 - (view) Author: Roger Upole (rupole) Date: 2007-04-04 09:40
File Added: test_getargs2.patch
msg52347 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2007-04-19 06:37
The patch to the test code looks fine, although there are some style issues.  Some lines over 80 columns and there needs to be spaces around = and other binary operators.

I need to review the C code changes, but this patch looks interesting.  

Roger, in the future, can you make a single patch with all the files.  It's easier to download and apply one patch with multiple files than 3 diff patches.
msg59355 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-06 11:57
Can you review the patch and commit it for 2.6? A patch which cleans up
the code *and* make it faster is always a good idea. :)
msg62969 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-02-25 04:38
Christian,

Could you clean this patch up?  Specifically:

 * Put everything into one patch
 * Eliminate unnecessary changes (changing variable name or whitespace)
 * Conform to the style in the file
 * Verify all the tests run with regrtest.py -u all  when built
--without-pydebug
 * Verify it runs faster
msg62983 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-25 14:42
I did some cleanup (style, var names, <80 chars per line) and combined
the patch set into a single patch. The regression tests are passing for
a pydebug build. I'm too busy to profile and test the patch with a
vanilla Python right now.

Pybench is showing a small speedup (first run with patch, second run
without patch): 26498ms 25920ms   +2.2% 28000ms 26888ms   +4.1%
msg63028 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-02-26 05:38
I looked over the new patch Christian uploaded and I think I understand
what's going on.  I didn't do a through comparison, but if all the tests
pass, I think that's good enough.  Good work!

Here are the issues I would like to see fixed before check in (all small
and easy):
 * Fix the format nits.  There were missing spaces around = and ==.
 * Remove the comment that says:
+				 * XXX TypeError doesn't seem right for this,
+				 * maybe create an ArgumentError ???
+				 * XXX Arg positions currently reported as
+				 * 1-based, contrary to most things in
+				 * python ???

    The first line of that comment should stay.
 * Expand the comment that says: Maybe skip this in debug build ?
    Add a XXX or TODO and explain why.  I think it's just that it would
give extra testing.  I'm not sure it's worth it since that would be
added code, but it's a valid question.

 * Verify that it's faster by compiling python in a release build (ie,
optimized, not debug with assertions) and do something that would
trivially execute the code.  For example:

  * ''.split()
  * ''.split('', 1)
  * (1,).index(1)
  * (1,).index(1, 0, 1)

That might be a decent set.  I just grepped for PyArg_ParseTuple
Objects/*.c and found some common cases where this code would be used. 
It would be great to add the timing info from the test cases above (as
well as the pybench results) to the check in message.

 * In the test do a single import at the top of the file.
 * Use raise Error('') instead of raise Error, '' in the test.  Actually
instead of raising an exception, call self.fail(error_message)
msg63036 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-26 08:51
> * Fix the format nits.  There were missing spaces around = and ==.

Oh, I missed the macro above the function. :]

> * Verify that it's faster by compiling python in a release build 

It's roughly the same speed. :/ The variation between timeit runs is
quite high on my machine. The new code is neither significant faster nor
noticeable slower.

./python Lib/timeit.py "for i in xrange(100): ''.split(); ''.split(' ',
1); (1,).index(1); (1,).index(1, 0, 1)"
1000 loops, best of 3: 342 usec per loop

The timing distributes about +-15usec with 400+ every now and then.

./python Lib/timeit.py -s "def func(a, b, c=None, d=None, e=None): pass"
"for i in xrange(100): func(a=1, b=2); func(b=1, a=2, d=3, c=4, e=5)"

This benchmark shows a noticeable speedup. It's about 205usec with patch
and 225usec without patch.
msg63049 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-02-26 16:15
On Tue, Feb 26, 2008 at 12:51 AM, Christian Heimes
<report@bugs.python.org> wrote:
>
>  > * Verify that it's faster by compiling python in a release build
>
>  It's roughly the same speed. :/ The variation between timeit runs is
>  quite high on my machine. The new code is neither significant faster nor
>  noticeable slower.

Use your best judgment.  I haven't looked at the resulting code, only
the patch so it's hard for me to know for certain.  It appears the
patch simplifies the code, so even if there is no speed benefit, there
may be a maintainability benefit.  It's up to you.
msg63050 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-02-26 18:25
I've applied the patch in r61086.
History
Date User Action Args
2022-04-11 14:56:23adminsetgithub: 44787
2008-02-26 18:25:18christian.heimessetstatus: open -> closed
messages: + msg63050
2008-02-26 16:15:03nnorwitzsetmessages: + msg63049
2008-02-26 08:51:43christian.heimessetfiles: + trunk_getargs_speedup2.patch
messages: + msg63036
2008-02-26 05:39:23nnorwitzsetassignee: nnorwitz -> christian.heimes
2008-02-26 05:38:45nnorwitzsetresolution: accepted
messages: + msg63028
2008-02-25 14:42:40christian.heimessetfiles: + trunk_getargs_speedup.patch
messages: + msg62983
2008-02-25 04:38:26nnorwitzsetmessages: + msg62969
2008-01-06 11:57:33christian.heimessetassignee: nnorwitz
versions: + Python 2.6
messages: + msg59355
nosy: + christian.heimes
2007-03-30 06:45:52rupolecreate