Issue1442927
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Unsupported provider
Created on 2006-03-04 06:21 by alanmcintyre, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
longopt.zip | alanmcintyre, 2006-03-04 06:21 | Patch, performance test code & results | ||
longopt.tgz | alanmcintyre, 2006-03-06 22:33 | Patch, perf test code & results (version #2) | ||
longopt3.tar.bz2 | alanmcintyre, 2006-03-07 03:05 | Patch, performance test code & results (version 3) |
Messages (7) | |||
---|---|---|---|
msg49637 - (view) | Author: Alan McIntyre (alanmcintyre) * | Date: 2006-03-04 06:21 | |
The current implementation of PyLong_FromString in Python 2.5 uses muladd1 to add each digit of the input string into the final number. Because muladd1 creates a new long to hold the result on every call, an intermediate long object is created/destroyed for each digit in the input string. This patch improves on the current implementation of PyLong_FromString in 3 main ways: 1. Creates and manipulates (in-place) a single long object to hold the result, skipping the creation of all those intermediate long objects. 2. Multiple digits from the input string are consolidated into a single long digit before adding them into the long integer object. This greatly reduces the number of "multiply/add" cycles required to push all the digits into the long object. 3. Three chunks of code like "if (ch <= '9') k = ch - '0'" in longobject.c are replaced by a digit value lookup vector. I'm not irreversibly stuck on this idea; it doesn't measurably add to performance, but it just seems (to me, anyway) to make the code in long_from_binary_base and PyLong_FromString a little less cluttered. This is the same lookup table from patch 1335972 (an optimization for int()). I expect if both patches get accepted it would be best to make them both reference a single instance of this table; if it looks like that's what will happen I'll tweak one or both patches as necessary. My cheezy test results (included in the attached file in an OpenOffice spreadsheet) show that the patch makes long() about 50% faster than the existing implementation for decimal input strings of about 10 characters. Longer input strings show even better performance improvement, leveling off around 3x faster for very long strings. This patch passes regression tests on my machine (WinXP, Visual C++ .net Standard 2003). I plan to try out the tests on my Linux box this weekend just to make sure the performance boost still remains when Python gets compiled by a C compiler that isn't neutered (standard .net 2003 doesn't appear to allow any optimizations). The test and test data generation scripts I used for this performance comparison are included in the attached zip file. At the moment I don't have any added tests; if somebody can suggest some things that ought to be tested I'll gladly write some tests. |
|||
msg49638 - (view) | Author: Alan McIntyre (alanmcintyre) * | Date: 2006-03-06 22:33 | |
Logged In: YES user_id=1115903 Version #2 is attached. I made a couple of tweaks and tested the patch out on Linux just to make sure the performace is still as good with compiler optimizations. For short numbers (numbers that would fit into an int), long() is 10-30% *slower* than before applying the patch. For longer numbers, long() is up to 249% faster, with the peak occurring around 1000 digits. If the negative performance impact for int-sized digits is unacceptable, I will see if I can do something about it. However, one always has the option of using int() on very long strings anyway, and it will automatically fall through to PyLong_FromString if the number is too long. The performance impact on int() for small numbers is so small as to be negligible (<5%), which is to be expected since the modified code isn't called when using int() on input strings < 2**32. |
|||
msg49639 - (view) | Author: Alan McIntyre (alanmcintyre) * | Date: 2006-03-07 03:05 | |
Logged In: YES user_id=1115903 Version #3 is attached; it has an across-the-board improvement of ~10% over version 2. The performance hit for calling long() on 9-digit numbers is now only about -10%, breakeven happens somewhere around 11 digits, and the best performance is about +282% in the vicinity of 1000 digits. Sorry to keep commenting on my own patch. :) I think I'm done now. |
|||
msg49640 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2006-05-16 07:40 | |
Logged In: YES user_id=849994 Assigned to Tim. Perhaps something for Iceland? |
|||
msg49641 - (view) | Author: Tim Peters (tim.peters) * | Date: 2006-05-16 07:44 | |
Logged In: YES user_id=31435 Thanks for reminding me, Georg! This is a good possiblity for the Iceland sprint. |
|||
msg49642 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2006-05-16 07:46 | |
Logged In: YES user_id=849994 If someone ;) created a new tracker category, I'd go through the patches and flag all I can find for the sprint. |
|||
msg49643 - (view) | Author: Tim Peters (tim.peters) * | Date: 2006-05-24 21:45 | |
Logged In: YES user_id=31435 Thanks again, Alan! I did some major fiddling for portability and to try to eradicate the penalty for "short" input strings. It's still slower on 1-digit inputs, by about 4-5%, but faster than before with at least 2-digit inputs. The peak speedup remains at around 800-1000 decimal digits, but it's 6x faster there now. Much of that came from eliminating code :-) For example, there was no actual need for the memset(), and reducing the main loop to: for (; pz < pzstop; ++pz) { c += (twodigits)*pz * convmult; *pz = (digit)(c & MASK); c >>= SHIFT; } was a huge win under VC 7.1 (in the patch, it has a branch testing whether c is 0, and I didn't believe the comment that said the branch made it faster ;-)). Anyway, this is checked in now. The table of digit values is duplicated for the moment, and I hope to refactor that soon (given that ints and longs have become increasingly unified in Python, it's become increasingly confusing to have two string->int routines -- while they have to remain for backward compatibility, there's nothing to stop rewriting the core to use a new unified conversion function). |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:15 | admin | set | github: 42977 |
2006-03-04 06:21:13 | alanmcintyre | create |