-
-
Notifications
You must be signed in to change notification settings - Fork 29.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integer & Long types: Performance improvement of 1.6x to 2x for base 10 conversions #50962
Comments
Converting integer & long types to their ASCII representation is a task I have written a special case for base 10 conversions which allows for
My tests show an improvement of 1.6x to 1.8x improvement for integer Note that because integers are displayed using fprintf(), the Patch is provided against trunk. It is somewhat difficult to read the |
I wrote a dummy script to generate a big number (2568 decimal digits, 8530 Python 2.7a0, Pentium4 @ 3.0 GHz (32 bits), long base=2^15 Smallest value of 5 runs: original = 5046.8 ms
patched = 2032.4 ms For huge numbers, the patch is much (60%) faster. -- Small integer (type=int) : n=factorial(10) (22 bits, 7 decimal digits) with original = 861.7 ms
patched = 639.2 ms It's also faster (26%). -- And with n=1 (1 bit, 1 decimal digit), type=int : original = 606.7
patched = 561.6 It's a little bit faster (7%) with the patch. I don't see any performance regression, only good improvements: 60% faster |
By benchmark should be reproduced on a 64 bits CPU with 2^15 and 2^30 |
Would it be possible to share some constant data between intobject.c and |
Yes I agree it would be a good idea to have one definition and one Where do you suggest these tables could be put? I'll be happy to |
I'm a bit ambivalent about this patch: I'd like to see string to integer If you're working with huge integers and care about speed, then you should Gawain, do you have a particular use-case in mind for this optimization? |
On the other hand, _PyLong_Format currently contains general machinery I'd be interested to know how much the conversion of two digits at one |
I disagree with you, mark. The patch is around 20 lines and does optimize all |
I ran this patch against Unladen Swallow's slowspitfire template slowspitfire: (./perf.py -r -b slowspitfire ../a/python.exe ../b/python.exe) Other benchmarks benefit, but are only barely statistically significant. |
Thanks for those results, Collin.
On OS X 10.6 (64-bit Python non-debug non-framework build with gcc 4.2 original = 1023.9 ms (best of 10 runs)
patched = 1005.3 ms (best of 10 runs).
|
Sorry: ignore that last message; I was talking through my hat. I Ahem. After a 'make distclean', I get the following results (same specs original: 783.8 ms patch 2 (attached) is Gawain's original patch, but with the base != |
One more iteration of the patch is attached: I rewrote the conversion (First three values below are the same as before.) OS X 10.6, 64-bit build of Python, 30-bit digits: original: 783.8 ms For OS X 10.5, 32-bit build of Python with 15-bit digits, on the same platform as above, original: 2045.1 ms |
I don't understand the following comment in patch3: I think that pin base is 2^30 / 2^15 and pout base is 10^9 / 10 ^ 4, not 10. |
Barring objections, I plan to apply the 'long_decimal_conversion_py3k' This would leave the non-binary-base code in _PyLong_Format unused. I'll |
Mark, As for the basic integers, I'm working on another patch which improves |
Updated patch, with minor changes:
Not that it matters much, but it's curious that on my machine (gcc-4.2, OS
in the middle of the inner loop, rather than the line:
I'm wondering whether this is just a quirk of my OS/compiler combination, |
Applied long_decimal_conversion_py3k_2.patch in r74851; backported to Still to do:
While we're at it, it would also be good to look at the decimal string -> |
The now unused arbitrary base conversion was removed in r74910 (py3k). |
Here's a patch for trunk's Objects/intobject.c. With this patch, I'm I'm leaving out the two-digits-at-a-time optimization. It *does* give a I'll apply this in a few days' time. |
Here's a modified version of the patch to Objects/intobject.c which Compared to the int_decimal_conversion_trunk.patch, my tests show a I admit, there is a slight increase in code complexity. However, I I hope I don't come across as unappreciative, on the contrary the I do think it would be unfortunately to not go a little further though - |
Gawain, Programmer cycles matter too. :-) Code clarity, especially in the Python For example, Tim Peters has been heard to say that if he did this all over http://mail.python.org/pipermail/python-dev/2008-November/083355.html It's not easy deciding where to draw the line, but for me, this particular |
Committed int_decimal_conversion_trunk.patch in r75084. Thanks, Gawain. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: