Message73694
Just to be clear: this patch halves the number of shifts and masks,
asymptotically; it doesn't affect the number of adds and multiplies
(except for saving a few additions to zero by setting the initial carry
intelligently). Is that correct? It's quite surprising that the bit
operations have such a large effect on the running time.
Perhaps this is an argument for considering changing PyLong_SHIFT to 16
(or better, 32, since surely almost every compiler provides uint64_t or
equivalent these days)? Though that would be quite an involved task.
While I believe the idea is sound, the patch isn't quite correct---it's
got some assertions that won't always hold. For example, with the
patch, in a debug build of Python, I get:
>>> a = b = 2**30-1
[36413 refs]
>>> a*b
Assertion failed: (carry <= PyLong_MASK), function x_mul, file
Objects/longobject.c, line 2228.
Abort trap
I'm fairly sure that the assert(carry <= PyLong_MASK) doesn't matter,
and that the assertion at the end of the main outer loop (assert(carry
>> PyLong_SHIFT) == 0)) should still hold. But some more comments and
analysis in the code would be good. For example, if carry >=
PyLong_MASK the the comment that 2*MASK + 2*MASK*MASK is contained in a
twodigit isn't so useful. How big can carry get? |
|
Date |
User |
Action |
Args |
2008-09-24 09:19:21 | mark.dickinson | set | recipients:
+ mark.dickinson, pernici |
2008-09-24 09:19:20 | mark.dickinson | set | messageid: <1222247960.75.0.83074528772.issue3944@psf.upfronthosting.co.za> |
2008-09-24 09:19:19 | mark.dickinson | link | issue3944 messages |
2008-09-24 09:19:18 | mark.dickinson | create | |
|