A few weeks ago, I blogged about taking advantage of Karatsuba
multiplication and Newton's method to divide big integers quickly (some
of you may have read it, as it was posted to Daily Python-URL among
other places):
http://fredrik-j.blogspot.com/2008/07/making-division-in-python-faster.html
To summarize, this method blows the builtin division out of the water
already at ~(2000 digits)/(1000 digits).
The primary catch is that the result of the Newton division may be
slightly wrong (typically 1 ulp). However, a single extra multiplication
and a subtraction at the end allows one to compute a remainder, and
since the remainder must satisfy 0 <= r < q, the error is easily
corrected. From a quick test, the cost of the extra multiplication seems
to move the break-even point with the builtin division up to around
5000/2500 digits.
A pure Python implementation of divmod, with error correction based on
the remainder, can be found in this file:
http://www.dd.chalmers.se/~frejohl/code/libarith/libarith.py
(See the function idivmod)
Of particular note is that fast divmod gives a fast way to do radix
conversion, by recursively splitting the number in half. The function
numeral (see same .py file) does this, e.g:
>>> from time import clock
>>> a = 2**1257787-1
>>> t1=clock(); s1=str(a); t2=clock(); t2-t1
109.08065159119386
>>> t1=clock(); s2=numeral(a); t2=clock(); t2-t1
7.1394886780303182
>>> s1 == s2
True
(This recursive algorithm, by the way, is actually faster than str()
even with the slow builtin divmod.)
Would there be any interest in porting these algorithms to C and using
them in the standard Python long implementation?
There are likely some problems that I have overlooked. A critical review
will be most welcome. |