New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Private _nth_root function loses accuracy #71948
Comments
First reported by Martin Panter here: http://bugs.python.org/issue27181#msg272488 I'm afraid I don't know enough about PowerPC to suggest a fix other than weakening the test on that platform. |
Tests fail on a Power PC buildbot: http://buildbot.python.org/all/builders/PPC64LE%20Fedora%203.x/builds/1476/steps/test/logs/stdio Traceback (most recent call last):
File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64le/build/Lib/test/test_statistics.py", line 1216, in testExactPowers
self.assertEqual(self.nroot(x, n), i)
AssertionError: 29.000000000000004 != 29 ====================================================================== Traceback (most recent call last):
File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64le/build/Lib/test/test_statistics.py", line 1228, in testExactPowersNegatives
self.assertEqual(self.nroot(x, n), i)
AssertionError: -29.000000000000004 != -29 |
You can do no better than to consult with Uncle Timmy. |
The problem is more widespread than just Power PC. The same failures (+/-29) are also seen on the three s390x buildbots. There are multiple failures of testExactPowers(Negatives) on The first is An slightly different failure happens on AMD64 Free BSD and Open Indiana buildbots: Traceback (most recent call last):
File "/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Lib/test/test_statistics.py", line 1247, in testFraction
self.assertEqual(self.nroot(x**12, 12), float(x))
AssertionError: 1.1866666666666665 != 1.1866666666666668 |
Steven: can you explain why you think your code *should* be giving exact results for exact powers? Do you have an error analysis that says that should be the case? One issue here is that libm pow functions vary hugely in quality, so any algorithm that depends on ** or math.pow is going to be hard to prove anything about. I think the tests should simply be weakened: it's unreasonable to expect perfect results in this case. |
On Sun, Aug 14, 2016 at 08:29:39AM +0000, Mark Dickinson wrote:
No error analysis, only experimental results.
pow() is just the starting point. An earlier version of the function Because I'm not to worried about being super-fast, the nth root function
So I'm confident that nth_root() should never be worse than pow().
Okay. I'll weaken the tests. |
Here's the problem, though: you're using the libm pow again in this test, so the result of this being True doesn't give tight information about how close y is to the nth root of x (and the test result may well differ from platform to platform). |
Same deal here: those aren't the actual errors; they're approximations to the errors, since the computations of the epsilons depends on (a) the usual floating-point rounding, and more significantly (b) the accuracy of the |
On Sun, Aug 14, 2016 at 10:52:42AM +0000, Mark Dickinson wrote:
Let's call the two calculated roots y and w, where y is generated by You're saying that it might turn out that abs(y**n - x) < abs(w**n - x), I can see how that could happen, but not what to do about it. I'm I think this would be a problem if I wanted to make nth_root() a public What else can I do? Since I'm only dealing with integer powers, should I def ipow(y, n):
x = 1.0
while n > 0:
if n % 2 == 1:
x *= y
n //= 2
y *= y
return x |
I'd expect that a square-and-multiply-style pow would be less accurate than math.pow, in general, simply because of the number of floating-point operations involved. But I don't think there's a real problem here so long as you don't have an expectation of getting super-accurate (e.g., correctly rounded or faithfully rounded) results; testing that results are accurate to within 10 ulps or so is probably good enough. If you want an actual correctly-rounded nth root just for testing purposes, it's certainly possible to construct one with integer arithmetic, but an easier alternative might simply be to use MPFR's "mpfr_root" function (as wrapped by gmpy2) to generate test values. |
On Sun, Aug 14, 2016 at 12:05:37PM +0000, Mark Dickinson wrote:
Thanks for the advice Mark. This has certainly showed my naivety when it comes to floating point :-( |
Maybe one day. (Though I'm not holding my breath.) IEEE 754-2008 does indeed *recommend* (but not actually *require*) correctly rounded implementations of all the usual transcendental and power functions (including nth root). And if everyone followed those recommendations, then yes, the correct rounding would imply that those cross-platform differences would be a thing of the past. I don't see that happening any time soon, though: writing an implementation of log (for example) that's provably correctly rounded for all possible inputs is much harder than writing an implementation that's simply "almost always" correctly rounded (e.g., provably accurate to 0.53 ulps instead of 0.5 ulps), and the latter is going to be good enough in the vast majority of contexts. Going from "almost always correctly rounded" to "correctly rounded" isn't going to have much effect on the overall accuracy of a multistep numerical algorithm, so all you're buying (apart from a likely degraded running time) is the cross-platform reproducibility, and it's not really clear how useful cross-platform reproducibility is as a goal in its own right. Overall, it doesn't seem like a good tradeoff. The pow function is especially hard to make correctly rounded, not least because of its two inputs. At least for single-input transcendental functions there's some (perhaps remote) hope of doing exhaustive testing on the 18 million million million possible double-precision inputs; for a function taking two inputs that's completely infeasible. You can take a look at CRlibm [1] for efforts in the direction of a correctly-rounded libm; AFAIK it hasn't had wide adoption, and its pow function is still work in progress. |
A meta-note: one iteration of Newton's method generally, roughly speaking, doubles the number of "good bits" in the initial approximation. For floating n'th root, it would an astonishingly bad libm pow() that didn't get more than half the leading bits in pow(x, 1/n) right. So a single Newton iteration, if carried out in infinite precision, should be enough to get all the bits "right" (meaning not significantly worse than 0.5 ulp error when converted back to a float). So if you find yourself doing more than one Newton iteration, you're just fighting floating-point noise. It's an illusion - nothing is actually getting better, except perhaps by accident. Which suggests one approach for doubles (in C - same as a Python float): get the pow() approximation. Feed it to a I believe that's the best you can possibly do without doing real work ;-) |
Just for fun, here's a recipe for a correctly-rounded nth root operation for positive finite floats. I'm not suggesting using this in the business logic: it's likely way too slow (especially for large n), but it may have a use in the tests. The logic for the Newton iteration in floor_nroot follows that outlined in this Stack Overflow answer: http://stackoverflow.com/a/35276426 from math import frexp, ldexp
def floor_nroot(x, n):
""" For positive integers x, n, return the floor of the nth root of x. """
# Initial guess: here we use the smallest power of 2 that exceeds the nth
# root. But any value greater than or equal to the target result will do.
a = 1 << -(-x.bit_length() // n)
while True:
d = x // a**(n-1)
if a <= d:
return a
a = ((n-1) * a + d) // n
def nroot(x, n):
"""
Correctly-rounded nth root (n >= 2) of x, for a finite positive float x.
"""
if not (x > 0 and n >= 2):
raise ValueError("x should be positive; n should be at least 2")
m, e = frexp(x)
rootm = floor_nroot(int(m * 2**53) << (53*n + (e-1)%n - 52), n)
return ldexp(rootm + rootm % 2, (e-1)//n - 53) |
... assuming IEEE 754 binary64 format, of course. |
Since test_NTh_Root is failing on so many different buildbot platforms and in different ways, I'm marking this as a "release blocker". It would be very helpful to get this resolved prior to 3.6.0b1 next month. Suggest checking the test results outputs of the various 3x build slaves to ensure all the cases have been addressed. |
Thanks, Mark! I had worked out the Just FYI, this was my "cheap" suggestion: from fractions import Fraction
def rootn(x, n):
g = Fraction(x**(1.0/n))
g = ((n-1)*g + Fraction(x)/g**(n-1)) / n
return float(g) For fun, I ran that and your
and They always delivered the same result, which differed from But reducing the Moral of the story: none I can see ;-) |
Noting that def floor_nroot(x, n):
"""For positive integers x, n, return the floor of the nth root of x."""
bl = x.bit_length()
if bl <= 1000:
a = int(float(x) ** (1.0/n))
else:
xhi = x >> (bl - 53)
# x ~= xhi * 2**(bl-53)
# x**(1/n) ~= xhi**(1/n) * 2**((bl-53)/n)
# Let(bl-53)/n = w+f where w is the integer part.
# 2**(w+f) is then 2**w * 2**f, where 2**w is a shift.
a = xhi ** (1.0 / n)
t = (bl - 53) / n
w = int(t)
a *= 2.0 ** (t - w)
m, e = frexp(a)
a = int(m * 2**53)
e += w - 53
if e >= 0:
a <<= e
else:
a >>= -e
# A guess of 1 can be horribly slow, since then the next
# guess is approximately x/n. So force the first guess to
# be at least 2. If that's too large, fine, it will be
# cut down to 1 right away.
a = max(a, 2)
a = ((n-1)*a + x // a**(n-1)) // n
while True:
d = x // a**(n-1)
if a <= d:
return a
a = ((n-1) * a + d) // n I haven't yet found a case in the context of `nroot` where it doesn't get out on the first `if a <= d:` test. Of course you can provoke as many iterations as you like by passing `x` with a lot more than 53 "significant" bits (the large `x` passed by `nroot` are mostly long strings of trailing 0 bits). |
"Just for fun, here's a recipe for a correctly-rounded nth root operation for positive finite floats. I'm not suggesting using this in the business logic: it's likely way too slow (especially for large n), but it may have a use in the tests." I don't know well the statistics module, but it looks like it doesn't use directly floats, more a somehow higher level type of numbers to try to reduce rounding errors. For me, the math module is a thin wrapper on C library math functions, except of a few functions specific to Python like math.factorial. But the statistics module is at a higher level. Maybe we should draw a line between accuracy and speed. For example, explain in the statistics module that the module is designed for accuracy? Sorry if I completly misunderstood the design of the statistics module :-) |
About tradeoff, would it be possible to add an option to choose the quality of the accuracy? For example, a flag to choose between "fast nth root" or "accurate nth root". Python already has two kind of numbers: Decimal and float. Maybe the "flag" should be the type of input numbers, Decimal or float? I'm asking because I looked at http://lipforge.ens-lyon.fr/www/crlibm/ a few years ago, and such library is designed for accuracy, not for speed. crlibm is based on the scslib library which compose a number using multiple float numbers to get a better precision. To get correct rounding, crlibm requires more loop iterations and more floating point number operations, so as expected, it is slower. FYI my old blog article (in french!): http://haypo.alwaysdata.net/blog/index.php/2009/02/20/188-bibliotheque-mathematique-arrondi-exact-crlibm |
Just to share my little experience with rounding numbers. Last years, I worked on a API to convert timestamps between float and integers, the private "PyTime C API": https://vstinner.github.io/pytime.html At the beginning, I used various floatting point numbers which looks fine in decimal. But quickly, I got rounding issues on various buildbots. After many years fighting against compilers and trying to write a complete test suite, I decided to only use numbers which can be stored exactly in IEEE 754: if use_float:
# numbers with an exact representation in IEEE 754 (base 2)
for pow2 in (3, 7, 10, 15):
ns = 2.0 ** (-pow2)
ns_timestamps.extend((-ns, ns)) If you are curious, look at CPyTimeTestCase, TestCPyTime and TestOldPyTime classes in Lib/test/test_time.py. At the end, I decided to reimplement each conversion function in pure Python using decimal.Decimal and compare the result with the C implementation. It makes the unit tests shorter and simpler. |
Victor: by "way too slow", I really *do* mean way too slow. :-) floor_nroot does arithmetic with integers of bit-length approximately 54*n. For small n, that's fine, but if someone tried to take the geometric mean of a list of 50000 values (which it seems to me should be a perfectly reasonable use-case), floor_nroot would then be trying to do computations with multi-million-bit integers. The |
Oh, I see. But maybe the Decimal is fast enough for such giant numbers, since we |
I don't care about correct rounding here, but it is, e.g., a bit embarrassing that >>> 64**(1/3)
3.9999999999999996 Which you may or may not see on your box, depending on your platform pow(), but which you "should" see: 1/3 is not a third, it's a binary approximation that's a little less than 1/3, so 64 to that power should be less than 4. There are two practical problems with My simple fractions.Fraction function suffers the second problem too, but is even slower for large n. But if you give up on guaranteed correct rounding, this is really quite easy: as I noted before, a single Newton iteration approximately doubles the number of "good bits", so _any_ way of getting the effect of extra precision in a single iteration will deliver a result good enough for all practical purposes. Note that an error bound of strictly less than 1 ulp is good enough to guarantee that the exact result is delivered whenever the exact result is representable. Here's one way, exploiting that the decimal module has adjustable precision: import decimal
def rootn(x, n):
c = decimal.getcontext()
oldprec = c.prec
try:
c.prec = 40
g = decimal.Decimal(x**(1.0/n))
g = ((n-1)*g + decimal.Decimal(x)/g**(n-1)) / n
return float(g)
finally:
c.prec = oldprec Now memory use remains trivial regardless of n. I haven't yet found a case where the result differs from the correctly-rounded For n less than about 70, though, Note: just in case there's confusion about this, That said, I have seen rare cases where this (and |
What if use pow() with exactly represented degree in approximating step? def rootn(x, n):
g = x**(1.0/n)
m = 1 << (n-1).bit_length()
if n != m:
g = (x*g**(m-n))**(1.0/m)
return g Maybe it needs several iterations, because it converges slower than Newton approximation. But every step can be faster. |
Serhiy, I don't know what you're thinking there, and the code doesn't make much sense to me. For example, consider n=2. Then m == n, so you accept the initial >>> serhiy(7.073208563506701e+46, 2)
2.6595504438733062e+23
>>> pow(7.073208563506701e+46, 0.5) # exactly the same as your code
2.6595504438733062e+23
>>> nroot(7.073208563506701e+46, 2) # best possible result
2.6595504438733066e+23
>>> import math
>>> math.sqrt(7.073208563506701e+46) # also achieved by sqrt()
2.6595504438733066e+23 On general principle, you can't expect to do better than plain pow() unless you do _something_ that gets the effect of using more than 53 mantissa bits - unless the platform pow() is truly horrible. Using pow() multiple times is useless; doing Newton steps _in_ native float (C double) precision is useless; even some form of "binary search" is useless because just computing g**n (in native precision) suffers rounding errors worse than pow() suffered to begin with. So long as you stick to native precision, you're fighting rounding errors at least as bad as the initial rounding errors you're trying to correct. There's no a priori reason to even hope iterating will converge. |
Victor, happy to add comments, but only if there's sufficient interest in actually using this. In the context of this issue report, it's really only important that Mark understands it, and he already does ;-) For example, it starts with float |
This has been really eye-opening, and I just wanted to drop you a note I haven't checked it in yet, but I've already managed to simplify the Can somebody comment on my reasoning here? I start by taking an initial guess for the root by using Or should I just always run one iteration of Newton and trust that it |
Wouldn't following implementation be faster? import decimal
c = decimal.DefaultContext.copy()
c.prec = 25
def rootn(x, n,
D=decimal.Decimal,
sub=c.subtract,
mul=c.multiply,
log=c.ln):
g = x ** (1.0/n)
g += float(sub(log(D(x)), mul(log(D(g)), D(n)))) * g / n
return g
del decimal, c |
That's clever, Serhiy! Where did it come from? It's not Newton's method, but it also appears to enjoy quadratic convergence. As to speed, why are you asking? You should be able to time it, yes? On my box, it's about 6 times slower than the last code I posted. I assume (but don't know) that's because + - * / have trivial cost compared to power() and ln(), and your code uses two ln() while the last code I posted uses only one power(). Also, my power() has an exact integer exponent, which - for whatever reasons - appears to run much faster than using a non-integer Decimal exponent. It's for the latter reason that I didn't suggest just doing Or maybe it's "a platform thing". FYI, I'm using the released Python 3.5.2 on 64-bit Win 10 Pro. |
Steven, you certainly can ;-) check first whether As to whether a Newton step won't make things worse, that requires seeing the exact code you're using. There are many mathematically equivalent ways to code "a Newton step" that have different numeric behavior. If for some unfathomable (to me) reason you're determined to stick to native precision, then - generally speaking - the numerically best way to do "a correction step" is to code it in the form:
Coding a Newton step here as, e.g., When |
Tim Peters: "Victor, happy to add comments, but only if there's It looks like tests fail on some platforms. I guess that it depends on I like the idea of providing an accurate and *portable* function If someone considers that the performance of What do you think? |
As I said, the last code I posted is "fast enough" - I can't imagine a real application can't live with being able to do "only" tens of thousands of roots per second. A geometric mean is typically an output summary statistic, not a transformation applied billions of times to input data. To get a 100% guarantee of 100% portability (although still confined to IEEE 754 format boxes), we can't use the libm pow() at all. Then you can kiss speed goodbye. Mark's Here's another way, but routinely 100 times slower (than the last code I posted): import decimal
c = decimal.DefaultContext.copy()
c.prec = 25
def slow(x, n,
D=decimal.Decimal,
pow=c.power,
div=c.divide,
d1=decimal.Decimal(1)):
return float(pow(D(x), div(d1, n)))
Too slow for my tastes, but at least it's obvious ;-) |
Let's spell one of these out, to better understand why sticking to native precision is inadequate. Here's one way to write the Newton step in "guess + relatively_small_correction" form: def plain(x, n):
g = x**(1.0/n)
return g - (g - x/g**(n-1))/n Alas, it's pretty much useless. That's because when g is a really good guess, Example:
and we want the square root. Mark's >>> nroot(x, 2)
2.298568080262861e-07 In this case, so does sqrt(), and also x**0.5 (on my box - pow() may not on yours): >>> sqrt(x)
2.298568080262861e-07
>>> x**0.5
2.298568080262861e-07 You're going to think it needs "improvement", because the square of that is not x: >>> sqrt(x)**2 < x
True Let's see what happens in the
>>> g.hex()
'0x1.ed9d1dd7ce57fp-23'
>>> temp.hex()
'0x1.ed9d1dd7ce580p-23' They differ by just 1 in the very last bit. There's nothing wrong with "the math", but _in native precision_ the subtraction leaves only 1 bit of information. Then: >>> correction = (g - temp)/2
>>> correction
-1.3234889800848443e-23 is indeed small compared to g, but it only has one "real" bit: >>> correction.hex()
'-0x1.0000000000000p-76' Finally, >>> g - correction
2.2985680802628612e-07 is _worse_ than the guess we started with. Not because of "the math", but because sticking to native precision leaves us with an extremely crude approximation to the truth. This can be repaired in a straightforward way by computing the crucial subtraction with greater than native precision. For example, import decimal
c = decimal.DefaultContext.copy()
c.prec = 25
def blarg(x, n,
D=decimal.Decimal,
pow=c.power,
sub=c.subtract,
div=c.divide):
g = x**(1.0/n)
Dg = D(g)
return g - float(sub(Dg, div(D(x), pow(Dg, n-1)))) / n
Then the difference is computed as Decimal('-2.324439989147273024835272E-23') and the correction (convert that to float and divide by 2) as: -1.1622199945736365e-23 The magnitude of that is smaller than of the -1.3234889800848443e-23 we got in native precision, and adding the new correction to g makes no difference: the correction is (correctly!) viewed as being too small (compared with g) to matter. So, bottom line: there is no known way of writing the Newton step that won't make things worse at times, so long as it sticks to native precision. Newton iterations in native precision are wonderful when, e.g., you know you have about 20 good bits and want to get about 40 good bits quickly. The last bit or two remain pure noise, unless you can write the correction computation in a way that retains "a bunch" of significant bits. By far the easiest way to do the latter is simply to use more precision. |
Attched file "roots.py" you can run to get a guess as to how bad pow(x, 1/n) typically is on your box. Note that it's usually "pretty darned good" the larger n = 2272
x**(1/n)
-1 10
0 982
1 8
with 1 native-precision step
all correct
with 1 extended-precision step
all correct That means that, out of 1000 "random-ish" x, x**(1/2272) returned a result 1 ulp smaller than best-possible 10 times, 1 ulp too large 8 times, and best-possible 982 times ("ulp" is relative to the best-possible result). Doing any form of a single "guess + small_correction" Newton step (in either native or extended precision) repaired all the errors. Things look very different for small n. Like: n = 2
x**(1/n)
-1 1
0 997
1 2
with 1 native-precision step
-1 117
0 772
1 111
with 1 extended-precision step
all correct 1/2 is exactly representable, so errors are few. But doing a native-precsion "correction" made results significantly worse. n=3 is more interesting, because 1/3 is not exactly representable:
n = 3
x**(1/n)
-55 1
-54 2
-53 2
-52 1
-51 3
-50 3
-49 2
-48 3
-47 4
-46 2
-45 6
-44 5
-43 7
-42 5
-41 6
-40 4
-39 7
-38 7
-37 8
-36 7
-35 5
-34 8
-33 8
-32 14
-31 8
-30 9
-29 15
-28 13
-27 21
-26 14
-25 7
-24 14
-23 15
-22 7
-21 18
-20 14
-19 12
-18 17
-17 8
-16 13
-15 15
-14 9
-13 10
-12 14
-11 11
-10 7
-9 10
-8 14
-7 11
-6 7
-5 11
-4 12
-3 12
-2 12
-1 8
0 12
1 7
2 11
3 11
4 12
5 5
6 9
7 14
8 12
9 8
10 14
11 15
12 13
13 12
14 9
15 15
16 17
17 9
18 10
19 17
20 15
21 9
22 9
23 6
24 11
25 20
26 24
27 21
28 16
29 11
30 12
31 3
32 11
33 12
34 11
35 11
36 2
37 8
38 8
39 9
40 3
41 5
42 4
43 7
44 4
45 7
46 5
47 3
48 4
50 2
53 3
54 3
56 1
with 1 native-precision step
-1 42
0 917
1 41
with 1 extended-precision step
all correct There a single native-precision step helped a lot overall. But for n=4 (where 1/n is again exactly representable), it again hurts: n = 4
x**(1/n)
0 999
1 1
with 1 native-precision step
-1 50
0 894
1 56
with 1 extended-precision step
all correct And at n=5 it helps again; etc. Of course my story hasn't changed ;-) That is, use the single-step extended-precision code. It's "almost always" correctly rounded (I still haven't seen a case where it isn't, and the test code here assert-fails if it finds one), and is plenty fast. The reason this code gets very visibly slower as |
BTW, add this other way of writing a native-precision Newton step to see that it's much worse (numerically) than writing it in the "guess + small_correction" form used in roots.py. Mathematically they're identical, but numerically they behave differently: def native2(x, n):
g = x**(1.0/n)
if g**n == x:
return g
return ((n-1)*g + x/g**(n-1)) / n |
As discussed with Ned via email, this issue shouldn't affect the public geometric_function and the failing tests (currently skipped) are too strict. The existing tests demand an unrealistic precision which should be loosened, e.g. from assertEqual to assertAlmostEqual. Ned wrote: "If you are only planning to make changes to the tests themselves, I think that can wait for b2." I am currently unable to build 3.6, and I won't have time to resolve that before b1. |
I think this whole nth root discussion has become way more complicated than it needs to be, and there's a simple and obvious solution. Two observations:
If we do this, this also fixes issue bpo-28111, which is caused by the current algorithm getting into difficulties when computing the nth root of the 2**e part of x*2**e. Here's a direct solution using the Decimal module. If my back-of-the-envelope forward error analysis is correct, the result is always within 0.502 ulps of the correct result. That means it's *usually* going to be correctly rounded, and *always* going to be faithfully rounded. If 0.502 ulps isn't good enough for you, increase the context precision from 20 to 30, then we'll always be within 0.5000000000002 ulps instead. The code deals with the core problem where x is finite and positive. Extending to negative values, zeros, nans and infinities is left as an exercise for the reader. import decimal
import math
ROOTN_CONTEXT = decimal.Context(prec=20)
LOG2 = ROOTN_CONTEXT.ln(2)
def rootn(x, e, n):
"""
Accurate value for (x*2**e)**(1/n).
|
Here's the error analysis, for the record. It's crude, but effective. Assume
Finally, we want to express the error bound above in terms of ulps of the final_error <= 0.5 + 2 * 10**(17-p) ulps. For p = 20, this is 0.502 ulps. |
Mark, the code I showed in roots.py is somewhat more accurate and highly significantly faster than the code you just posted. It's not complicated at all: it just uses Decimal to do a single Newton correction with extended precision. Since it doesn't use the Decimal exp() or ln(), it's faster. It does use the Decimal pow(), but with an integer exponent, so this specific use of pow() doesn't invoke the Decimal exp() or ln() either. And it's still the case that I haven't found a case where its result isn't correctly rounded. My testing framework found multiple not-correctly-rounded cases in your new code within seconds. Presumably you could boost the precision to improve that, but then it would get even slower. |
Okay, fair enough. In that case, we still need a solution for computing rootn(x * 2**e) in the case where x*2**e itself overflows / underflows an IEEE 754 float. |
Decimal pow doesn't special-case integer exponents; the solution will still be based on exp and ln. |
Ah, sorry; I'm wrong. That was true for the Python version of the decimal module, not for the C implementation. |
Here's one: :-) >>> rootn(1 + 2**-52, 2)
1.0000000000000002 The correctly rounded result would be 1.0. |
Mark, thanks for the counterexample! I think I can fairly accuse you of thinking ;-) I expect the same approach would be zippy for scaling x by 2**e, provided that the scaled value doesn't exceed the dynamic range of the decimal context. Like so: def erootn(x, e, n,
D=decimal.Decimal,
D2=decimal.Decimal(2),
pow=c.power,
sub=c.subtract,
div=c.divide):
g = x**(1.0/n) * 2.0**(e/n) # force e to float for Python 2?
Dg = D(g)
return g - float(sub(Dg, div(D(x)*D2**e, pow(Dg, n-1)))) / n The multiple errors in the native-float-precision starting guess shouldn't hurt. In this case D(x)*D2**e may not exactly equal the scaled input either, but the error(s) are "way at the end" of the extended-precision decimal format. I don't know the range of |
Oops! The |
Let me give complete code for the last idea, also forcing the scaling multiplication to use the correct context: import decimal
c = decimal.DefaultContext.copy()
c.prec = 25
c.Emax = decimal.MAX_EMAX
c.Emin = decimal.MIN_EMIN
def erootn(x, e, n,
D=decimal.Decimal,
D2=decimal.Decimal(2),
pow=c.power,
sub=c.subtract,
mul=c.multiply,
div=c.divide):
g = x**(1.0/n) * 2.0**(e/n)
Dg = D(g)
return g - float(sub(Dg, div(mul(D(x), pow(D2, e)),
pow(Dg, n-1)))) / n
Both pow() instances use an integer exponent, so the C implementation of decimal still avoids expensive exp() and ln() computations - it's all just basic - * / under the covers. On a 32-bit box, the scaled input (x*2**e) must be <= 9.999...999E425000000, which is about 2**1411819443 (an exponent of about 1.4 billion). On the close-to-0 side, approximately the reciprocals of those giant numbers. The range is much wider on a 64-bit box (because Emax and Emin have much larger absolute values). Then, e.g., >>> erootn(1, 1411819443, 1411819443)
2.0
>>> erootn(1, -1411819443, 1411819443)
0.5 So it's obviously perfect ;-) Except that, because it reduces to the previous algorithm when e=0, Mark's counterexample (to correct rounding) still holds: >>> erootn(1 + 2**-52, 0, 2)
1.0000000000000002 Boosting precision to 28 (from 25) is enough to repair that, but it doesn't bother me (there was never a claim that it was always correctly rounded - but at prec=25 it went through hours & hours of testing randomish inputs and Basic error analysis for Newton's method usually goes like this: suppose the true root is t*(1 + (n-1)/2 * e**2) That the relative error goes from Good enough for me ;-) |
Let me clarify something about the extended algorithm: the starting guess is no longer the most significant source of error. It's the The rough error analysis is straightforward to extend, but I used wxMaxima to do the Taylor expansions and to ensure I didn't drop any terms. So suppose that instead of exactly x2**e we get x2e*(1+r1) for some small |r1| (on the order of 10-25), and instead of the exact n'th root 1 + r1/n - (n-1)/n * r1*r2 + (n-1)/2 * r2**2 The largest term (besides 1) by far is now |
Me too. |
I can't find |
Yep, |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: