Attched file "roots.py" you can run to get a guess as to how bad pow(x, 1/n) typically is on your box.
Note that it's usually "pretty darned good" the larger `n` is. There's a reason for that. For example, when n=1000, all x satisfying 1 <= x < 2**1000 have a root r satisfying 1 <= r < 2. Mathematically, that's a 1-to-1 function, but in floating point there are only 2**52 representable r in the lstter range: the larger n, the more of a many-to-1 function the n'th root is in native floating point precision. That makes it much easier to get the best-possible result even by accident ;-) So, e.g., this kind of output is common for "large" n:
n = 2272
x**(1/n)
-1 10
0 982
1 8
with 1 native-precision step
all correct
with 1 extended-precision step
all correct
That means that, out of 1000 "random-ish" x, x**(1/2272) returned a result 1 ulp smaller than best-possible 10 times, 1 ulp too large 8 times, and best-possible 982 times ("ulp" is relative to the best-possible result). Doing any form of a single "guess + small_correction" Newton step (in either native or extended precision) repaired all the errors.
Things look very different for small n. Like:
n = 2
x**(1/n)
-1 1
0 997
1 2
with 1 native-precision step
-1 117
0 772
1 111
with 1 extended-precision step
all correct
1/2 is exactly representable, so errors are few. But doing a native-precsion "correction" made results significantly worse.
n=3 is more interesting, because 1/3 is not exactly representable:
n = 3
x**(1/n)
-55 1
-54 2
-53 2
-52 1
-51 3
-50 3
-49 2
-48 3
-47 4
-46 2
-45 6
-44 5
-43 7
-42 5
-41 6
-40 4
-39 7
-38 7
-37 8
-36 7
-35 5
-34 8
-33 8
-32 14
-31 8
-30 9
-29 15
-28 13
-27 21
-26 14
-25 7
-24 14
-23 15
-22 7
-21 18
-20 14
-19 12
-18 17
-17 8
-16 13
-15 15
-14 9
-13 10
-12 14
-11 11
-10 7
-9 10
-8 14
-7 11
-6 7
-5 11
-4 12
-3 12
-2 12
-1 8
0 12
1 7
2 11
3 11
4 12
5 5
6 9
7 14
8 12
9 8
10 14
11 15
12 13
13 12
14 9
15 15
16 17
17 9
18 10
19 17
20 15
21 9
22 9
23 6
24 11
25 20
26 24
27 21
28 16
29 11
30 12
31 3
32 11
33 12
34 11
35 11
36 2
37 8
38 8
39 9
40 3
41 5
42 4
43 7
44 4
45 7
46 5
47 3
48 4
50 2
53 3
54 3
56 1
with 1 native-precision step
-1 42
0 917
1 41
with 1 extended-precision step
all correct
There a single native-precision step helped a lot overall. But for n=4 (where 1/n is again exactly representable), it again hurts:
n = 4
x**(1/n)
0 999
1 1
with 1 native-precision step
-1 50
0 894
1 56
with 1 extended-precision step
all correct
And at n=5 it helps again; etc.
Of course my story hasn't changed ;-) That is, use the single-step extended-precision code. It's "almost always" correctly rounded (I still haven't seen a case where it isn't, and the test code here assert-fails if it finds one), and is plenty fast. The reason this code gets very visibly slower as `n` increases is due to using the always-correctly-rounded `nroot()` for comparison. |