Author mark.dickinson
Recipients mark.dickinson, rhettinger, steven.daprano, tim.peters
Date 2021-11-24.08:12:57
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1637741577.47.0.660410014905.issue45876@roundup.psfhosted.org>
In-reply-to
Content
> am still studying the new one

Sorry - I should have added at least _some_ comments to it.

Here's the underlying principle, which I think of as "The magic of round-to-odd":

Suppose x is a positive real number and e is an integer satisfying 2^e <= x. Then assuming IEEE 754 binary64 floating-point format, the quantity:

    2^(e-54) * to-odd(x / 2^(e-54))

rounds to the same value as x does under _any_ of the standard IEEE 754 rounding modes, including the round-ties-to-even rounding mode that we care about here.

Here, to-odd is the function R -> Z that maps each integer to itself, but maps each non-integral real number x to the *odd* integer nearest x. (So for example all of 2.1, 2.3, 2.9, 3.0, 3.1, 3.9 map to 3.)

This works because (x / 2^(e-54)) gives us an integer with at least 55 bits: the 53 bits we'll need in the eventual significand, a rounding bit, and then the to-odd supplies a "sticky" bit that records inexactness. Note that the principle works in the subnormal range too - no additional tricks are needed for that. In that case we just end up wastefully computing a few more bits than we actually _need_ to determine the correctly-rounded value.

The code applies this principle to the case x = sqrt(n/m) and
e = (n.bit_length() - m.bit_length() - 1)//2. The condition 2^e <= x holds because:

    2^(n.bit_length() - 1) <= n
    m < 2^m.bit_length()
 
so

    2^(n.bit_length() - 1 - m.bit_length()) < n/m

and taking square roots gives

    2^((n.bit_length() - 1 - m.bit_length())/2) < √(n/m)

so taking e = (n.bit_length() - 1 - m.bit_length())//2, we have

    2^e <= 2^((n.bit_length() - 1 - m.bit_length())/2) < √(n/m)

Now putting q = e - 54, we need to compute

    2^q * round-to-odd(√(n/m) / 2^q)

rounded to a float. The two branches both do this computation, by moving 2^q into either the numerator or denominator of the fraction as appropriate depending on the sign of q.
History
Date User Action Args
2021-11-24 08:12:57mark.dickinsonsetrecipients: + mark.dickinson, tim.peters, rhettinger, steven.daprano
2021-11-24 08:12:57mark.dickinsonsetmessageid: <1637741577.47.0.660410014905.issue45876@roundup.psfhosted.org>
2021-11-24 08:12:57mark.dickinsonlinkissue45876 messages
2021-11-24 08:12:57mark.dickinsoncreate