One note: in the original post, not only are the values being tested coming from NumPy's arange, but round(x[i],1) is testing *NumPy's* rounding functionality, not Python's. x[i] has type np.float64, and while np.float64 does inherit from Python's float, it also overrides float.__round__ with its own implementation (that essentially amounts to scale-by-power-of-ten, round-to-nearest-int, unscale, just like Python used to do in the bad old days). So errors from arange plus NumPy's non-correctly-rounded round means that all bets are off on what happens for values that _look_ as though they're ties when shown in decimal, but aren't actually ties thanks to the what-you-see-is-not-what-you-get nature of binary floating-point.
On arange in particular, I've never looked closely into the implementation; it's never noticeably not been "close enough" (i.e., accurate to within a few ulps either way), and I've never needed it or expected it to be perfectly correctly rounded. Now that it's been brought up, I'll take a look. (But that shouldn't keep this issue open, since that's a pure NumPy issue.)
Honestly, given the usual floating-point imprecision issues, I'm surprised that the balance is coming out as evenly as it is in Tim's and Steven's experiments. I can see why it might work for a single binade, but I'm at a loss to explain why you'd expect a perfect balance across several binades.
For example: if you're looking at values of the form 0.xx5 in the binade [0.5, 1.0], and rounding those to two decimal places, you'd expect perfect parity, because if you pair the values from [0.5, 0.75] with the reverse of the values from [0.75, 1.0], in each pair exactly one of the two values will round up, and one down (the paired values always add up to *exactly* 1.5, with no rounding, so the errors from the decimal-to-binary rounding will always go in opposite directions). For example 0.505 rounds up, and dually 0.995 rounds down. (But whether the pair gives (up, down) or (down, up) will depend on exactly which way the rounding went when determining the nearest binary64 float, so will be essentially unpredictable.)
>>> test_values = [x/1000 for x in range(505, 1000, 10)]
>>> len(test_values) # total count of values
50
>>> sum(round(val, 2) > val for val in test_values) # number rounding up
25
But then you need to make a similar argument for the next binade down: [0.25, 0.5] (which doesn't work at all in this case, because that binade contains an odd number of values).
Nevertheless, this *does* seem to work, and I haven't yet found a good explanation why. Any offers?
>>> k = 8
>>> test_values = [x/10**(k+1) for x in range(5, 10**(k+1), 10)]
>>> sum(round(val, k) > val for val in test_values)
50000000
BTW, yes, I think this issue can be closed. |