Message400303
The rounding correction in _ss() looks mathematically incorrect to me:
∑ (xᵢ - x̅ + εᵢ)² = ∑ (xᵢ - x̅)² - (∑ εᵢ)² ÷ n
If we drop this logic (which seems completely bogus), all the tests still pass and the code becomes cleaner:
    def _ss(data, c=None):
        if c is None:
            c = mean(data)
        T, total, count = _sum((y := x - c) * y for x in data)
        return (T, total)
-- Algebraic form of the current code ----------------------
from sympy import symbols, simplify
x1, x2, x3, e1, e2, e3 = symbols('x1 x2 x3 e1 e2 e3')
n = 3
# high accuracy mean
c = (x1 + x2 + x3) / n
# sum of squared deviations with subtraction errors
total = (x1 - c + e1)**2 + (x2 - c + e2)**2 + (x3 - c + e3)**2
# sum of subtraction errors = e1 + e2 + e3
total2 = (x1 - c + e1) + (x2 - c + e2) + (x3 - c + e3)
# corrected sum of squared deviations
total -= total2 ** 2 / n
# exact sum of squared deviations
desired = (x1 - c)**2 + (x2 - c)**2 + (x3 - c)**2
# expected versus actual
print(simplify(desired - total))
This gives:
(e1 + e2 + e3)**2/3
+ (-2*x1 + x2 + x3)**2/9
+ (x1 - 2*x2 + x3)**2/9
+ (x1 + x2 - 2*x3)**2/9
- (3*e1 + 2*x1 - x2 - x3)**2/9
- (3*e2 - x1 + 2*x2 - x3)**2/9
- (3*e3 - x1 - x2 + 2*x3)**2/9
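The sympy output can be condensed by hand. Writing dᵢ = xᵢ - x̅ (a shorthand introduced here, not in the code) and noting that 2*x1 - x2 - x3 = 3*d1, a sketch of the expansion is:

```latex
\begin{aligned}
\sum_i (d_i + \varepsilon_i)^2
  &= \sum_i d_i^2 + 2\sum_i d_i\,\varepsilon_i + \sum_i \varepsilon_i^2 \\
\text{desired} - \text{total}
  &= \frac{\bigl(\sum_i \varepsilon_i\bigr)^2}{n}
     - 2\sum_i d_i\,\varepsilon_i
     - \sum_i \varepsilon_i^2
\end{aligned}
```

So the error actually introduced by the εᵢ is 2∑dᵢεᵢ + ∑εᵢ², and subtracting (∑εᵢ)² ÷ n does not cancel it except in special cases.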
-- Substituting in concrete values ----------------------
x1, x2, x3, e1, e2, e3 = 11, 17, 5, 0.3, 0.1, -0.2
This gives:
75.74000000000001 uncorrected total
75.72666666666667 "corrected" total
72.0 desired result
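For anyone who wants to reproduce these three numbers, here is a minimal sketch. It assumes exact rational arithmetic via Fraction so the subtraction errors e1, e2, e3 are the only errors in play; the helper name sums_of_squares is hypothetical and is not part of the statistics module.

```python
from fractions import Fraction


def sums_of_squares(xs, es):
    """Return (uncorrected, corrected, desired) sums of squared deviations.

    xs are the data points and es are the simulated subtraction errors.
    This mirrors the algebra in the message, not the statistics module itself.
    """
    n = len(xs)
    c = sum(xs) / n                                 # exact mean
    devs = [x - c + e for x, e in zip(xs, es)]      # deviations with errors
    uncorrected = sum(d * d for d in devs)
    corrected = uncorrected - sum(devs) ** 2 / n    # the correction in question
    desired = sum((x - c) ** 2 for x in xs)         # error-free result
    return uncorrected, corrected, desired


xs = [Fraction(11), Fraction(17), Fraction(5)]
es = [Fraction(3, 10), Fraction(1, 10), Fraction(-2, 10)]
uncorrected, corrected, desired = sums_of_squares(xs, es)

print(float(uncorrected), "uncorrected total")
print(float(corrected), '"corrected" total')
print(float(desired), "desired result")
```

The correction moves the total by only (0.2)² ÷ 3, while the real discrepancy from the desired value is about 3.74.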
Date | User | Action | Args
2021-08-26 01:14:13 | rhettinger | set | recipients: + rhettinger, mark.dickinson, steven.daprano, xtreak, reed, iritkatriel
2021-08-26 01:14:13 | rhettinger | set | messageid: <1629940453.2.0.817078014396.issue39218@roundup.psfhosted.org>
2021-08-26 01:14:13 | rhettinger | link | issue39218 messages
2021-08-26 01:14:13 | rhettinger | create |