Author oscarbenjamin
Recipients agthorr, belopolsky, christian.heimes, gregory.p.smith, mark.dickinson, oscarbenjamin, pitrou, ronaldoussoren, sjt, steven.daprano, stutzbach, tshepang, vajrasky
Date 2013-08-12.19:59:50
Message-id <CAHVvXxTKjU8P3aK=t-hQntym8-1XxJZHkO-T_aW75zDkwsKsLQ@mail.gmail.com>
In-reply-to <5209356C.3010902@pearwood.info>
Content
On 12 August 2013 20:20, Steven D'Aprano <report@bugs.python.org> wrote:
> On 12/08/13 19:21, Mark Dickinson wrote:
>> About the implementation of sum:
> add_partial is no longer documented as a public function, so I'm open to switching algorithms in the future.

Along similar lines, it might be good to remove the doctest that uses
decimal.ROUND_DOWN. I can't see any good reason for anyone to want
that behaviour when, e.g., computing the mean(), whereas I can see
reasons for wanting to reduce rounding error for decimals in
statistics.sum. It might be a good idea not to tie yourself to the
guarantee implied by that test.
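
To illustrate why a test pinned to a rounding mode constrains the algorithm, here is a rough sketch. It is not the actual doctest: the helper name naive_total, the 3-digit precision and the sample values are all made up for the example. It only shows that a plain left-to-right Decimal sum gives a different answer under ROUND_DOWN than under ROUND_HALF_EVEN, so any summation that recovers the lost digits would no longer reproduce the ROUND_DOWN result.

```python
import decimal
from decimal import Decimal


def naive_total(values, rounding, prec=3):
    """Plain left-to-right sum under a temporary decimal context.

    Hypothetical helper for illustration only; not part of the
    statistics module.
    """
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        ctx.rounding = rounding
        total = Decimal(0)
        for v in values:
            total += v  # every addition is rounded to ctx.prec digits
        return total


values = [Decimal("1.00"), Decimal("0.006"), Decimal("0.006")]

down = naive_total(values, decimal.ROUND_DOWN)            # Decimal('1.00')
half_even = naive_total(values, decimal.ROUND_HALF_EVEN)  # Decimal('1.02')
# The exact total is 1.012; both results differ from it, in opposite directions.
```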

I tried an alternative implementation of sum() that can also reduce
rounding error with decimals but it failed that test (by making the
result more accurate). Here's the sum() I wrote:

import decimal
import numbers

def sum(data, start=0):
    """Add up data, compensating for rounding error in inexact types."""
    if not isinstance(start, numbers.Number):
        raise TypeError('sum only accepts numbers')

    inexact_types = (float, complex, decimal.Decimal)
    def isexact(num):
        return not isinstance(num, inexact_types)

    # Keep exact values (e.g. int, Fraction) apart from the inexact total.
    if isexact(start):
        exact_total, inexact_total = start, 0
    else:
        exact_total, inexact_total = 0, start

    # Rounding error lost so far by the inexact additions.
    carrybits = 0

    for x in data:
        if isexact(x):
            exact_total = exact_total + x
        else:
            new_inexact_total = inexact_total + (x + carrybits)
            # Recover what the addition above rounded away.
            carrybits = -(((new_inexact_total - inexact_total) - x) - carrybits)
            inexact_total = new_inexact_total

    return (exact_total + inexact_total) + carrybits
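
As a small illustration of how the carrybits update works, here is one step of the loop pulled out into a standalone function. The name compensated_step is made up for this sketch, and it assumes IEEE 754 double-precision floats. Adding 1.0 to 1e16 is lost to rounding, but the update recovers it and feeds it into the next addition.

```python
def compensated_step(total, x, carrybits):
    """One iteration of the compensated update (hypothetical helper)."""
    new_total = total + (x + carrybits)
    # (new_total - total) is what was actually added; subtracting x and the
    # old carry leaves the rounding error, which is negated and carried on.
    carrybits = -(((new_total - total) - x) - carrybits)
    return new_total, carrybits


# First step: 1e16 + 1.0 rounds back to 1e16, so the 1.0 ends up in the carry.
total, carry = compensated_step(1e16, 1.0, 0.0)
first = (total, carry)

# Second step: the carried 1.0 joins the new 1.0, and 1e16 + 2.0 is exact.
total, carry = compensated_step(total, 1.0, carry)
second = (total, carry)

# Uncompensated addition loses both increments.
plain = (1e16 + 1.0) + 1.0
```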

It is more accurate for e.g. the following:
    nums = [decimal.Decimal(10 ** n) for n in range(50)]
    nums += [-n for n in reversed(nums)]
    assert sum(nums) == 0
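
For comparison, a sketch of how the built-in sum behaves on the same data. It assumes a precision of 28 digits (the decimal default) for the rounded case, and uses a 60-digit context as a reference in which no addition needs to round. The helper name builtin_sum_at is made up for the example.

```python
import builtins
import decimal

nums = [decimal.Decimal(10 ** n) for n in range(50)]
nums += [-n for n in reversed(nums)]


def builtin_sum_at(prec):
    """Built-in sum of nums under a temporary precision (hypothetical helper)."""
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        return builtins.sum(nums)


rounded = builtin_sum_at(28)    # partial sums need up to 50 digits, so this rounds
reference = builtin_sum_at(60)  # wide enough for every partial sum to be exact
```

Here reference is zero, while rounded is not, because the low-order digits dropped on the way up are never recovered on the way down.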

However, there will also be other situations where it is less accurate, such as
    print(sum([-1e30, +1e60, 1, 3, -1e60, 1e30]))
so it may not be suitable as-is.
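
A rough check of this case, assuming IEEE 754 doubles. The function kahan_sum below is a float-only restatement of the same update rule (the name is made up for the sketch), compared against math.fsum and an exact Fraction total.

```python
import math
from fractions import Fraction

values = [-1e30, +1e60, 1, 3, -1e60, 1e30]


def kahan_sum(data):
    """Float-only compensated sum using the same carry update (sketch)."""
    total = 0.0
    carrybits = 0.0
    for x in data:
        new_total = total + (x + carrybits)
        carrybits = -(((new_total - total) - x) - carrybits)
        total = new_total
    return total + carrybits


exact = sum(Fraction(v) for v in values)  # Fraction(4, 1)
correctly_rounded = math.fsum(values)     # 4.0
compensated = kahan_sum(values)           # 1e+30
```

The carry only captures error when the incoming term is smaller than the running total. Adding +1e60 to -1e30 discards the -1e30 without recording it, so the matching +1e30 at the end is left uncancelled.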