Message 195004 - Python tracker


Author	oscarbenjamin
Recipients	agthorr, belopolsky, christian.heimes, gregory.p.smith, mark.dickinson, oscarbenjamin, pitrou, ronaldoussoren, sjt, steven.daprano, stutzbach, tshepang, vajrasky
Date	2013-08-12.19:59:50
Message-id	<CAHVvXxTKjU8P3aK=t-hQntym8-1XxJZHkO-T_aW75zDkwsKsLQ@mail.gmail.com>
In-reply-to	<5209356C.3010902@pearwood.info>

Content

On 12 August 2013 20:20, Steven D'Aprano <report@bugs.python.org> wrote:
> On 12/08/13 19:21, Mark Dickinson wrote:
>> About the implementation of sum:
> add_partial is no longer documented as a public function, so I'm open to switching algorithms in the future.

Along similar lines, it might be good to remove the doc-test that uses
decimal.ROUND_DOWN. I can't see any good reason for anyone to want
that behaviour when, e.g., computing the mean(), whereas I can see
reasons for wanting to reduce rounding error for decimals in
statistics.sum. It might be a good idea not to tie yourself to the
guarantee implied by that test.
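
To make the concern concrete, here is a minimal sketch (not the doc-test in question) of how a low-precision ROUND_DOWN context makes the result depend on the summation algorithm. The data values, the 3-digit precision, and the names `naive` and `rounded_once` are assumptions chosen for illustration.

```python
import decimal
from decimal import Decimal
from fractions import Fraction

# Hypothetical sample data; the exact sum is 0.6963.
data = [Decimal("0.1375"), Decimal("0.2108"), Decimal("0.3061"), Decimal("0.0419")]

with decimal.localcontext() as ctx:
    ctx.prec = 3
    ctx.rounding = decimal.ROUND_DOWN

    # The builtin sum rounds (here: truncates) after every addition.
    naive = sum(data)

    # Add exactly with Fractions, then round only once on conversion.
    exact = sum(Fraction(d) for d in data)
    rounded_once = Decimal(exact.numerator) / Decimal(exact.denominator)

print(naive)         # 0.694
print(rounded_once)  # 0.696
```

A test that pins the first value would fail for any implementation that loses less to intermediate rounding.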

I tried an alternative implementation of sum() that can also reduce
rounding error with decimals but it failed that test (by making the
result more accurate). Here's the sum() I wrote:

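import decimal
import numbers

def sum(data, start=0):
    # Deliberately shadows the builtin, as a statistics.sum candidate.
    if not isinstance(start, numbers.Number):
        raise TypeError('sum only accepts numbers')

    # Types whose addition can round; anything else is summed exactly.
    inexact_types = (float, complex, decimal.Decimal)
    def isexact(num):
        return not isinstance(num, inexact_types)

    if isexact(start):
        exact_total, inexact_total = start, 0
    else:
        exact_total, inexact_total = 0, start

    # Running estimate of the rounding error lost from inexact_total.
    carrybits = 0

    for x in data:
        if isexact(x):
            exact_total = exact_total + x
        else:
            # Kahan-style compensated step: fold the previous error back
            # in, then measure what this addition rounded away.
            new_inexact_total = inexact_total + (x + carrybits)
            carrybits = -(((new_inexact_total - inexact_total) - x) - carrybits)
            inexact_total = new_inexact_total

    return (exact_total + inexact_total) + carrybits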

It is more accurate for e.g. the following:
    nums = [decimal.Decimal(10 ** n) for n in range(50)]
    nums += [-n for n in reversed(nums)]
    assert sum(nums) == 0
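
As a rough check of that claim, the sketch below repeats the compensated step for Decimal values only and compares it with the builtin sum. The helper name `compensated_sum` and the explicit 28-digit ROUND_HALF_EVEN context (the decimal module's defaults) are assumptions for illustration.

```python
import decimal
from decimal import Decimal

def compensated_sum(values):
    """Decimal-only version of the compensated loop (hypothetical helper)."""
    total = Decimal(0)
    carry = Decimal(0)
    for x in values:
        new_total = total + (x + carry)
        # Recover what the addition above rounded away.
        carry = -(((new_total - total) - x) - carry)
        total = new_total
    return total + carry

with decimal.localcontext() as ctx:
    ctx.prec = 28
    ctx.rounding = decimal.ROUND_HALF_EVEN
    nums = [Decimal(10 ** n) for n in range(50)]
    nums += [-n for n in reversed(nums)]
    plain = sum(nums)                    # builtin, rounds at every step
    compensated = compensated_sum(nums)

print(plain == 0)        # False
print(compensated == 0)  # True
```

The 50-digit running total does not fit in 28 digits, so the builtin drops the low-order digits on the way up and cannot get them back on the way down; the carry term keeps them.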

However, there will also be other situations where it is less accurate, such as
    print(sum([-1e30, +1e60, 1, 3, -1e60, 1e30]))
where the exact sum is 4, so it may not be suitable as-is.
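
A sketch of why that input is hard: the error estimate is only reliable when the running total is at least as large in magnitude as the incoming value, so the -1e30 absorbed by +1e60 is never tracked. The float-only helper `kahan_sum` below is an assumed simplification of the loop above; `math.fsum` is shown only as an exactly rounded reference.

```python
import math

def kahan_sum(values):
    """Float-only compensated sum (hypothetical helper mirroring the loop above)."""
    total = 0.0
    carry = 0.0
    for x in values:
        new_total = total + (x + carry)
        carry = -(((new_total - total) - x) - carry)
        total = new_total
    return total + carry

values = [-1e30, +1e60, 1, 3, -1e60, 1e30]
print(kahan_sum(values))  # 1e+30, far from the exact sum of 4
print(math.fsum(values))  # 4.0
```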
