Message 198520 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	oscarbenjamin
Recipients	mark.dickinson, oscarbenjamin, rhettinger
Date	2013-09-28.14:46:27
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<CAHVvXxSN-ps8LA48+U3qRWtMpMCYhdQkZj9QHOmc4Cavq7-89w@mail.gmail.com>
In-reply-to	<1380307239.45.0.484925032555.issue19086@psf.upfronthosting.co.za>

Content
Thanks for responding Raymond. Raymond Hettinger wrote: > A "start" argument won't help you, because you will discard information on input. A sequence like [1E100, 0.1, -1E100, 0.1] wouldn't work when split into subtotal=fsum([1E100, 0.1]) and fsum([-1E100, 0.1], start=subtotal). I'm not sure if you've fully understood the proposal. The algorithm underlying fsum is (I think) called distillation. It compresses a list of floats into a shorter list of non-overlapping floats having the same exact sum. Where fsum deviates from this idea is that it doesn't return a list of floats, rather it returns only the largest float. My proposal is that there be a way to tell fsum to return the list of floats whose sum is exact and not rounded. Specifically the subtotal that is returned and fed back into fsum would be a list of floats and no information would be discarded. So fsum(numbers, []) would return a list of floats and that list can be passed back in again. > > My motivation for this is that I want to be able to write > > an efficient sum function that is accurate for a mix of ints > > and floats > > FWIW, fsum() already works well with integers as long as they don't exceed 53bits of precision. I was simplifying the use-case somewhat. I would also like this sum function to work with fractions and decimals neither of which coerces exactly to float (and potentially I want some non-stdlib types also). > For such exotic use cases, the decimal module would be a better alternative. Now that the decimal module has a C implementation, there is no reason not to use it for high precision applications. It is possible to coerce to Decimal and exactly sum a list of ints, floats and decimals (by trapping inexact and increasing the context precision). I have tried this under CPython 3.3 and it was 20-50x slower than fsum depending on how it manages the arithmetic context and whether it can be used safely with a generator that also manipulates the context - the safe version that doesn't build a list is 50x slower. It is also not possible to use Decimal exactly with Fractions. I believe that there are other use-cases for having fsum be usable incrementally. This would make it usable for accurate summation in incremental, parallel and distributed computation also. Unfortunately fsum itself already isn't used as much as it should be and I agree that probably all the use cases for this extension would be relatively obscure.

Thanks for responding Raymond.

Raymond Hettinger wrote:
> A "start" argument won't help you, because you will discard information
on input.  A sequence like [1E100, 0.1, -1E100, 0.1] wouldn't work when
split into subtotal=fsum([1E100, 0.1]) and fsum([-1E100, 0.1],
start=subtotal).

I'm not sure if you've fully understood the proposal.

The algorithm underlying fsum is (I think) called distillation. It
compresses a list of floats into a shorter list of non-overlapping floats
having the same exact sum. Where fsum deviates from this idea is that it
doesn't return a list of floats, rather it returns only the largest float.
My proposal is that there be a way to tell fsum to return the list of
floats whose sum is exact and not rounded. Specifically the subtotal that
is returned and fed back into fsum would be a list of floats and no
information would be discarded. So fsum(numbers, []) would return a list of
floats and that list can be passed back in again.

> > My motivation for this is that I want to be able to write
> > an efficient sum function that is accurate for a mix of ints
> > and floats
>
> FWIW, fsum() already works well with integers as long as they don't
exceed 53bits of precision.

I was simplifying the use-case somewhat. I would also like this sum
function to work with fractions and decimals neither of which coerces
exactly to float (and potentially I want some non-stdlib types also).

>  For such exotic use cases, the decimal module would be a better
alternative.  Now that the decimal module has a C implementation, there is
no reason not to use it for high precision applications.

It is possible to coerce to Decimal and exactly sum a list of ints, floats
and decimals (by trapping inexact and increasing the context precision). I
have tried this under CPython 3.3 and it was 20-50x slower than fsum
depending on how it manages the arithmetic context and whether it can be
used safely with a generator that also manipulates the context - the safe
version that doesn't build a list is 50x slower. It is also not possible to
use Decimal exactly with Fractions.

I believe that there are other use-cases for having fsum be usable
incrementally. This would make it usable for accurate summation in
incremental, parallel and distributed computation also. Unfortunately fsum
itself already isn't used as much as it should be and I agree that probably
all the use cases for this extension would be relatively obscure.

History
Date	User	Action	Args
2013-09-28 14:46:27	oscarbenjamin	set	recipients: + oscarbenjamin, rhettinger, mark.dickinson
2013-09-28 14:46:27	oscarbenjamin	link	issue19086 messages
2013-09-28 14:46:27	oscarbenjamin	create