
Author steven.daprano
Recipients josh.r, mark.dickinson, rhettinger, steven.daprano, tim.peters
Date 2019-02-08.05:28:53
Message-id <20190208052845.GJ1834@ando.pearwood.info>
In-reply-to <1549411161.36.0.783593325223.issue35904@roundup.psfhosted.org>
> On my current 3.8 build, this code gives an approx 500x speed-up

On my system, I only get a 30x speed-up using your timeit code. Using 
ints instead of random floats, I only get a 9x speed-up.

This just goes to show how sensitive these timing results are to 
platform and hardware.
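
For anyone who wants to try it themselves, something along these lines 
will do. This is only a sketch, not the exact timeit code from the 
earlier message, so the data size and the use of statistics.mean() as 
the baseline are assumptions:

import math
import random
import statistics
from timeit import timeit

# Sketch only: 10_000 elements is an arbitrary choice, not the original benchmark.
floats = [random.random() for _ in range(10_000)]
ints = list(range(10_000))

for label, data in (("floats", floats), ("ints", ints)):
    baseline = timeit("statistics.mean(data)", globals=globals(), number=100)
    candidate = timeit("math.fsum(data)/len(data)", globals=globals(), number=100)
    print(label, round(baseline/candidate, 1), "x speed-up")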

What do you think of this implementation?

import math
from typing import Iterable

def floatmean(data: Iterable) -> float:
    try:
        n = len(data)
    except TypeError:
        # Iterator with no len(): count the items as fsum() consumes them.
        n = 0
        def count(x):
            nonlocal n
            n += 1
            return x
        total = math.fsum(map(count, data))
        return total/n
    else:
        return math.fsum(data)/n
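
A quick sanity check that both branches agree (the generator is just 
an example of an un-sized iterable; this assumes the floatmean() 
definition above is in scope):

data = [1.25, 2.5, 3.75, 5.0]

print(floatmean(data))               # sequence path -> 3.125
print(floatmean(x for x in data))    # iterator path -> 3.125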

Compared to the "no frills" fsum()/len() version:

- I see no visible slowdown on lists of floats;
- it handles iterators as well.
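
By the "no frills" version I mean something along these lines (the 
name here is just for illustration):

import math

def mean_no_frills(data):
    # Raises TypeError for iterators that have no len().
    return math.fsum(data)/len(data)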

On my computer, the difference between the sequence path and the 
iterator path is just a factor of 3.5. How does it compare on other 
machines?
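
Something like this should reproduce the measurement (the list size is 
an arbitrary choice, and it assumes the floatmean() definition above 
is available):

from timeit import timeit

values = [float(i) for i in range(10_000)]

# iter(values) has no len(), so it exercises the counting branch.
seq = timeit("floatmean(values)", globals=globals(), number=1_000)
itr = timeit("floatmean(iter(values))", globals=globals(), number=1_000)
print("iterator path is", round(itr/seq, 1), "x slower")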

As for the name, I think we have three reasonable candidates:

float_mean
fast_mean
fmean

(with or without underscores for the first two). Do people have a 
preference?