
Author rhettinger
Recipients josh.r, mark.dickinson, rhettinger, steven.daprano, tim.peters
Date 2019-02-08.00:10:58
Message-id <1549584658.38.0.853557916562.issue35904@roundup.psfhosted.org>
Content
>>    def fmean(seq: Sequence[float]) -> float:
>>        return math.fsum(seq) / len(seq)
>
> Is it intentional that this doesn't support iterators?

Since we need both the sum and the length, this seemed like a good starting point.  Also, the existing mean() function already covers the more general cases.
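
To make the limitation concrete, here is a minimal sketch that defines fmean() exactly as quoted above and calls it with a list and with an iterator. The sample values are made up for illustration.

```python
import math
from typing import Sequence


def fmean(seq: Sequence[float]) -> float:
    # As quoted above: needs both a sum and a length.
    return math.fsum(seq) / len(seq)


# A list has a length, so this works.
print(fmean([1.0, 2.0, 4.5]))

# An iterator has no len(), so the call fails with TypeError.
try:
    fmean(iter([1.0, 2.0, 4.5]))
except TypeError as exc:
    print("iterator rejected:", exc)
```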

I suspect that it is common to keep the data in memory so that more than one descriptive statistic can be generated:

    data = load_measurements()
    data.sort()
    n = len(data)
    mu = fastmean(data)
    sigma = stdev(data, xbar=mu)
    low, q1, q2, q3, high = data[0], data[n//4], data[n//2], data[3*n//4], data[-1]
    popular = mode(data, first_tie=True)
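
A runnable approximation of the workflow above, as a sketch only: the sample list stands in for the hypothetical load_measurements(), the fast mean is spelled out as fsum(data) / n, and mode() is called without the proposed first_tie argument because the sample has a single mode.

```python
from math import fsum
from statistics import mode, stdev

# Hypothetical sample standing in for load_measurements().
data = [3.1, 2.7, 3.1, 4.0, 2.2, 3.6, 3.1, 2.9]
data.sort()
n = len(data)
mu = fsum(data) / n              # the fast mean under discussion
sigma = stdev(data, xbar=mu)     # reuse the mean instead of recomputing it
low, q1, q2, q3, high = data[0], data[n//4], data[n//2], data[3*n//4], data[-1]
popular = mode(data)             # unique mode here, so tie handling does not matter
print(mu, sigma, (low, q1, q2, q3, high), popular)
```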


It's possible (though possibly not desirable) to provide a fallback path:

    def fastmean(data: Iterable) -> float:
        try:
            n = len(data)    # check first; fsum() would exhaust an iterator
        except TypeError:
            # Pick one of the following alternatives.
            # Slow alternative
            return float(mean(data))
            # Memory intensive alternative
            data = list(data)
            return fsum(data) / len(data)
            # Less accurate alternative
            total = n = 0
            for n, x in enumerate(data, start=1):
                total += x
            return total / n
        return fsum(data) / n
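
A runnable sketch of one such fallback, assuming the memory-intensive alternative is the one chosen. It probes len() before summing, so an iterator is not consumed by a failed first attempt. The comparison at the end shows why a plain running total is the "less accurate" option; the sample values are illustrative.

```python
from math import fsum
from typing import Iterable


def fastmean(data: Iterable[float]) -> float:
    try:
        n = len(data)        # probe the length before consuming anything
    except TypeError:
        # Memory-intensive fallback: materialize the iterator once.
        data = list(data)
        n = len(data)
    return fsum(data) / n


values = [0.1] * 10
print(fastmean(values))             # sequence path
print(fastmean(iter(values)))       # iterator path, same result
print(sum(values) / len(values))    # naive running total loses a bit
```

Empty input raises ZeroDivisionError in this sketch; a real implementation would likely want a clearer error.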
History
Date                 User        Action  Args
2019-02-08 00:11:00  rhettinger  set     recipients: + rhettinger, tim.peters, mark.dickinson, steven.daprano, josh.r
2019-02-08 00:10:58  rhettinger  set     messageid: <1549584658.38.0.853557916562.issue35904@roundup.psfhosted.org>
2019-02-08 00:10:58  rhettinger  link    issue35904 messages
2019-02-08 00:10:58  rhettinger  create