Author remi.lapeyre
Recipients mark.dickinson, remi.lapeyre, rhettinger, steven.daprano
Date 2019-05-24.15:35:28
Hi Steven, thanks for taking the time to review my patch.

Regarding the relevance of adding select(), I was looking for work to do in the bug tracker and found some references to it ( for example).

I knew that there are multiple definitions of percentiles, but I got sloppy in my previous response because I wanted to answer quickly. I will try not to do this again.

Regarding the use of sorting, I thought that sorting would be quicker than implementing the other linear-time algorithm in Python, given the general performance of Timsort; some tests in agreed with that.
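
A comparison of this kind could be set up roughly as follows. This is only a sketch: select_sorted and select_quick are hypothetical names, quickselect stands in for "the other linear-time algorithm" (the message does not say which one was tested), and no timings are claimed here since they depend on the machine and the data.

```python
import random
import timeit


def select_sorted(data, i):
    # i-th smallest item (1-based) via a full sort (Timsort, implemented in C).
    return sorted(data)[i - 1]


def select_quick(data, i):
    # i-th smallest item (1-based) via quickselect in pure Python.
    # Expected linear time, but every step runs in the interpreter.
    items = list(data)
    k = i - 1
    while True:
        pivot = random.choice(items)
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        if k < len(lows):
            items = lows
        elif k < len(lows) + len(pivots):
            return pivot
        else:
            k -= len(lows) + len(pivots)
            items = [x for x in items if x > pivot]


if __name__ == "__main__":
    data = [random.random() for _ in range(100_000)]
    i = len(data) // 2
    for func in (select_sorted, select_quick):
        elapsed = timeit.timeit(lambda: func(data, i), number=5)
        print(f"{func.__name__}: {elapsed:.3f}s")
```

Both functions return the same value for any valid i, so only the running time differs.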

For the iterator, I was thinking about how to implement percentiles when writing select() and thought of writing:

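from itertools import islice
from statistics import StatisticsError

def _select(data, i, key=None):
    # Return an iterator over the sorted data, starting at the i-th smallest item (1-based).
    if not len(data):
        raise StatisticsError("select requires at least one data point")
    if not (1 <= i <= len(data)):
        raise StatisticsError(f"The index looked for must be between 1 and {len(data)}")
    data = sorted(data, key=key)
    return islice(data, i-1, None)

def select(data, i, key=None):
    # Return the i-th smallest item of data.
    return next(_select(data, i, key=key))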

and then doing some variant of:

    it = _select(data, i, key=key)
    left, right = next(it), next(it)
    # compute percentile with left and right

to implement the quantiles without sorting the list multiple times. Now that quantiles() has been implemented by Raymond Hettinger, this is moot anyway.
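
For illustration, here is a minimal sketch of that idea. It assumes linear interpolation between the two neighbouring order statistics, which is only one of the several percentile definitions and not necessarily the one quantiles() uses; percentile_sketch is a hypothetical name, and _select is repeated so the example runs on its own.

```python
from itertools import islice
from statistics import StatisticsError


def _select(data, i, key=None):
    # Iterator over the sorted data, starting at the i-th smallest item (1-based).
    if not len(data):
        raise StatisticsError("select requires at least one data point")
    if not (1 <= i <= len(data)):
        raise StatisticsError(f"The index looked for must be between 1 and {len(data)}")
    return islice(sorted(data, key=key), i - 1, None)


def percentile_sketch(data, p):
    # Hypothetical helper: p-th percentile (0 <= p <= 100), interpolating
    # linearly between the two closest order statistics.
    if not 0 <= p <= 100:
        raise StatisticsError("p must be between 0 and 100")
    pos = (len(data) - 1) * p / 100   # fractional 0-based position in sorted order
    lower = int(pos)                  # 0-based index of the left neighbour
    frac = pos - lower
    it = _select(data, lower + 1)     # the data is sorted only once here
    left = next(it)
    right = next(it, left)            # no right neighbour at the maximum
    return left + (right - left) * frac
```

For example, percentile_sketch([3, 1, 2, 4], 50) takes left = 2 and right = 3 from a single sort and returns 2.5.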

Since it's probably not useful, feel free to disregard my PR.

Date                 User          Action  Args
2019-05-24 15:35:28  remi.lapeyre  set     recipients: + remi.lapeyre, rhettinger, mark.dickinson, steven.daprano
2019-05-24 15:35:28  remi.lapeyre  set     messageid: <>
2019-05-24 15:35:28  remi.lapeyre  link    issue35775 messages
2019-05-24 15:35:28  remi.lapeyre  create