Author steven.daprano
Recipients rhettinger, steven.daprano
Date 2019-04-07.23:37:16
I think quantiles (sometimes called fractiles) are a good feature to add, and I have some use-cases for quartiles in particular. I especially like that it delegates to the inv_cdf() method when available, and I'm happy with the API you suggested.
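
To illustrate the delegation idea, here is a minimal sketch. It assumes the object is a statistics.NormalDist (available in Python 3.8 and later) with an inv_cdf() method; the helper name cut_points and its signature are hypothetical and are not the API proposed on this issue.

```python
from statistics import NormalDist


def cut_points(dist, n=4):
    """Hypothetical helper: return the n - 1 cut points that divide
    the distribution `dist` into n intervals of equal probability,
    delegating all of the work to dist.inv_cdf()."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not hasattr(dist, "inv_cdf"):
        raise TypeError("dist must provide an inv_cdf() method")
    # The i-th cut point is the value x with P(X <= x) == i / n.
    return [dist.inv_cdf(i / n) for i in range(1, n)]


# Quartiles of a normal distribution with mean 100 and std dev 15.
quartiles = cut_points(NormalDist(mu=100, sigma=15), n=4)
print(quartiles)
```

Because a distribution has an exact inverse CDF, this path avoids the choice of interpolation method that complicates quantiles of sample data.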

Forgive me if you're already aware of this, but the calculation of quantiles is unfortunately complicated by the fact that there are so many different ways to calculate them. (I see you have mentioned a potential future API for interp_method.)

See, for example:

for a discussion. My own incomplete survey of statistics software has found about 20 distinct methods for calculating quantiles in use. I'm very happy to see this function added, but I'm not happy to commit to a specific calculation method without discussion.

If you agree that we can change the implementation later (and hence the specific cut points returned) then I see no reason why we can't get this in before feature freeze, and then do a review to find the "best" default implementation later. I already have three candidates:

1. Langford suggests his "CDF method 4", which is equivalent to Hyndman and Fan's Definition 2; it is also the default method used by SAS.

2. Hyndman and Fan themselves recommend their Definition 8:

3. R's default is H&F's Definition 7. (I suggest this only to ease comparisons with results from R, not because it has any statistical advantage.)
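
To make the difference between the three candidates concrete, here is a minimal sketch of Hyndman and Fan's Definitions 2, 7 and 8, written from their general formulation Q(p) = (1 - gamma) * x[j] + gamma * x[j+1] with j = floor(n*p + m). The function name hf_quantile and its parameters are hypothetical and are not part of the proposed API; treat it as an illustration rather than a reference implementation.

```python
import math


def hf_quantile(data, p, definition=7):
    """Hypothetical helper: sample quantile for 0 < p < 1 using
    Hyndman and Fan's Definition 2, 7 or 8."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")

    # The constant m shifts the position n*p; it is what distinguishes
    # the definitions from one another.
    if definition == 2:
        m = 0.0
    elif definition == 7:
        m = 1.0 - p
    elif definition == 8:
        m = (p + 1.0) / 3.0
    else:
        raise ValueError("only definitions 2, 7 and 8 are sketched here")

    h = n * p + m
    j = math.floor(h)   # 1-based index of the lower order statistic
    g = h - j           # fractional part of the position

    if definition == 2:
        # Discontinuous: average the two neighbours when the position
        # falls exactly on a data point, otherwise take the upper one.
        gamma = 0.5 if g == 0 else 1.0
    else:
        # Definitions 7 and 8 interpolate linearly.
        gamma = g

    # Convert to 0-based indexes, clamping to the ends of the sample.
    lower = xs[min(max(j, 1), n) - 1]
    upper = xs[min(max(j + 1, 1), n) - 1]
    return (1 - gamma) * lower + gamma * upper


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for d in (2, 7, 8):
    print(d, hf_quantile(sample, 0.25, definition=d))
```

For the first quartile of the integers 1 to 10, the three definitions give 3, 3.25 and roughly 2.9167 respectively, while all three agree on 5.5 for the median. This is the kind of discrepancy in cut points that a later change of default would produce.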

Do we agree that, at least for now, there is no backwards-compatibility guarantee on the implementation or on the specific cut points returned?
Date User Action Args
2019-04-07 23:37:17 steven.daprano set recipients: + steven.daprano, rhettinger
2019-04-07 23:37:17 steven.daprano set messageid: <>
2019-04-07 23:37:17 steven.daprano link issue36546 messages
2019-04-07 23:37:16 steven.daprano create