Author rhettinger
Recipients rhettinger
Date 2019-04-06.21:22:16
Message-id <1554585736.61.0.903931747573.issue36546@roundup.psfhosted.org>
Content
Examining quartiles, deciles, and percentiles is a common and useful data analysis technique. It is especially helpful for comparing distinct datasets (heights of boys versus heights of girls) or for comparing data against a reference distribution (for example, empirical data versus a normal distribution).

--- sample session ---

>>> from statistics import NormalDist, quantiles
>>> from pylab import plot

# SAT exam scores
>>> sat = NormalDist(1060, 195)
>>> list(map(round, quantiles(sat, n=4)))       # quartiles
[928, 1060, 1192]
>>> list(map(round, quantiles(sat, n=10)))      # deciles
[810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]

# Summarize a dataset
>>> data = [110, 96, 155, 87, 98, 82, 156, 88, 172, 102, 91, 184, 105, 114, 104]
>>> quantiles(data, n=2)                        # median
[104.0]
>>> quantiles(data, n=4)                        # quartiles
[91.0, 104.0, 155.0]
>>> quantiles(data, n=10)                       # deciles
[85.0, 88.6, 95.0, 99.6, 104.0, 108.0, 122.2, 155.8, 176.8]


# Assess whether data is normally distributed by comparing quantiles
>>> reference_dist = NormalDist.from_samples(data)
>>> quantiles(reference_dist, n=4)
[93.81594518619364, 116.26666666666667, 138.71738814713967]

# Make a QQ plot to visualize how well the data matches a normal distribution
# plot(quantiles(data, n=7), quantiles(reference_dist, n=7))
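
For readers who want to try the QQ comparison without a plotting library, here is a minimal sketch. It assumes the spelling found in the statistics module as released in Python 3.8, where the module-level quantiles() accepts raw data and a NormalDist reports its own cut points through a quantiles() method, rather than the single-function form shown in the session above. The names empirical, theoretical, and qq_pairs are illustrative only.

```python
from statistics import NormalDist, quantiles

data = [110, 96, 155, 87, 98, 82, 156, 88, 172, 102, 91, 184, 105, 114, 104]

# Fit a normal distribution to the sample, as in the session above.
reference_dist = NormalDist.from_samples(data)

# Assumed Python 3.8+ spelling: the function takes raw data, while the
# distribution exposes its cut points through a method.
n = 7
empirical = quantiles(data, n=n)
theoretical = reference_dist.quantiles(n=n)

# Pair the cut points. If the pairs fall close to a straight line,
# the data is plausibly close to normally distributed.
qq_pairs = list(zip(empirical, theoretical))
for observed, expected in qq_pairs:
    print(f"{observed:8.2f} {expected:8.2f}")
```

Passing the two lists to plot() in place of the print loop would give the visual QQ plot described above.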