Message339544
Examining quartiles, deciles, and percentiles is a common and useful data analysis technique. It is especially helpful for comparing distinct datasets (for example, heights of boys versus heights of girls) or for comparing data against a reference distribution (for example, empirical data versus a normal distribution).
--- sample session ---
>>> from statistics import NormalDist, quantiles
>>> from pylab import plot
# SAT exam scores
>>> sat = NormalDist(1060, 195)
>>> list(map(round, quantiles(sat, n=4))) # quartiles
[928, 1060, 1192]
>>> list(map(round, quantiles(sat, n=10))) # deciles
[810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]
# Summarize a dataset
>>> data = [110, 96, 155, 87, 98, 82, 156, 88, 172, 102, 91, 184, 105, 114, 104]
>>> quantiles(data, n=2) # median
[104.0]
>>> quantiles(data, n=4) # quartiles
[91.0, 104.0, 155.0]
>>> quantiles(data, n=10) # deciles
[85.0, 88.6, 95.0, 99.6, 104.0, 108.0, 122.2, 155.8, 176.8]
# Assess whether data is normally distributed by comparing quantiles
>>> reference_dist = NormalDist.from_samples(data)
>>> quantiles(reference_dist, n=4)
[93.81594518619364, 116.26666666666667, 138.71738814713967]
# Make a QQ plot to visualize how well the data matches a normal distribution
# plot(quantiles(data, n=7), quantiles(reference_dist, n=7))
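
The session above passes a NormalDist directly to quantiles(), which is the API as proposed in this message. As a minimal sketch only, assuming Python 3.8 or later, where the module-level statistics.quantiles() takes a sequence of data and a normal distribution exposes its own NormalDist.quantiles() method, the same steps might look like the following. The variable names sat_quartiles, data_quartiles, and qq_points are illustrative and are not part of the original message.

```python
from statistics import NormalDist, quantiles

# SAT exam scores modeled as a normal distribution (values from the session above)
sat = NormalDist(1060, 195)
sat_quartiles = [round(q) for q in sat.quantiles(n=4)]

# Empirical data goes through the module-level function
data = [110, 96, 155, 87, 98, 82, 156, 88, 172, 102, 91, 184, 105, 114, 104]
data_quartiles = quantiles(data, n=4)

# Pair the empirical quantiles with those of a normal distribution fitted
# to the same data; these pairs are the points of a QQ plot.
reference_dist = NormalDist.from_samples(data)
qq_points = list(zip(quantiles(data, n=7), reference_dist.quantiles(n=7)))

print(sat_quartiles)
print(data_quartiles)
for empirical, theoretical in qq_points:
    print(f"{empirical:8.2f} {theoretical:8.2f}")
```

The quartiles printed here reproduce the values shown in the session. If the printed pairs lie close to a straight line, that suggests the data is roughly normal; passing the two columns to a plotting function such as pylab's plot would give the QQ plot itself.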
Date | User | Action | Args
2019-04-06 21:22:16 | rhettinger | set | recipients: + rhettinger
2019-04-06 21:22:16 | rhettinger | set | messageid: <1554585736.61.0.903931747573.issue36546@roundup.psfhosted.org>
2019-04-06 21:22:16 | rhettinger | link | issue36546 messages
2019-04-06 21:22:16 | rhettinger | create |