Issue 33573: statistics.median does not work with ordinal scale, add doc

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/77754

classification

Title:	statistics.median does not work with ordinal scale, add doc
Type:	enhancement	Stage:	resolved
Components:	Documentation, Library (Lib)	Versions:	Python 3.8, Python 3.7, Python 3.6

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	W deW, docs@python, steven.daprano, taleinat, terry.reedy
Priority:	normal	Keywords:	patch

Created on 2018-05-18 19:29 by W deW, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name	Uploaded	Description
testMedian.py	W deW, 2018-05-18 19:29	simple demonstration of failure

Pull Requests
URL	Status	Linked
PR 7587	merged	taleinat, 2018-06-10 13:15
PR 7906	merged	miss-islington, 2018-06-25 11:05
PR 7907	merged	miss-islington, 2018-06-25 11:06

Messages (10)
msg317048 - (view)	Author: W deW (W deW) *	Date: 2018-05-18 19:29
The 0.5-quantile or median is defined for ordinal, interval, and ratio scales. An enumeration derived from Enum and extended with rich comparison methods implements an ordinal scale. Therefore calculating the median over a list of such enum elements ought to be possible. The current implementation tries to interpolate the median value by averaging the two middle observations. This is allowed for interval and ratio scales, but since this interpolation involves an addition, not for ordinal scales. Although it is computationally possible to do this for numeric ordinal variables, logically it is nonsense, for the distance between ordinal values is - by definition - unknown. On non-numeric ordinal values it is even computationally impossible. The correct return value would be: the first value in an ordered set where at least half the number of observations is smaller than or equal to it. This is observation[len(observation)//2] for odd and even length ordered lists of values. Whether the same applies to interval and ratio scales is a matter of opinion. The currently implemented algorithm definitely is more popular these days.
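The failure described above can be reproduced with a short script. This is a minimal sketch, not the attached testMedian.py (whose contents are not shown here); the `Grade` enum and its members are hypothetical names chosen for illustration. It assumes a `statistics.median` that averages the two middle values of an even-length sample, as the message describes.

```python
import enum
import statistics


class Grade(enum.Enum):
    """Hypothetical ordinal scale: ordered, but with no addition defined."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3

    # Rich comparisons give the enum an ordering, so sorted() works on it.
    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented


odd = [Grade.HIGH, Grade.LOW, Grade.MEDIUM]
even = [Grade.HIGH, Grade.LOW, Grade.MEDIUM, Grade.LOW]

# Odd length: median() only has to pick the middle element of the sorted data.
odd_median = statistics.median(odd)

# Even length: median() tries to average the two middle elements,
# which needs Grade + Grade and therefore fails.
try:
    statistics.median(even)
    even_error = None
except TypeError as exc:
    even_error = exc

print(odd_median)
print(type(even_error).__name__, even_error)
```

The odd-length call succeeds because no arithmetic is involved; only the even-length call reaches the addition that an ordinal type does not support.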
msg317120 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-05-19 14:18
For ordinal scales, you should use either median_low or median_high. I don't think the standard median function ought to choose for you whether to take the low or high median. It is better to be explicit about which you want, by calling the relevant function, than for median to guess which one you need.
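As a rough illustration of this suggestion, the sketch below applies `statistics.median_low` and `statistics.median_high` to ordinal data. The `Size` enum is a hypothetical example, and only `__lt__` is defined because that is all `sorted()` needs. The index comparisons at the end are included to show which element each function selects: for an even-length sorted list, index `len(data) // 2` is the high median, while the low median sits at index `(len(data) - 1) // 2`.

```python
import enum
import statistics


class Size(enum.Enum):
    """Hypothetical ordinal scale used only for this illustration."""
    SMALL = 1
    MEDIUM = 2
    LARGE = 3

    # sorted() relies on __lt__, which is enough for the median_* functions.
    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


data = [Size.LARGE, Size.SMALL, Size.MEDIUM, Size.SMALL]
ordered = sorted(data)  # SMALL, SMALL, MEDIUM, LARGE

# Both functions return an actual data point, so no addition is needed.
low = statistics.median_low(data)
high = statistics.median_high(data)

print(low, high)

# Which sorted element each function picks for this even-length sample.
low_index_matches = low is ordered[(len(ordered) - 1) // 2]
high_index_matches = high is ordered[len(ordered) // 2]
```

For odd-length data the two functions agree, so the choice between them only matters when the sample size is even.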
msg317122 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-05-19 14:53
By the way, this isn't a crash (that's for things which cause the interpreter to segfault). I'm marking this as Not a bug, but I'm open to suggestions to improve either the documentation or the median functions.
msg317125 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-05-19 15:10
What do you think of adding a note in the documentation for median? "If your data is ordinal (supports order operations) but not numeric (doesn't support addition), you should use ``median_low`` or ``median_high`` instead."
msg317248 - (view)	Author: W deW (W deW) *	Date: 2018-05-21 19:04
Changing the documentation in this way seems to me an excellent and easy way to solve the issue.
msg317694 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2018-05-25 17:56
I agree.
msg319219 - (view)	Author: Tal Einat (taleinat) *	Date: 2018-06-10 13:17
PR ready for review.
msg320414 - (view)	Author: Tal Einat (taleinat) *	Date: 2018-06-25 11:04
New changeset fdd6e0bf18517c3dc5e24c48fbfe890229fad1b5 by Tal Einat in branch 'master': bpo-33573: docs to suggest median() alternatives for non-numeric data (GH-7587) https://github.com/python/cpython/commit/fdd6e0bf18517c3dc5e24c48fbfe890229fad1b5
msg320416 - (view)	Author: Tal Einat (taleinat) *	Date: 2018-06-25 11:18
New changeset 150cd3cb272021e9a2d865dd28486b00199fe77d by Tal Einat (Miss Islington (bot)) in branch '3.7': [3.7] bpo-33573: docs to suggest median() alternatives for non-numeric data (GH-7587) (GH-7906) https://github.com/python/cpython/commit/150cd3cb272021e9a2d865dd28486b00199fe77d
msg320417 - (view)	Author: Tal Einat (taleinat) *	Date: 2018-06-25 11:27
New changeset 8fd8cfa369fe4b6d6ac430cd28ead32717df7bee by Tal Einat (Miss Islington (bot)) in branch '3.6': [3.6] bpo-33573: docs to suggest median() alternatives for non-numeric data (GH-7587) (GH-7907) https://github.com/python/cpython/commit/8fd8cfa369fe4b6d6ac430cd28ead32717df7bee

History
Date	User	Action	Args
2022-04-11 14:59:00	admin	set	github: 77754
2018-06-25 11:27:42	taleinat	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2018-06-25 11:27:03	taleinat	set	messages: + msg320417
2018-06-25 11:18:56	taleinat	set	messages: + msg320416
2018-06-25 11:06:13	miss-islington	set	pull_requests: + pull_request7513
2018-06-25 11:05:19	miss-islington	set	pull_requests: + pull_request7512
2018-06-25 11:04:04	taleinat	set	messages: + msg320414
2018-06-10 13:17:23	taleinat	set	nosy: + taleinat messages: + msg319219
2018-06-10 13:15:22	taleinat	set	keywords: + patch stage: needs patch -> patch review pull_requests: + pull_request7209
2018-05-25 17:56:20	terry.reedy	set	resolution: not a bug -> (no value) assignee: docs@python stage: needs patch title: statistics.median does not work with ordinal scale -> statistics.median does not work with ordinal scale, add doc nosy: + terry.reedy, docs@python versions: + Python 3.6, Python 3.8 messages: + msg317694 components: + Documentation type: behavior -> enhancement
2018-05-21 19:04:38	W deW	set	messages: + msg317248
2018-05-19 15:10:19	steven.daprano	set	messages: + msg317125
2018-05-19 14:53:35	steven.daprano	set	type: crash -> behavior resolution: not a bug messages: + msg317122 versions: + Python 3.7, - Python 3.4
2018-05-19 14:18:09	steven.daprano	set	nosy: + steven.daprano messages: + msg317120
2018-05-18 19:29:45	W deW	create