Message 283092 - Python tracker



Author	sria91
Recipients	sria91, steven.daprano, wolma
Date	2016-12-13.10:17:21
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<CAN3Ck4AF0b4JQ4iJ1hcg7vhJEbJf1yncZFnofpw1hwW760XcNw@mail.gmail.com>
In-reply-to	<CAN3Ck4CZWhxkdxSWwmDQT_922Weqgw_6U6SCRvDbugY1hNNt8w@mail.gmail.com>

Content

@steven:

data = [1, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9]
is clearly unimodal, with mode 8.

The data would have been bimodal if 4 had been repeated exactly as many
times (7) as 8, like this:
data = [1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 5, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9]

in which case the new patch in PR 50 would return a tuple
(4, 8)
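
To make the tie rule concrete, here is a minimal sketch of "return every
value that shares the highest count". It is not the code from PR 50; the
helper name all_modes is hypothetical and is only meant to illustrate the
behaviour described above on the two data sets from this message.

```python
from collections import Counter


def all_modes(data):
    """Return a tuple of every value tied for the highest count.

    Hypothetical helper for illustration only; not the PR 50 patch.
    """
    counts = Counter(data)
    if not counts:
        return ()
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come out in data order.
    return tuple(value for value, n in counts.items() if n == top)


unimodal = [1, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9]
bimodal = [1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 5, 6, 7, 7,
           8, 8, 8, 8, 8, 8, 8, 9, 9]

print(all_modes(unimodal))  # (8,)
print(all_modes(bimodal))   # (4, 8)
```

Under this rule only exact ties count as extra modes, so 4 (three
occurrences) is not reported for the first data set.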

Thanks & Regards
Srikanth Anantharam
+91 7204 350429
https://sria91.github.io/


On 13-Dec-2016 3:24 PM, "Steven D'Aprano" <report@bugs.python.org> wrote:

Steven D'Aprano added the comment:

On Tue, Dec 13, 2016 at 09:35:22AM +0000, Srikanth Anantharam wrote:
>
> Srikanth Anantharam added the comment:
>
> A better choice would be to return a tuple of values (sliced from the
> table). And let the user decide which one to use.

The current mode() function is designed for a very basic use-case, where
you have an obvious single mode from discrete data.

The problem with dealing with multiple modes is that it's not easy to
tell the difference between a genuinely multi-modal sample and one which
just happens to have a few samples with the same value:

data = [1, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9]

Assuming the sampling is fair, 8 is clearly the mode; but is it bimodal
with 4 the second mode? Or perhaps even four modes, 8, 4, 7 and 9?

I have plans for introducing a binning function to collect data into
bins and run statistics on the bins. That might be a better way to deal
with multi-modal samples: if you bin the data (for discrete data, use a
bin size of 1) and then look at the frequencies, you can decide how many
modes there are.
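
As a rough illustration of the binning idea (the planned function is not
shown in this thread, so the name bin_counts and its width parameter are
assumptions), one could count samples per bin and then inspect the
frequencies by eye:

```python
from collections import Counter
import math


def bin_counts(data, width=1):
    """Map each bin's lower edge to the number of samples in that bin.

    Hypothetical helper for illustration; not an existing statistics API.
    """
    counts = Counter(math.floor(x / width) * width for x in data)
    return dict(sorted(counts.items()))


data = [1, 2, 3, 4, 4, 4, 5, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9]

# Bin size of 1 for discrete data: one bin per distinct value.
for edge, n in bin_counts(data).items():
    print(edge, "#" * n)
```

With a bin size of 1 the frequencies are 3 for the value 4, 2 each for 7
and 9, and 7 for the value 8; whether the smaller peaks count as modes is
still a judgement the user has to make.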

Thanks for the suggestion.

----------

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue28956>
_______________________________________

History
Date	User	Action	Args
2016-12-13 10:17:21	sria91	set	recipients: + sria91, steven.daprano, wolma
2016-12-13 10:17:21	sria91	link	issue28956 messages
2016-12-13 10:17:21	sria91	create