Author rhettinger
Recipients Windson Yang, cheryl.sabella, francismb, rhettinger, steven.daprano
Date 2019-02-27.07:16:35
Message-id <1551251795.38.0.453148090195.issue35892@roundup.psfhosted.org>
In-reply-to
Content
> Are you happy guaranteeing that it will always be the first
> mode encountered?

Yes.  

All of the other implementations I looked at make some guarantee about which mode is returned.  Maple, Matlab, and Excel all return the first encountered.¹  That is convenient for us because it is what Counter(data).most_common(1) already does, and does cheaply (a single pass, with no auxiliary memory beyond the counts themselves).  It also matches what a number of our other tools do:

>>> max(3, 3.0)       # 3 is first encountered
3
>>> max(3.0, 3)       # 3.0 is first encountered
3.0
>>> list(dict.fromkeys('aabbaacc'))[0]  # 'a' is first encountered
'a'
>>> sorted([3, 3.0])[0]  # 3 is first encountered (due to sort stability)
3
>>> sorted([3.0, 3])[0]  # 3.0 is first encountered (due to sort stability)
3.0
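
To make the Counter behavior concrete, here is a minimal sketch. The helper name first_mode is hypothetical and is not the proposed statistics.mode() code. It assumes Python 3.7+, where Counter keeps insertion order and most_common(1) keeps the earliest of equally common items.

```python
from collections import Counter

def first_mode(data):
    """Return the most common value; ties go to the value seen first.

    Hypothetical helper for illustration only.
    """
    # Counter records keys in first-seen order, and most_common(1)
    # returns the earliest key among those with the highest count.
    pairs = Counter(iter(data)).most_common(1)
    if not pairs:
        raise ValueError("no mode for empty data")
    return pairs[0][0]

# 'red' and 'blue' both occur twice; 'red' was encountered first.
print(first_mode(["red", "blue", "blue", "red", "green"]))

# 3.0 and 3 compare equal, so they share one key: the first one seen.
print(first_mode([3.0, 3, 1, 1]))
```

The two calls print red and 3.0 respectively.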

¹ Scipy returned the smallest value rather than the first value, but its algorithm was sort-based in order to accommodate running a parallel mode() computation on multiple columns of an array. For us, that approach would be much slower, would require more memory, and would require more bookkeeping.
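
For contrast, a sort-based approach might look roughly like the sketch below. This is not Scipy's code; smallest_mode is a hypothetical name, and the version here only illustrates why sorting yields the smallest mode and costs an extra sorted copy plus O(n log n) time.

```python
from itertools import groupby

def smallest_mode(data):
    """Return the most common value; ties go to the smallest value.

    Hypothetical sort-based helper for illustration only.
    """
    best_value, best_count = None, 0
    # sorted() makes a full copy of the data; groupby then walks runs
    # of equal values in ascending order.
    for value, group in groupby(sorted(data)):
        count = sum(1 for _ in group)
        # Strict '>' keeps the earlier (smaller) value on ties.
        if count > best_count:
            best_value, best_count = value, count
    if best_count == 0:
        raise ValueError("no mode for empty data")
    return best_value

# 5 is encountered first, but 2 is the smaller of the two tied modes.
print(smallest_mode([5, 5, 2, 2, 9]))
```

This prints 2, whereas a first-encountered rule would return 5 for the same input.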

P.S. I'm no longer thinking this should be saved for the PyCon sprints.  They fall right at the beta 1 feature freeze.  We should aim to get this resolved well in advance of that.