Author tim.peters
Recipients steven.daprano, terry.reedy, thomasahle, tim.peters, vajrasky
Date 2014-05-31.02:53:08
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1401504789.72.0.713917805361.issue21592@psf.upfronthosting.co.za>
In-reply-to
Content
I suggest this needs a measurable goal, like "minimize expected-case time" or "minimize worst-case time".  "Use a state of the art implementation" isn't a measurable goal - it's an abstract wish with no measurable merit on its own ;-)

Note that I wrote the "median-of-k" code referenced in the first post here (it was in reply to David Eppstein).  Even then it was hard to beat sort() + index.

It's harder now, and for an important reason:  Python's _current_ sort can exploit many kinds of partial order, and can often complete in linear time.

This makes picking "typical" input for timing crucial.  If you want to skew it to put sort() + index at maximum disadvantage, use only shuffled (random order) pairwise-unequal inputs.  But most streams of data do have some kinds of partial order (which is why the newer sort() implementation has been so successful), and "typical" is a much tougher thing to capture than "shuffled".
History
Date User Action Args
2014-05-31 02:53:09tim.peterssetrecipients: + tim.peters, terry.reedy, steven.daprano, thomasahle, vajrasky
2014-05-31 02:53:09tim.peterssetmessageid: <1401504789.72.0.713917805361.issue21592@psf.upfronthosting.co.za>
2014-05-31 02:53:09tim.peterslinkissue21592 messages
2014-05-31 02:53:08tim.peterscreate