This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author tim.peters
Recipients
Date 2002-07-31.06:02:48
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to
Content
Logged In: YES 
user_id=31435

~sort gets more mysterious all the time:  the mystery now 
is why it's *not* much slower everywhere!  Here are the 
exact # of compares ~sort does:

i        n     sort    msort    %ch    lg(n!)
--  ------  -------  -------  -----  --------
15   32768   130484   188720  44.63    444255
16   65536   260019   377634  45.23    954037
17  131072   555035   755476  36.11   2039137
18  262144  1107826  1511174  36.41   4340409
19  524288  2218562  3022584  36.24   9205096
20 1048576  4430616  6045418  36.45  19458756

The last column is the information-theoretic lower bound for 
sorting random arrays of this size (no comparison-based 
algorithm can do better than than on average), showing 
that sort() and msort() are both getting a lot of good out of 
the duplicates.  But sort()'s special case for equal 
elements is extremely effective on ~sort's specific data 
pattern, and msort just isn't going to get close to that (it 
does better than sort() on skewed distributions with lots of 
duplicates, though).

The only thing I can think of that could transform 
what "should be" highly significant slowdowns into highly 
significant speedups on some boxes are catastrophic 
cache effects in samplesort.  But knowing something about 
how both algorithms work <wink>, that's not screaming "oh, 
of course".
History
Date User Action Args
2007-08-23 15:14:17adminlinkissue587076 messages
2007-08-23 15:14:17admincreate