
Author tim.peters
Date 2002-07-30.16:14:58
Content

In Kevin's company database (see Python-Dev), there were 
8 fields we could sort on.

Two fields were strongly correlated with the order of the 
records as given, and msort() was >6x faster on those.

Two fields were weakly correlated, and msort was a major 
win on those (>25% speedup).

One field had many duplicates, with a highly skewed 
distribution.  msort was better than 2x faster on that.

But the rest (phone#, #employees, address) were 
essentially randomly ordered, and msort was 
systematically a few percent slower on those.  That 
wouldn't have been remarkable, except that the percentage 
slowdown was a few times larger than the percentage by 
which msort did more comparisons than sort().

I eventually figured out the obvious:  the # of records 
wasn't an exact power of 2, and on random data msort then 
systematically arranged for the final merge to be between a 
run with a large power-of-2 size, and a run with the little bit 
left over.  That adds a bunch of compares over perfectly 
balanced merges, plus O(N) pointer copies, just to get that 
little bit in place.
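
For illustration, here is a simplified model of that effect (not 
the actual run-stack merging; the fixed chunk size of 32 is just 
an assumption for the demo).  On random data the natural runs are 
about the minimum run length, so merging behaves much like 
bottom-up pairwise merging of fixed-size chunks:

    def final_merge_sizes(n, chunk=32):
        """Sizes of the two runs in the final merge when n elements
        are split into fixed-size chunks and merged pairwise."""
        sizes = [chunk] * (n // chunk)
        if n % chunk:
            sizes.append(n % chunk)      # the little bit left over
        while len(sizes) > 2:
            sizes = [sum(sizes[i:i + 2])
                     for i in range(0, len(sizes), 2)]
        return tuple(sizes)

    print(final_merge_sizes(2 ** 16))        # (32768, 32768): balanced
    print(final_merge_sizes(2 ** 16 + 100))  # (65536, 100): lopsided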

The new merge4.patch repairs that about as well as (I think) is 
possible without heroics, by quickly picking a minimum run length 
in advance that should almost never lead to a "bad" final merge 
when the data is randomly ordered.
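
A sketch of the idea (the exact constants here are illustrative, 
not necessarily what the patch uses):  pick a minimum run length 
so that N/minrun is an exact power of 2, or a little less than 
one, which keeps the final merge close to balanced:

    def compute_minrun(n):
        """Pick a minimum run length so that n / minrun is an exact
        power of 2, or slightly less than one.  For small n the
        whole array is a single run and no merging happens."""
        r = 0                  # set to 1 if any bit is shifted off
        while n >= 64:
            r |= n & 1
            n >>= 1
        return n + r

    print(compute_minrun(2 ** 16))        # 32; 65536/32 == 2**11 exactly
    print(compute_minrun(2 ** 16 + 100))  # 33; ~1989 runs, just under 2**11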

In each of Kevin's 3 "problem sorts", msort() does fewer 
compares than sort() now, and the runtime is generally 
within a fraction of a percent.  These all-in-cache cases still 
seem to favor sort(), though, and it appears to be because 
msort() does a lot more data movement (note that 
quicksorts do no more than one swap per compare, and 
often none, while mergesorts do a copy on every 
compare).  The other 5 major-to-killer wins msort got on 
this data remain intact.

The code changes needed were tiny, but the doc file 
changed a lot more.

Note that this change has no effect on arrays with power-of-
2 sizes, so sortperf.py timings shouldn't change (and don't 
on my box).  The code change is solely to compute a good 
minimum run length before the main loop begins, and it 
happens to return the same value as was hard-coded 
before when the array has a power-of-2 size.

More testing on real data would be most welcome!  Kevin's 
data was very helpful to me.