Message40670
In Kevin's company database (see Python-Dev), there were
8 fields we could sort on.
Two fields were strongly correlated with the order of the
records as given, and msort() was >6x faster on those.
Two fields were weakly correlated, and msort was a major
win on those (>25% speedup).
One field had many duplicates, with a highly skewed
distribution. msort was better than 2x faster on that.
But the rest (phone#, #employees, address) were
essentially randomly ordered, and msort was
systematically a few percent slower on those. That
wouldn't have been remarkable, except that the percentage
slowdown was a few times larger than the percentage by
which msort did more comparisons than sort().
I eventually figured out the obvious: the # of records
wasn't an exact power of 2, and on random data msort then
systematically arranged for the final merge to be between a
run with a large power-of-2 size, and a run with the little bit
left over. That adds a bunch of compares over perfectly
balanced merges, plus O(N) pointer copies, just to get that
little bit in place.
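The effect can be illustrated with a toy model of the merge pattern (binary-counter pairwise merging of fixed-size runs; the run length of 32 and the model itself are simplifying assumptions for illustration, not the actual sort code):

```python
def final_merge_sizes(n, minrun=32):
    """Toy model: cut n elements into runs of `minrun`, then merge
    binary-counter style (merge whenever the top run is at least as
    big as the one below it).  Returns the sizes of the runs left
    for the final merge."""
    stack = []
    while n > 0:
        run = min(minrun, n)
        n -= run
        stack.append(run)
        # collapse equal-or-larger neighbors, like a binary counter
        while len(stack) >= 2 and stack[-1] >= stack[-2]:
            b = stack.pop()
            stack[-1] += b
    # force whatever remains down to the two runs of the final merge
    while len(stack) > 2:
        b = stack.pop()
        stack[-1] += b
    return stack
```

For n = 2100 this yields [2048, 52]: a big power-of-2 run merged against the little bit left over, which is exactly the lopsided final merge described above. For n = 2048 the runs fold into the single run [2048] with no lopsided step.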
The new merge4.patch repairs that about as well as is (I think)
possible without heroics: it quickly picks a minimum run length in
advance that should almost never lead to a "bad" final
merge when the data is randomly ordered.
In each of Kevin's 3 "problem sorts", msort() does fewer
compares than sort() now, and the runtime is generally
within a fraction of a percent. These all-in-cache cases still
seem to favor sort(), though, and it appears to be because
msort() does a lot more data movement (note that
quicksorts do no more than one swap per compare, and
often none, while mergesorts do a copy on every
compare). The other 5 major-to-killer wins msort got on
this data remain intact.
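The data-movement point can be made concrete by instrumenting a plain two-way merge (a generic sketch, not msort itself): every comparison in the main loop is followed by copying the winner out, so copies always at least match compares.

```python
def merge_count(a, b):
    """Merge two sorted lists, counting comparisons and element
    copies.  Illustrates that a mergesort moves an element on
    every compare (plus the leftover tail)."""
    out, i, j, compares, copies = [], 0, 0, 0, 0
    while i < len(a) and j < len(b):
        compares += 1
        if b[j] < a[i]:
            out.append(b[j]); j += 1
        else:
            out.append(a[i]); i += 1
        copies += 1            # one copy per compare, unconditionally
    tail = a[i:] + b[j:]       # whichever run is left over
    out.extend(tail)
    copies += len(tail)
    return out, compares, copies
```

Merging [1, 3, 5] and [2, 4, 6], for example, takes 5 compares and 6 copies, while a quicksort partition over the same 6 elements does at most one swap per compare, and often none.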
The code changes needed were tiny, but the doc file
changed a lot more.
Note that this change has no effect on arrays with power-of-
2 sizes, so sortperf.py timings shouldn't change (and don't
on my box). The code change is solely to compute a good
minimum run length before the main loop begins, and it
happens to return the same value as was hard-coded
before when the array has a power-of-2 size.
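A sketch of such a minimum-run computation (this is the scheme that ended up in CPython's listobject.c and is documented in listsort.txt; treating it as exactly what merge4.patch does is an assumption): keep the six most significant bits of n, adding 1 if any of the bits shifted off were set, so that n divided by the result is a power of 2 or just under one.

```python
def merge_compute_minrun(n):
    """Pick a minimum run length for an n-element sort: the six most
    significant bits of n, plus 1 if any of the lower bits are set.
    Then n/minrun is a power of 2, or slightly under one, so the
    final merge is close to balanced."""
    r = 0              # becomes 1 if any bit shifted out was set
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r
```

For a power-of-2 size like 2048 this returns 32; for 2100 it returns 33, giving 64 nearly equal runs instead of one giant power-of-2 run plus a small leftover.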
More testing on real data would be most welcome! Kevin's
data was very helpful to me.