
Author tim.peters
Date 2002-07-30.16:14:58
Content

In Kevin's company database (see Python-Dev), there were 
8 fields we could sort on.

Two fields were strongly correlated with the order of the 
records as given, and msort() was >6x faster on those.

Two fields were weakly correlated, and msort was a major 
win on those (>25% speedup).

One field had many duplicates, with a highly skewed 
distribution.  msort was better than 2x faster on that.

But the rest (phone#, #employees, address) were 
essentially randomly ordered, and msort was 
systematically a few percent slower on those.  That 
wouldn't have been remarkable, except that the percentage 
slowdown was a few times larger than the percentage by 
which msort did more comparisons than sort().

I eventually figured out the obvious:  the # of records 
wasn't an exact power of 2, and on random data msort then 
systematically arranged for the final merge to be between a 
run with a large power-of-2 size, and a run with the little bit 
left over.  That adds a bunch of compares over perfectly 
balanced merges, plus O(N) pointer copies, just to get that 
little bit in place.
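
For illustration, here is a simplified model of that effect (not 
the actual run-stack merging; the fixed chunk size of 32 is just 
an assumption for the demo).  On random data the natural runs are 
about the minimum run length, so merging behaves much like 
bottom-up pairwise merging of fixed-size chunks:

    def final_merge_sizes(n, chunk=32):
        """Sizes of the two runs in the final merge when n elements
        are split into fixed-size chunks and merged pairwise."""
        sizes = [chunk] * (n // chunk)
        if n % chunk:
            sizes.append(n % chunk)      # the little bit left over
        while len(sizes) > 2:
            sizes = [sum(sizes[i:i + 2])
                     for i in range(0, len(sizes), 2)]
        return tuple(sizes)

    print(final_merge_sizes(2 ** 16))        # (32768, 32768): balanced
    print(final_merge_sizes(2 ** 16 + 100))  # (65536, 100): lopsided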

The new merge4.patch repairs that about as well as (I think) is 
possible without heroics, by quickly picking a minimum run length 
in advance that should almost never lead to a "bad" final merge 
when the data is randomly ordered.
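
A sketch of the idea (the exact constants here are illustrative, 
not necessarily what the patch uses):  pick a minimum run length 
so that N/minrun is an exact power of 2, or a little less than 
one, which keeps the final merge close to balanced:

    def compute_minrun(n):
        """Pick a minimum run length so that n / minrun is an exact
        power of 2, or slightly less than one.  For small n the
        whole array is a single run and no merging happens."""
        r = 0                  # set to 1 if any bit is shifted off
        while n >= 64:
            r |= n & 1
            n >>= 1
        return n + r

    print(compute_minrun(2 ** 16))        # 32; 65536/32 == 2**11 exactly
    print(compute_minrun(2 ** 16 + 100))  # 33; ~1989 runs, just under 2**11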

In each of Kevin's 3 "problem sorts", msort() does fewer 
compares than sort() now, and the runtime is generally 
within a fraction of a percent.  These all-in-cache cases still 
seem to favor sort(), though, and it appears to be because 
msort() does a lot more data movement (note that 
quicksorts do no more than one swap per compare, and 
often none, while mergesorts do a copy on every 
compare).  The other 5 major-to-killer wins msort got on 
this data remain intact.

The code changes needed were tiny, but the doc file 
changed a lot more.

Note that this change has no effect on arrays with power-of-
2 sizes, so sortperf.py timings shouldn't change (and don't 
on my box).  The code change is solely to compute a good 
minimum run length before the main loop begins, and it 
happens to return the same value as was hard-coded 
before when the array has a power-of-2 size.

More testing on real data would be most welcome!  Kevin's 
data was very helpful to me.