Author: tim.peters
Date: 2018-09-03 19:08:12

Looks like all sorts of academics are exercised over the run-merging order now.  Here's a paper that's unhappy because timsort's strategy, and 2-merge too, aren't always near-optimal with respect to the entropy of the distribution of natural run lengths:

"Nearly-Optimal Mergesorts:  Fast, Practical Sorting Methods That Optimally Adapt to Existing Runs"
J. Ian Munro
Sebastian Wild
https://arxiv.org/abs/1805.04154

Their alternatives are worth looking at too.  Offhand, their "powersort" ordering looks more promising to me than their "peeksort".  It was very deliberate that timsort looks at whether to merge a run immediately after identifying one, for the "freshness in memory hierarchy" reason.  That peeksort does not is, I expect, more significant in real life than the "one little blemish" they call it.
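
For concreteness, here's a sketch of the rule at powersort's heart, as I read the paper.  Each boundary between adjacent runs gets a "power":  the position of the first bit at which the binary expansions of the two runs' midpoints, taken as fractions of the whole array, differ.  The function name and the bit-at-a-time formulation below are mine, just to make the arithmetic concrete:

    def merge_power(s1, n1, n2, n):
        # Power of the boundary between adjacent runs A (starting at index
        # s1, length n1) and B (immediately following, length n2) in an
        # array of total length n.  The midpoints are s1 + n1/2 and
        # s1 + n1 + n2/2; keep everything doubled to stay in integers.
        assert n1 > 0 and n2 > 0 and s1 + n1 + n2 <= n
        a = 2*s1 + n1        # 2 * midpoint of A
        b = a + n1 + n2      # 2 * midpoint of B
        power = 0
        while True:
            power += 1
            if a >= n:       # next bits of a/n and b/n are both 1
                a -= n
                b -= n
            elif b >= n:     # bits differ (0 for A, 1 for B):  found it
                return power
            a <<= 1
            b <<= 1

When a new run is found, the runs already on the stack whose boundary powers exceed the new boundary's power get merged first, and then the new run is pushed.  That's essentially the whole ordering policy.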

In any case, don't flip out over this.  On randomly ordered input, timsort already strongly tends toward perfectly balanced merges, and there's no significant improvement to be had over that.  On the other hand, on inputs with significant order that _can_ be exploited by this approach, the gains are far more due to "galloping" than to the order in which runs are merged.  All these papers ignore galloping entirely.  Example:  take range(1_000_000), cut it in half, and riffle shuffle it.  So you get

    0 500000 1 500001 ... 499999 999999

or

    500000 0 500001 1 ... 999999 499999

Either way, there are half a million natural runs, each of length 2.  Any balanced way of merging them is as good as it gets.  It's galloping alone that buys huge improvements in cases like this.
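
If you want to play with this, here's one way to build that input (the variable names are just for illustration):

    N = 1_000_000
    half = N // 2
    # Perfect riffle of the two halves of range(N):
    # 0 500000 1 500001 ... 499999 999999
    riffled = [x for pair in zip(range(half), range(half, N)) for x in pair]
    # Adjacent pairs (i, half + i) ascend, so the natural runs are exactly
    # half a million runs of length 2, as described above.
    riffled.sort()
    assert riffled == list(range(N))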

Especially in the context of Python, where object comparisons are (in general) far more expensive than moving pointers around, some excess in worst-case memory copying is, well, "one little blemish" ;-)

But still worth addressing if feasible.  Now that sorting often also adapts to avoid the relatively massive expense of PyObject_RichCompareBool, memory-movement costs become more important.  Approaches like 2-merge also give simpler merge_collapse() code that's easier to reason about, and reduce the worst-case stack size (which becomes very easy to compute in advance, instead of the current puzzle).
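
For reference, this is the decision logic merge_collapse() implements today, modeled in Python over run lengths alone (merging here just sums lengths, standing in for the real merge in listobject.c):

    def merge_collapse(runs):
        # `runs` holds the lengths of pending runs, newest last.  Repair the
        # listsort.txt invariants -- each pending run longer than the sum of
        # the two pushed after it, and longer than the one pushed after it --
        # including the extra check on the fourth run from the top that was
        # added when the stack-invariant bug was fixed.
        def merge_at(i):        # model merging runs[i] and runs[i+1]
            runs[i:i+2] = [runs[i] + runs[i+1]]

        while len(runs) > 1:
            n = len(runs) - 2
            if ((n > 0 and runs[n-1] <= runs[n] + runs[n+1]) or
                    (n > 1 and runs[n-2] <= runs[n-1] + runs[n])):
                if runs[n-1] < runs[n+1]:
                    n -= 1
                merge_at(n)
            elif runs[n] <= runs[n+1]:
                merge_at(n)
            else:
                break

For example, merge_collapse([10, 5, 6]) collapses all the way to [21]:  10 <= 5 + 6 forces a merge of the top two runs, and then 10 <= 11 forces the last one.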