This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author tim.peters
Recipients Stefan Pochmann, rhettinger, tim.peters
Date 2021-10-20.23:49:35
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1634773775.93.0.8165644534.issue45530@roundup.psfhosted.org>
In-reply-to
Content
It's rare that an optimization is a _pure_ win. Some cases win, others lose. There's almost never "a proof" of net gain ending with "QED".

Of course it's dead easy to construct examples where "many duplicates in the first tuple position" rules, and even easy to construct realistic examples.

But, as you already said in the SO report, in those cases it's a better idea to do multiple single-key sorts instead of a single multiple-keys-in-a-tuple sort. The base sorting algorithm can exploit duplicates in a single-key sort - indeed, the more duplicates there are, the more benefit it can get.

The sorting HOWTO could do a service by discussing this for CPython's sorting algorithm - it's not obvious, and can have major effects.

In the meantime, I'm only aiming to speed tuple keys for cases where they're well-suited to begin with: the first tuple elements _are_ mostly distinct. Giving up a significant win for the relative handful of people who know how to exploit the algorithm is not worth it, to me, to avoid making suboptimal uses even less optimal.

I'm more a pragmatist than a Utopian :-;

Extending the idea to positions beyond the first is dubious on those grounds: if the first tuple positions in fact often match, then the optimization has already backfired. Time to admit defeat then, not double down on failed arrogance ;-)

One idea I'm playing with: keep track of how often, during a tuple-key sort, a compare _is_ resolved by the first elements. Then adjust `unsafe_tuple_compare()` to switch to "the other" approach when "the current" approach isn't working out.
History
Date User Action Args
2021-10-20 23:49:35tim.peterssetrecipients: + tim.peters, rhettinger, Stefan Pochmann
2021-10-20 23:49:35tim.peterssetmessageid: <1634773775.93.0.8165644534.issue45530@roundup.psfhosted.org>
2021-10-20 23:49:35tim.peterslinkissue45530 messages
2021-10-20 23:49:35tim.peterscreate