Author rhettinger
Recipients Dennis Sweeney, bbayles, rhettinger, serhiy.storchaka, tim.peters
Date 2020-05-16.00:06:59
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1589587620.22.0.435512733113.issue38938@roundup.psfhosted.org>
In-reply-to
Content
The nested generators approach looks promising. I hope you keep working on that :-)

I'm not too keen on PR 18427 because it is a massive explosion in code volume and complexity.

With some continued effort, I expect we'll get to something much simpler and more maintainable.  The existing code already performs well for typical applications, so there is no need to "go gonzo" and throw a massive wall of code at the problem.  The ensuing maintenance costs would be too high.

To save time and effort, please hold-off on a C implementation until we've agreed to a pure python version of the algorithm.

For timings, try using strings, tuples, or dataclass instances.  Timing integer inputs isn't representative of how merge() is used and it tends to exaggerate improvements.  For interesting merges, the dominant cost is the comparison step.

By switching to a binary tree arrangement, the payoff we're aiming for is to avoid the double comparison in the tuple inequality logic — "('a',1)<('b',2)" tests both 'a'=='b' and 'a'<='b'.  I suggest aiming for the simplest possible code that avoids the double test.
History
Date User Action Args
2020-05-16 00:07:00rhettingersetrecipients: + rhettinger, tim.peters, serhiy.storchaka, bbayles, Dennis Sweeney
2020-05-16 00:07:00rhettingersetmessageid: <1589587620.22.0.435512733113.issue38938@roundup.psfhosted.org>
2020-05-16 00:07:00rhettingerlinkissue38938 messages
2020-05-16 00:06:59rhettingercreate