Message369115
It seems to me that the code sprawl mostly comes from the separate handling of the four keyed/unkeyed and forward/reverse cases, which as far as I can tell requires a branch in the innermost loop if not unrolled into separate cases. I think recursive_merge.py is about as concise as possible while still efficiently handling all four cases.
Is there any issue with recursion in this application? Even if there are 2**64-1 iterables, this would only mean 64 nested next() calls, which should be within the stack on most machines, no?
I did the following benchmark:
py -3.9 -m pyperf timeit -s "from [FILENAME] import merge; from collections import deque" "deque(merge(open('english.txt'), open('dutch.txt'), open('french.txt'), open('german.txt'), open('italian.txt')), maxlen=0)"
new_merge.py: Mean +- std dev: 773 ms +- 16 ms
tournament_heap.py: Mean +- std dev: 665 ms +- 42 ms
losers.py: Mean +- std dev: 470 ms +- 31 ms
Existing heapq: Mean +- std dev: 313 ms +- 2 ms
recursive_merge.py: Mean +- std dev: 266 ms +- 3 ms
I can look at some more diverse benchmarks (types, iterable length imbalance, lengths of an iterator's win-streak, etc.) soon. |
|
Date |
User |
Action |
Args |
2020-05-17 12:15:45 | Dennis Sweeney | set | recipients:
+ Dennis Sweeney, tim.peters, rhettinger, serhiy.storchaka, bbayles |
2020-05-17 12:15:45 | Dennis Sweeney | set | messageid: <1589717745.08.0.737191396269.issue38938@roundup.psfhosted.org> |
2020-05-17 12:15:45 | Dennis Sweeney | link | issue38938 messages |
2020-05-17 12:15:44 | Dennis Sweeney | create | |
|