Message 369012 - Python tracker

Author	Dennis Sweeney
Recipients	Dennis Sweeney, bbayles, rhettinger, serhiy.storchaka, tim.peters
Date	2020-05-16.05:36:08
Message-id	<1589607369.26.0.0640500047821.issue38938@roundup.psfhosted.org>
In-reply-to

Content
The attached recursive_merge.py should be much less ugly and still somewhat performant.

It should be the same algorithm as that PR, just written recursively rather than iteratively.
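For readers without the attachment, here is a minimal sketch of what a recursive k-way merge can look like. It is not the contents of recursive_merge.py and may differ from the PR in its details. The helper name merge2 is hypothetical, and the sketch omits the key= and reverse= arguments that heapq.merge supports.

```python
def merge2(a, b):
    """Merge two sorted iterables lazily; on ties, items from `a` come first."""
    a, b = iter(a), iter(b)
    done = object()  # sentinel marking an exhausted iterator
    x = next(a, done)
    if x is done:
        yield from b
        return
    y = next(b, done)
    if y is done:
        yield x
        yield from a
        return
    while True:
        if y < x:
            yield y
            y = next(b, done)
            if y is done:
                # b is exhausted: flush the pending item and the rest of a
                yield x
                yield from a
                return
        else:
            yield x
            x = next(a, done)
            if x is done:
                # a is exhausted: flush the pending item and the rest of b
                yield y
                yield from b
                return


def merge(*iterables):
    """Recursively split the inputs in half and merge the two halves."""
    n = len(iterables)
    if n == 0:
        return iter(())
    if n == 1:
        return iter(iterables[0])
    mid = n // 2
    return merge2(merge(*iterables[:mid]), merge(*iterables[mid:]))
```

Each item passes through roughly log2(k) two-way merges for k inputs, and ties resolve in favor of the earlier iterable, as they do in heapq.merge.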

I got some text files from http://www.gwicks.net/dictionaries.htm and tried merging them line-by-line:

py -3.9 -m pyperf timeit -s "from heapq import merge; from collections import deque" "deque(merge(open('english.txt'), open('dutch.txt'), open('french.txt'), open('german.txt'), open('italian.txt')), maxlen=0)"

    Mean +- std dev: 391 ms +- 9 ms

py -3.9 -m pyperf timeit -s "from recursive_merge import merge; from collections import deque" "deque(merge(open('english.txt'), open('dutch.txt'), open('french.txt'), open('german.txt'), open('italian.txt')), maxlen=0)"

    Mean +- std dev: 262 ms +- 9 ms

Perhaps that's a more real-world benchmark.
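If pyperf is not installed, the same measurement can be roughly approximated with the standard-library timeit module. This is only a sketch: the helper names drain_merge and time_merge are hypothetical, the generated files are stand-ins for the dictionary files above, and timeit gives less careful statistics than pyperf, so the numbers are not comparable to the ones reported here.

```python
import os
import tempfile
import timeit
from collections import deque
from heapq import merge as heapq_merge


def drain_merge(paths, merge_func=heapq_merge):
    """Merge the given text files line by line and discard the output."""
    files = [open(p, encoding="utf-8") for p in paths]
    try:
        # deque(..., maxlen=0) consumes the iterator without storing anything.
        deque(merge_func(*files), maxlen=0)
    finally:
        for f in files:
            f.close()


def time_merge(paths, merge_func=heapq_merge, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: drain_merge(paths, merge_func))
    return min(timer.repeat(repeat=repeat, number=1))


if __name__ == "__main__":
    # Hypothetical stand-in data: three small sorted files of numbered lines.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"words{i}.txt")
            with open(path, "w", encoding="utf-8") as f:
                f.writelines(f"{n:06d}\n" for n in range(i, 30000, 3))
            paths.append(path)
        print(f"heapq.merge: {time_merge(paths) * 1e3:.1f} ms")
```

Passing a different merge_func (for example, merge imported from recursive_merge) times the alternative implementation on the same files.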
