The heapq.merge() function merges multiple sorted iterables into a single sorted output. Internally, it builds a heap of the input iterators and repeatedly loops over it until all output has been generated.
If only a single iterable is passed to heapq.merge(), the internal heap holds just one entry, so the looping and heap manipulation degenerate into a convoluted "yield from iterables[0]". This costs performance for no benefit. The attached patch adds a short-circuit for this single-input case.
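The idea can be sketched as a wrapper (a minimal illustration, not the actual patch, which modifies heapq.merge() itself; the name merge_after mirrors the one used in the benchmarks below):

```python
import heapq

def merge_after(*iterables):
    """Sketch of the patched behaviour: short-circuit the single-input case."""
    if len(iterables) == 1:
        # A single input is already sorted; no heap bookkeeping is needed.
        yield from iterables[0]
        return
    # Multiple inputs: fall back to the regular heap-based merge.
    yield from heapq.merge(*iterables)
```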
The behaviour of the function remains unchanged:
>>> list(merge_before([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
[0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
>>> list(merge_after([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
[0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
For multiple inputs, there is no measurable performance difference (measured using IPython's %timeit). Without patch:
>>> %timeit list(merge_before([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.7 µs per loop
>>> %timeit list(merge_before([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.4 µs per loop
>>> %timeit list(merge_before([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.5 µs per loop
With patch:
>>> %timeit list(merge_after([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.7 µs per loop
>>> %timeit list(merge_after([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.6 µs per loop
>>> %timeit list(merge_after([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.6 µs per loop
>>> %timeit list(merge_after([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
100000 loops, best of 3: 13.8 µs per loop
The real performance gain, of course, comes when only a single iterable is passed. Without patch:
>>> %timeit for x in merge_before(range(10000)): pass
100 loops, best of 3: 2.65 ms per loop
>>> %timeit for x in merge_before(range(10000)): pass
100 loops, best of 3: 2.52 ms per loop
>>> %timeit for x in merge_before(range(10000)): pass
100 loops, best of 3: 2.51 ms per loop
With patch:
>>> %timeit for x in merge_after(range(10000)): pass
1000 loops, best of 3: 604 µs per loop
>>> %timeit for x in merge_after(range(10000)): pass
1000 loops, best of 3: 609 µs per loop
>>> %timeit for x in merge_after(range(10000)): pass
1000 loops, best of 3: 603 µs per loop
This is a ~4x speedup for an input consisting of 10000 items. Compared to plain iteration:
>>> %timeit for x in range(10000): pass
1000 loops, best of 3: 263 µs per loop
>>> %timeit for x in range(10000): pass
1000 loops, best of 3: 268 µs per loop
>>> %timeit for x in range(10000): pass
1000 loops, best of 3: 273 µs per loop
Without the patch, heapq.merge() is ~10x as slow as plain iteration. With the patch, this is reduced to ~2.5x.
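As a sanity check on the ratios quoted above, the median timings from the runs can be compared directly (values copied from the output above, in microseconds):

```python
# Median timings from the benchmark runs above, in microseconds.
merge_single_before = 2520   # heapq.merge(), one iterable, unpatched (2.52 ms)
merge_single_after = 604     # heapq.merge(), one iterable, patched
plain_iteration = 268        # bare for-loop over range(10000)

print(f"patch speedup:      {merge_single_before / merge_single_after:.1f}x")
print(f"unpatched vs plain: {merge_single_before / plain_iteration:.1f}x")
print(f"patched vs plain:   {merge_single_after / plain_iteration:.1f}x")
```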
(Note: all testing done using Python 3.3.2 on a 64-bit Linux machine.)