Message91134
From http://docs.python.org/library/heapq.html:
> The latter two functions (nlargest and nsmallest) perform best for
> smaller values of n. For larger values, it is more efficient to use
> the sorted() function. Also, when n==1, it is more efficient to use
> the builtin min() and max() functions.
A quick usability win for the heapq module: Be more specific than "smaller" and
"larger", e.g. "when n is O(...(len(iterable)))".
From http://groups.google.com/group/comp.lang.python/msg/9556f56ae25ecf3b:
> I find it interesting that the heapq functions tell you in the
> documentation that they aren't suitable for use where n==1 or where
> n is near the total size of the sequence whereas random.sample()
> chooses what it thinks is the best algorithm based on the input. At
> the very least I would have thought the heapq functions could check
> for n==1 and call min/max when it is.
+1. It looks like the pure Python implementation of nsmallest is actually already
choosing either an insort algorithm or a minheap algorithm based on whether n * 10 <=
len(iterable), but the C implementation of nsmallest unconditionally uses a minheap
algorithm. Neither the pure Python nor the C implementation of nlargest chooses its
algorithm conditionally. For the sake of consistency and usability, all
implementations of nsmallest and nlargest should choose the most efficient algorithm
from among min/max, insort, and minheap, and the docs should be updated to describe
this behavior.
Also, in looking through the heapq.py that came with my Python 2.6.2 distribution, I
noticed that line 196 seems to have no reason to be there:
_heappushpop = heappushpop
This is the only occurrence of "_heappushpop" in the file.
I made a patch for heapq.py, but since I'm not a CPython hacker, I thought it might
be better to leave the changes up to someone who could do both the pure Python and
the C implementations in tandem. I'd be happy to contribute documentation when the
time comes if that would help. |
|
Date |
User |
Action |
Args |
2009-07-31 16:56:02 | jab | set | recipients:
+ jab |
2009-07-31 16:56:02 | jab | set | messageid: <1249059362.21.0.599959669928.issue6614@psf.upfronthosting.co.za> |
2009-07-31 16:56:00 | jab | link | issue6614 messages |
2009-07-31 16:55:59 | jab | create | |
|