classification
Title: Bisect optimization in heapq.nsmallest is never used
Type: performance Stage: needs patch
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Ramchandra Apte, francismb, haldean, jcea, python-dev, r.david.murray, rhettinger
Priority: low Keywords: patch

Created on 2012-10-01 11:39 by haldean, last changed 2013-03-05 07:21 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
heapq-use-bisect.20121001.patch haldean, 2012-10-01 11:39 review
Messages (9)
msg171706 - (view) Author: Will Haldean Brown (haldean) * Date: 2012-10-01 11:39
The implementation of nsmallest in heapq contains an optimization for when n is an order of magnitude less than the size of the data, which uses bisect to find the n-smallest elements. This optimization is guarded by a check to ensure that the data iterable has a length method.

This method is then decorated to add support for the key kwarg. The decorator creates a zip object and passes the zip object to the decorated nsmallest. As zip objects are generators, they do not have a __len__ attribute, and the bisect optimization is never used.

The attached patch file detects whether the data passed to the decorator has a length attribute, and if it does, it creates a list with the data before passing it to the decorated nsmallest. This is my first patch, so if I've done something wrong please let me know. Thanks!
msg171926 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-10-04 07:49
Just so you know, in CPython the bisect version of nsmallest() isn't  called at all.  Instead, the C implementation is used and it employs the heap based algorithm rather than the bisect algorithm.

Instead of mucking with code that call nsmallest(), I'll likely just update nsmallest() at some point to use the same algorithm as the code for nlargest().  That with bring the C code and pure Python code fully in-sync.
msg171946 - (view) Author: Ramchandra Apte (Ramchandra Apte) * Date: 2012-10-04 13:50
> This is my first patch, so if I've done something wrong please let me know.
The Python Development Guide helps you with all this and more: http://docs.python.org/devguide/
msg171952 - (view) Author: Will Haldean Brown (haldean) * Date: 2012-10-04 14:13
Thanks for the clarification Raymond!
msg171958 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-04 14:51
Raymond said he would make a code change relevant to this at some point, so we should probably leave this issue open until he does.  (Or if he doesn't want the issue open, we can let him close it).
msg172107 - (view) Author: Francis MB (francismb) * Date: 2012-10-05 18:29
I've to say that I also ran into that by following coverage numbers and as a newbie is not so easy to see what Raymond told. IMHO that information could be put as a comment.

Regards

francis
msg183421 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-04 07:21
New changeset 362826298fdb by Raymond Hettinger in branch 'default':
Issue #16098:  Update heapq.nsmallest to use the same algorithm as nlargest.
http://hg.python.org/cpython/rev/362826298fdb
msg183511 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-05 06:49
New changeset de6627cf81f4 by Raymond Hettinger in branch '3.3':
Issue #16098:  Update heapq.nsmallest to use the same algorithm as nlargest.
http://hg.python.org/cpython/rev/de6627cf81f4
msg183513 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-05 07:21
New changeset aa9ed98a49cb by Raymond Hettinger in branch '2.7':
Issue #16098:  Update heapq.nsmallest to use the same algorithm as nlargest.
http://hg.python.org/cpython/rev/aa9ed98a49cb
History
Date User Action Args
2013-03-05 07:21:07python-devsetmessages: + msg183513
2013-03-05 06:49:56python-devsetmessages: + msg183511
2013-03-04 07:23:17rhettingersetstatus: open -> closed
resolution: fixed
2013-03-04 07:21:20python-devsetnosy: + python-dev
messages: + msg183421
2012-10-06 04:10:48jceasetnosy: + jcea
2012-10-05 18:29:40francismbsetnosy: + francismb
messages: + msg172107
2012-10-04 14:51:15r.david.murraysetstatus: closed -> open

nosy: + r.david.murray
messages: + msg171958

resolution: not a bug -> (no value)
stage: patch review -> needs patch
2012-10-04 14:13:16haldeansetstatus: open -> closed
resolution: not a bug
messages: + msg171952
2012-10-04 13:50:25Ramchandra Aptesetnosy: + Ramchandra Apte
messages: + msg171946
2012-10-04 07:49:25rhettingersetmessages: + msg171926
2012-10-02 05:04:21rhettingersetmessages: - msg171773
2012-10-02 05:04:09rhettingersetnosy: - stutzbach
2012-10-02 04:44:43rhettingersetpriority: normal -> low

messages: + msg171773
2012-10-02 04:37:03rhettingersetassignee: rhettinger
2012-10-01 13:19:16pitrousetnosy: + rhettinger, stutzbach

stage: patch review
2012-10-01 11:39:26haldeancreate