This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Simplify and speed-up heaqp.nlargest()
Type: performance Stage:
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: python-dev, rhettinger, tim.peters
Priority: low Keywords: patch

Created on 2014-05-04 05:57 by rhettinger, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
rid_nlargest.py rhettinger, 2014-05-04 05:57 Fold all nlargest() work into pure Python review
rip_nsmallest.diff rhettinger, 2014-05-11 10:23 Draft patch for nsmallest() review
Messages (3)
msg217859 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-05-04 05:57
Consolidate the logic for nlargest() into a single function.  Remove both the C and pure Python base underlying code. 

With all the logic in a single function, it only becomes necessary to create, store, and compare the data tuples when a need item is added to the heap.  This means that the rest of the comparisons (checking to see whether a new item needs to be added to the heap) can run faster and not need to create a (key, order, elem) tuple.

The change reduces the number of tuples created and the number of ordering integers created.

When rich comparisons were introduced, tuple ordering comparisons became twice as expensive (they are compared elementwise for equality and then there is an additional comparison call to order the first differing element).  Under the existing nlargest() code, we pay that price for every lement in the iterable.  In the new code, we pay that price only for the small subset of the iterable that actually gets added to the heap.

After this, another patch for simplifying nsmallest() is forthcoming.
msg218254 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-05-11 08:56
New changeset fadc06047314 by Raymond Hettinger in branch 'default':
Issue #21424:  Optimize heaqp.nlargest() to make fewer tuple comparisons.
http://hg.python.org/cpython/rev/fadc06047314
msg218300 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-05-11 21:21
New changeset 31950174f60f by Raymond Hettinger in branch 'default':
Issue 21424:  Apply the nlargest() optimizations to nsmallest() as well.
http://hg.python.org/cpython/rev/31950174f60f
History
Date User Action Args
2022-04-11 14:58:03adminsetgithub: 65623
2014-05-11 21:40:03rhettingersetstatus: open -> closed
resolution: fixed
2014-05-11 21:21:29python-devsetmessages: + msg218300
2014-05-11 10:23:48rhettingersetfiles: + rip_nsmallest.diff
keywords: + patch
2014-05-11 08:56:08python-devsetnosy: + python-dev
messages: + msg218254
2014-05-04 22:05:53pitrousetnosy: + tim.peters
2014-05-04 05:57:09rhettingercreate