Simplify and speed-up heaqp.nlargest() #65623

rhettinger · 2014-05-04T05:57:10Z

BPO	21424
Nosy	@tim-one, @rhettinger
Files	rid_nlargest.py: Fold all nlargest() work into pure Python rip_nsmallest.diff: Draft patch for nsmallest()

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2014-05-11.21:40:03.236>
created_at = <Date 2014-05-04.05:57:09.808>
labels = ['library', 'performance']
title = 'Simplify and speed-up heaqp.nlargest()'
updated_at = <Date 2014-05-11.21:40:03.235>
user = 'https://github.com/rhettinger'

bugs.python.org fields:

activity = <Date 2014-05-11.21:40:03.235>
actor = 'rhettinger'
assignee = 'none'
closed = True
closed_date = <Date 2014-05-11.21:40:03.236>
closer = 'rhettinger'
components = ['Library (Lib)']
creation = <Date 2014-05-04.05:57:09.808>
creator = 'rhettinger'
dependencies = []
files = ['35148', '35213']
hgrepos = []
issue_num = 21424
keywords = ['patch']
message_count = 3.0
messages = ['217859', '218254', '218300']
nosy_count = 3.0
nosy_names = ['tim.peters', 'rhettinger', 'python-dev']
pr_nums = []
priority = 'low'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue21424'
versions = ['Python 3.5']

rhettinger · 2014-05-04T05:57:06Z

Consolidate the logic for nlargest() into a single function. Remove both the C and pure Python base underlying code.

With all the logic in a single function, it only becomes necessary to create, store, and compare the data tuples when a need item is added to the heap. This means that the rest of the comparisons (checking to see whether a new item needs to be added to the heap) can run faster and not need to create a (key, order, elem) tuple.

The change reduces the number of tuples created and the number of ordering integers created.

When rich comparisons were introduced, tuple ordering comparisons became twice as expensive (they are compared elementwise for equality and then there is an additional comparison call to order the first differing element). Under the existing nlargest() code, we pay that price for every lement in the iterable. In the new code, we pay that price only for the small subset of the iterable that actually gets added to the heap.

After this, another patch for simplifying nsmallest() is forthcoming.

python-dev · 2014-05-11T08:56:08Z

New changeset fadc06047314 by Raymond Hettinger in branch 'default':
Issue bpo-21424: Optimize heaqp.nlargest() to make fewer tuple comparisons.
http://hg.python.org/cpython/rev/fadc06047314

python-dev · 2014-05-11T21:21:30Z

New changeset 31950174f60f by Raymond Hettinger in branch 'default':
bpo-21424: Apply the nlargest() optimizations to nsmallest() as well.
http://hg.python.org/cpython/rev/31950174f60f

rhettinger added stdlib Python modules in the Lib dir performance Performance or resource usage labels May 4, 2014

rhettinger closed this as completed May 11, 2014

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify and speed-up heaqp.nlargest() #65623

Simplify and speed-up heaqp.nlargest() #65623

rhettinger commented May 4, 2014

rhettinger commented May 4, 2014

python-dev mannequin commented May 11, 2014

python-dev mannequin commented May 11, 2014

Simplify and speed-up heaqp.nlargest() #65623

Simplify and speed-up heaqp.nlargest() #65623

Comments

rhettinger commented May 4, 2014

rhettinger commented May 4, 2014

python-dev mannequin commented May 11, 2014

python-dev mannequin commented May 11, 2014