msg150927 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 10:35 |
Hi,
The attached patch adds a 'key' optional parameter to the heapq.merge function that behaves as in sorted().
Related discussion: http://mail.python.org/pipermail/python-ideas/2012-January/013295.html
This is my first contribution to CPython.
|
msg150928 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 10:43 |
The attached script benchmarks the basline (current implementation) against 3 new implementations, as suggested on http://mail.python.org/pipermail/python-ideas/2012-January/013296.html
On my machine, the output is:
merge_baseline
per run, min of 3 = 7.527 ms
merge_1
per run, min of 3 = 9.894 ms
131.449 % of baseline
merge_2
per run, min of 3 = 7.948 ms
105.594 % of baseline
merge_3
per run, min of 3 = 7.581 ms
100.716 % of baseline
On this particular input, merge_2 adds 6% of overhead when the key parameter is not used. While merge_3 only adds 1% of overhead, it almost doubles the amount of code. (Which was admittedly not that long to begin with.)
The patch in the previous message is with the merge_2 implementation, which seemed like the best compromise to me.
|
msg150931 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 11:10 |
Oops, the patch to the documentation would also need 'New in 3.3: the key parameter', with the right Sphinx directive. But that depends on whether this change ends up in 3.3 or 3.4.
Does 3.3 still get new features?
|
msg150954 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-01-09 16:53 |
Yes, 3.3 is still in the early development stage, and new features will be accepted until the first beta (in June, see PEP 398). “.. versionadded:: 3.3 The *key* parameter” will do.
|
msg150969 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-01-09 19:33 |
Simon, please keep the original version fast by creating two code paths:
if key is None:
original_code
else:
new_code using the key_function
|
msg150983 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 22:13 |
Raymond, please have a look at merge_3 in benchmark_heapq_merge.py. It is implemented as you say.
Do you think the speed is worth the code duplication?
|
msg151369 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-16 14:58 |
heapq_merge_key_duplicate.patch is a new patch with two code path. It also updates the function’s docstring (which the previous patch did not). Raymond, do you think the speed is worth the DRY violation?
|
msg152802 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-02-07 04:11 |
I'll look at this in the next couple of weeks. Hang tight :-)
|
msg152984 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2012-02-09 20:39 |
FWIW, Guido approves of the idea, msg152969 in #4356
|
msg168070 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-08-13 07:51 |
I just remembered about this. I suppose it is too late for 3.3?
|
msg168116 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-08-13 15:20 |
Yes, 3.3 is already in beta.
|
msg185259 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2013-03-26 06:54 |
Attaching a rough draft implementation for a fully encapsulated Heap() class that is thread-safe, supports minheaps and maxheaps, and efficiently implements key-functions (called no more than once per key).
|
msg188066 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 12:14 |
heap2.diff contains only a single line's change. Wrong file attached?
|
msg188067 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 12:14 |
Ah, I see the new file now (I'd failed to refresh my browser); sorry for the noise.
|
msg188080 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 17:51 |
Looks pretty good to me.
- There's a bonus print call in the diff.
- Should the "len(self._data)" call be protected by the lock? I can't immediately think of any reason why that would be necessary (e.g., pushpop nd poppush never change the size of self._data, so there's no risk of getting a bogus length there), but the lack of the lock makes me nervous.
- Support for iter() seems a bit out of place to me. What are the use-cases for this? Would it make sense to leave this out (for now)?
|
msg188085 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-04-29 18:55 |
There is already one heap class in the stdlib: queue.PriorityQueue. Why create a duplicate instead extend queue.PriorityQueue with desired features?
May be name the maxheap parameter as reverse?
|
msg219380 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-05-30 09:28 |
New changeset f5521f5dec4a by Raymond Hettinger in branch 'default':
Issue #13742: Add key and reverse parameters to heapq.merge()
http://hg.python.org/cpython/rev/f5521f5dec4a
|
msg233792 - (view) |
Author: Tommy Carstensen (Tommy.Carstensen) |
Date: 2015-01-10 01:19 |
I noticed 3.5 alpha1 is not released until February 1st. Is there any way I can get my hands on this new functionality?
|
msg233793 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2015-01-10 01:24 |
Hi Tommy, the patch is already committed to Python 3.5. See https://docs.python.org/3.5/library/heapq.html#heapq.merge
|
msg233795 - (view) |
Author: Tommy Carstensen (Tommy.Carstensen) |
Date: 2015-01-10 01:55 |
Yes, but 3.5 has not been pre-released yet.
|
msg233804 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2015-01-10 08:47 |
You can set up mecurial on your machine, make a read-only clone of the cpython repository, and compile it just as do other people, whether core-developers or otherwise. See docs.python.org/devguide for details.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:25 | admin | set | github: 57951 |
2015-01-10 08:47:47 | terry.reedy | set | messages:
+ msg233804 |
2015-01-10 01:55:18 | Tommy.Carstensen | set | messages:
+ msg233795 |
2015-01-10 01:24:07 | berker.peksag | set | nosy:
+ berker.peksag
messages:
+ msg233793 stage: patch review -> resolved |
2015-01-10 01:19:39 | Tommy.Carstensen | set | nosy:
+ Tommy.Carstensen messages:
+ msg233792
|
2014-05-30 09:29:28 | rhettinger | set | status: open -> closed resolution: fixed |
2014-05-30 09:28:45 | python-dev | set | nosy:
+ python-dev messages:
+ msg219380
|
2014-05-29 08:42:18 | rhettinger | set | files:
+ keymerge2.diff |
2014-05-29 08:42:00 | rhettinger | set | files:
- keymerge2.diff |
2014-05-29 08:32:46 | rhettinger | set | files:
+ keymerge2.diff |
2014-05-27 01:31:45 | rhettinger | set | files:
+ keymerge.diff |
2014-05-27 01:30:59 | rhettinger | set | versions:
+ Python 3.5, - Python 3.4 |
2014-05-12 17:57:43 | eric.snow | set | nosy:
+ eric.snow
|
2013-04-29 18:55:40 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg188085
|
2013-04-29 17:51:35 | mark.dickinson | set | messages:
+ msg188080 |
2013-04-29 12:14:41 | mark.dickinson | set | messages:
+ msg188067 |
2013-04-29 12:14:03 | mark.dickinson | set | nosy:
+ mark.dickinson messages:
+ msg188066
|
2013-04-29 11:43:52 | rhettinger | set | files:
- heap2.diff |
2013-04-29 11:43:35 | rhettinger | set | files:
+ heap2.diff |
2013-04-29 10:54:52 | rhettinger | set | priority: low -> normal files:
+ heap2.diff |
2013-03-26 06:54:31 | rhettinger | set | files:
+ heap.diff
messages:
+ msg185259 |
2012-08-13 15:20:10 | eric.araujo | set | keywords:
+ needs review
stage: patch review messages:
+ msg168116 versions:
+ Python 3.4, - Python 3.3 |
2012-08-13 07:51:59 | ssapin | set | messages:
+ msg168070 |
2012-02-09 20:39:38 | terry.reedy | set | nosy:
+ terry.reedy messages:
+ msg152984
|
2012-02-07 08:44:03 | giampaolo.rodola | set | nosy:
+ giampaolo.rodola
|
2012-02-07 04:11:24 | rhettinger | set | messages:
+ msg152802 |
2012-01-16 14:58:54 | ssapin | set | files:
+ heapq_merge_key_duplicate.patch
messages:
+ msg151369 |
2012-01-09 22:13:45 | ssapin | set | messages:
+ msg150983 |
2012-01-09 19:33:36 | rhettinger | set | priority: normal -> low
messages:
+ msg150969 |
2012-01-09 18:44:52 | stutzbach | set | nosy:
+ stutzbach
|
2012-01-09 16:53:30 | eric.araujo | set | nosy:
+ eric.araujo
messages:
+ msg150954 versions:
- Python 3.4 |
2012-01-09 11:10:14 | ssapin | set | messages:
+ msg150931 |
2012-01-09 10:51:00 | eric.smith | set | assignee: rhettinger |
2012-01-09 10:43:36 | ssapin | set | files:
+ benchmark_heapq_merge.py
messages:
+ msg150928 |
2012-01-09 10:35:41 | ssapin | create | |