classification
Title: Performance optimization for min() and max() over lists
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: ajaksu2, akuchling, hniksic, jcea, kristjan.jonsson, mark.dickinson, pitrou, rhettinger
Priority: low Keywords: needs review, patch

Created on 2008-10-22 15:18 by kristjan.jonsson, last changed 2010-02-22 17:30 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
minmax.patch kristjan.jonsson, 2008-10-22 15:18 Patch implementing min() max() optimization
Messages (12)
msg75082 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2008-10-22 15:18
This adds a special case for min() and max() when iterating over lists.
For simple lists of floats, the improvement is some 15% on a windows 
machine using release build (non pgo)
msg75099 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-10-22 18:53
I haven't tried the patch as is but I can spot two problems:
- you should use PyList_CheckExact instead of PyList_Check, because a
list subclass could override __getitem__
- when keyfunc is not NULL, you can't assume that the list size will
stay constant; indeed, calling keyfunc may mutate the list (try
something like `max(l, key=l.pop)`)

I've got no opinion on whether the speedup is worth the added
complexity. Perhaps a way of simplifying the patch would be to enable
the special path only when keyfunc==NULL. Others may comment.
msg75100 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-10-22 18:58
Not that excited about adding this much code for such a small speedup.

Also, the list can change size during iteration so the for-loop needs to
be changed to:

     for(i = 1; i<PyList_GET_SIZE(v); i++)
msg75101 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-10-22 19:02
Antoine, the list can mutate even when the keyfunc is NULL.  The rich
comparison can callback to arbitrary Python code.
msg75102 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-10-22 19:03
> Antoine, the list can mutate even when the keyfunc is NULL.  The rich
> comparison can callback to arbitrary Python code.

Ouch, you are right.
msg75139 - (view) Author: Hrvoje Nikšić (hniksic) * Date: 2008-10-23 11:12
Note that the item retrieved by PyList_GET_ITEM must be increffed before
being passed to the function.  Otherwise mutating the list can remove
the item from the list and destroy the underlying object, in which case
the current maxitem can refer to a freed object.  This pitfall is
described in http://docs.python.org/extending/extending.html#thin-ice in
some detail.
msg87849 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-05-16 01:20
Given the drawbacks mentioned (and the fact that the current patch would
break when the list mutates under its feet), is this still valid?
msg88021 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2009-05-18 09:24
Perhaps not.
I had however written a general list locking system, generalizing the 
way sort() locks a list for direct manipulation, and rewritten min and 
max to be able to use that.  Perhaps that approach would be of interest?
msg88107 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-05-20 08:51
I am with Raymond here: I don't think the performance improvement would
be worth a significant complification of the code - unless the
improvement is /very/ large.
msg99789 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-02-22 17:16
Should this patch just be rejected, then?  Or is the more general locking suggested in msg88021 of interest?
msg99792 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-02-22 17:23
I think the patch should just be rejected. Workloads where min() / max() performance is a bottleneck have to be very rare.
msg99793 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-02-22 17:25
I agree that the performance improvement isn't worth the extra code, or the risk of introducing bugs (the comments so far show that it's not trivial to get this right).

Closing as rejected.
History
Date User Action Args
2010-02-22 17:30:55mark.dickinsonsetkeywords: patch, patch, needs review
stage: test needed -> resolved
versions: + Python 3.2, - Python 3.1
2010-02-22 17:25:53mark.dickinsonsetstatus: open -> closed

nosy: + mark.dickinson
messages: + msg99793

keywords: patch, patch, needs review
resolution: rejected
2010-02-22 17:23:59pitrousetkeywords: patch, patch, needs review

messages: + msg99792
2010-02-22 17:16:15akuchlingsetkeywords: patch, patch, needs review
nosy: + akuchling
messages: + msg99789

2009-05-20 08:51:59pitrousetkeywords: patch, patch, needs review

messages: + msg88107
2009-05-18 09:24:42kristjan.jonssonsetkeywords: patch, patch, needs review

messages: + msg88021
2009-05-16 01:20:23ajaksu2setnosy: + ajaksu2
messages: + msg87849

keywords: patch, patch, needs review
stage: test needed
2009-01-29 12:57:20jceasetkeywords: patch, patch, needs review
nosy: + jcea
2008-10-23 11:12:25hniksicsetnosy: + hniksic
messages: + msg75139
2008-10-22 19:03:34pitrousetmessages: + msg75102
2008-10-22 19:02:39rhettingersetkeywords: patch, patch, needs review
messages: + msg75101
2008-10-22 18:58:29rhettingersetpriority: low
keywords: patch, patch, needs review
messages: + msg75100
2008-10-22 18:53:07pitrousetkeywords: patch, patch, needs review
nosy: + rhettinger, pitrou
messages: + msg75099
versions: + Python 3.1, Python 2.7, - Python 2.5.3
2008-10-22 15:18:08kristjan.jonssoncreate