classification
Title: Document 'stability' of builtin min() and max()
Type: enhancement Stage:
Components: None Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Stephen.Evans, docs@python, jyasskin, mark.dickinson, mattheww, pitrou, rhettinger
Priority: normal Keywords: patch

Created on 2010-09-08 19:30 by mattheww, last changed 2011-03-04 12:32 by Stephen.Evans. This issue is now closed.

Files
File name Uploaded Description Edit
maxmin.patch mattheww, 2010-09-08 19:30 patch for max() and min() docs in functions.rst
functions.rst.patch mattheww, 2010-09-12 12:44 updated patch for max() and min() docs in functions.rst
min_max.py Stephen.Evans, 2011-03-04 12:32 builtin max() being the converse of builtin min()
Messages (8)
msg115892 - (view) Author: Matthew Woodcraft (mattheww) Date: 2010-09-08 19:30
In CPython, the builtin max() and min() have the property that if there are items with equal keys, the first item is returned. From a quick look at their source, I think this is true for Jython and IronPython too.

I propose making this a documented guarantee.

On Python-dev, Raymond Hettinger said:
<<
That seems like a reasonable request. This behavior has been around for a very long time is unlikely to change. Elsewhere, we've made efforts to document sort stability (i.e. sorted(), heapq.nlargest(), heapq.nsmallest, merge(), etc).
>>
(<http://mail.python.org/pipermail/python-dev/2010-September/103543.html>)

I'm attaching a patch with a concrete suggestion for a change to
functions.rst, modelled on the documentation of heapq.nlargest().
msg115898 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-09-08 20:52
Thanks for the patch!

Comments:

(1) Shouldn't 'reverse=True' be omitted in the second doc addition?

(2) I'd also suggest adding a brief comment about what this means for distinct, but equal, objects;  otherwise it's not really obvious what the point of the doc addition is.

(3) As a matter of clarity, perhaps replace "this is" with "max(iterable, key=key) is", and similarly for min.

As an aside, I still like Jeffrey Yasskin's suggestion on the python-dev mailing list that the sensible definition for max would maintain the invariant that max(iterable) be equivalent to sorted(iterable)[-1];  see Alexander Stepanov's writings in e.g., http://www.cs.rpi.edu/~musser/gsd/notes-on-programming-2006-10-13.pdf for more.  But that's (a) another issue, and (b) perhaps not a significant enough benefit to be worth changing Python's semantics for.
msg116183 - (view) Author: Matthew Woodcraft (mattheww) Date: 2010-09-12 12:44
> (1) Shouldn't 'reverse=True' be omitted in the second doc
> addition?

Yes, of course, sorry.

> (2) I'd also suggest adding a brief comment about what this
> means for distinct, but equal, objects; otherwise it's not
> really obvious what the point of the doc addition is.

> (3) As a matter of clarity, perhaps replace "this is" with
> "max(iterable, key=key) is", and similarly for min.

I've attached a new patch incorporating these suggestions.
msg116186 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-12 12:52
> As an aside, I still like Jeffrey Yasskin's suggestion on the
> python-dev mailing list that the sensible definition for max would
> maintain the invariant that max(iterable) be equivalent to
> sorted(iterable)[-1]

What's interesting is the practical consequence that:

x, y = min(x, y), max(x, y)

cannot give you twice the same object.

Of course, there are subtle implications of how it will be implemented (especially with objects which have a partial order relationship to each other). Since max() is supposed to work on any iterator, we probably don't want to build an intermediate sequence and fetch elements in reverse order; instead perhaps use (not Py_LT) instead of Py_GT.
msg116344 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-09-13 19:37
> Of course, there are subtle implications of how it will be implemented

Indeed.  Ideally, as you mention, the implementation would only use __lt__ (as with sort and bisect).  I think that constraint only leaves one reasonable choice: namely, max and min for multiple args would be functionally equivalent to max_list and min_list below:

def min2(x, y):
    return y if y < x else x

def max2(x, y):
    return x if y < x else y

def min_list(xs):
    return reduce(min2, xs)

def max_list(xs):
    return reduce(max2, xs)
msg116425 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-09-14 23:14
Documented the current behavior in r84822.

Mark, if you're free on IRC at some point, I would like to discuss further.
msg122010 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-11-21 22:48
As discussed with Mark, am closing this one after having applied documentation changes.
msg130049 - (view) Author: Stephen Evans (Stephen.Evans) Date: 2011-03-04 12:32
As suggested by Mark following my post on comp.lang.python I am adding further comments to the discussion on this (closed) issue.

For a more mathematical consideration of the issue:

Stepanov, Alexander and Paul McJones. 2009. Elements of Programming. Addison Wesley. Pages 52-53

The problem with the builtin max() is with weak comparisons. Consider two python objects a and b that are equivalent and where the following are True:

a is not b
repr([a, b]) == repr(sorted([a, b]))
repr([a, b]) == repr(sorted([a, b], reverse=True))
repr([b, a]) == repr(sorted([b, a]))
repr([b, a]) == repr(sorted([b, a], reverse=True))

Assuming repr() implemented correctly for a and b. The only Python rich comparison required is (weak) __lt__. If (weak) __eq__ is implemented then the following are True:

a == b
b == a

In bltinmodule.c builtin_max() uses Py_GT. For correctness this should use the converse of builtin_min() i.e. the boolean negation of PyObject_RichCompare using Py_LT (for valid results). If using Python rich comparisions then only __lt__ would be required for both min() and max() as with list.sort(). The following will then be True:

min([a, b]) is a
max([a, b]) is b

min([b, a]) is b
max([b, a]) is a

min([a, b]) is max([b, a])
min([a, b]) is not min([b, a])
max([a, b]) is min([b, a])
max([a, b]) is not max([b, a])

The above will work if Py_GE is subtituted for Py_GT in builtin_max(), though this will require the implementation of __ge__ which is inconsistent with list.sort() and is a point of potential failure if the implementation of __ge__ is not the converse of the implementation __lt__.

To reiterate - builtin max() should be the converse of builtin min().
History
Date User Action Args
2011-03-04 12:32:59Stephen.Evanssetfiles: + min_max.py
versions: + Python 2.7, Python 3.3
nosy: + Stephen.Evans

messages: + msg130049

components: + None, - Documentation
2010-11-21 22:48:26rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg122010
2010-09-14 23:14:46rhettingersetmessages: + msg116425
2010-09-13 19:37:57mark.dickinsonsetmessages: + msg116344
2010-09-12 12:52:37pitrousetnosy: + pitrou

messages: + msg116186
versions: + Python 3.2
2010-09-12 12:44:28matthewwsetfiles: + functions.rst.patch

messages: + msg116183
2010-09-08 22:06:45rhettingersetassignee: docs@python -> rhettinger
2010-09-08 20:52:07mark.dickinsonsetnosy: + rhettinger, mark.dickinson, jyasskin
messages: + msg115898
2010-09-08 19:30:29matthewwcreate