classification
Title: Add a default argument to min & max
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Julian, dabeaz, doughellmann, eric.araujo, gvanrossum, ncoghlan, nedbat, pconnell, python-dev, r.david.murray, rhettinger, skrah, twouters
Priority: normal Keywords: patch

Created on 2013-05-31 23:21 by Julian, last changed 2014-02-13 14:20 by eric.araujo. This issue is now closed.

Files
File name Uploaded Description Edit
minmaxdefault.patch Julian, 2013-05-31 23:29
minmax.patch Julian, 2013-06-05 23:56 review
minmax.patch Julian, 2013-06-09 02:08 review
Messages (35)
msg190426 - (view) Author: Julian Berman (Julian) * Date: 2013-05-31 23:21
This has come up a number of times I'm aware, but no one has ever written a patch for it as far as a quick look uncovered.

So here's one (written by Thomas Wouters but with permission to submit). Submitting without tests and docs, but those are incoming when I get a moment.

The justification here is mostly related to being able to call min/max on an iterable of unknown length when there's a sensible default (which is usually going to be None, but that's not the default for backwards compat obviously).
msg190432 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-01 01:38
Issue 7153 and the discussions linked therefrom is presumably relevant here.

Do you have a concrete use case?
msg190464 - (view) Author: Julian Berman (Julian) * Date: 2013-06-02 01:42
Thanks for finding that, I thought there was an issue that came out of that p-i thread but couldn't find it.

I'd like to be more concrete, but "calling max on an iterable" seems concrete enough to me. If you'd like to know more though, personally I've wanted this at least twice in the past 4 or 5 months. Currently, I have code that looks like:

    def best_match(stuff):
        first = next(stuff, None)
        if first is None:
            return
        return max(itertools.chain([first], stuff))

which finds the best matching (error it happens to be) in the given stuff. A few months ago I had a similar need in a different application.

The issues in that thread from 2009 revolved around a bunch of confusing and not so related things. And I definitely agree that `start` is a really bad name for this. But I don't find `default` to be at all confusing, and in fact this has come up in #python a few times and each time there hasn't really been a problem explaining to someone what `default` would do (or how it would interact with `key` for that matter, although if a precedent is desired, the default in `argparse` just returns the default, it doesn't call `type` or anything on it).
msg190465 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-02 02:43
So you aren't really asking for a default, you are asking for a version of max/min that returns a sentinel instead of raising an error on an empty list. You could just write utility function that has a try/except in it.  I'm not sure it is worth complicating min and max for this use case.  (I'm not sure it isn't, either.)  I wonder if we have any other functions in python that have an argument that turns a ValueError into a sentinel-return instead?
msg190466 - (view) Author: Julian Berman (Julian) * Date: 2013-06-02 02:53
Yes that's a good description. I'm not sure the type of exception is the thing to look at as much as the behavior, I.e. I think next() is a good precedent.

And it's not really pleasant to do this with a try/except. Of course everything's possible, but to catch the ValueError sanely requires checking the text of the exception so that the except isn't too broad.
msg190469 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-02 03:27
I don't think there's any other way to get a ValueError out of min/max, but I haven't actually looked at the code to check.  Of course, if we want people to rely on that, we'd need to document it.

'next's default is used to return a sentinel when the list is exhausted, not when it would otherwise raise a ValueError.  It is a somewhat related case, but is not exactly parallel.  The sentinel for next indicates the end of an ongoing process.  The proposed argument to min/max would *replace* an error with a sentinel value.
msg190470 - (view) Author: Julian Berman (Julian) * Date: 2013-06-02 03:37
It's not exactly the same of course, but calling next on a thing that might be empty would be somewhat close, and also is replacing an exception with a sentinel (an exception that is much easier to differentiate).

You can always get a ValueError out of min/max, they're going to be ultimately calling __lt__ on stuff which can do what it wants, but that's admittedly quite unlikely.

It's not really that readable though on the other hand.
msg190477 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-06-02 10:04
I'm -1 on expanding this API further.  It already is pushing the limits with the dual signature and with the key-function.

Many languages have min/max functions.  AFAICT, none of them have an API with a default argument.  This suggests that this isn't an essential capability.
msg190481 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-02 10:52
That's a good point about the __lt__.  It occurred to me as well just before I read your post :).  

Raymond, do any other languages have an iterator protocol as a core language feature?  It's the fact that it is in Python, and that it is not simple to LBYL when dealing with iterators, that brings this issue up for min and max.
msg190482 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-02 10:55
Oh, and I don't think Haskell counts, since you'd expect them to stick strictly to the mathematical definition, with no consideration of practicality :)

Note that I'm not saying I'm +1 on adding this (I haven't decided), I'm just trying to clarify the argument.
msg190484 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-06-02 12:26
I'd use foldl() in functional languages, where the default is part
of foldl() and not of max().


Translated to Python, I'm thinking of:

it = iter([328, 28, 2989, 22])
functools.reduce(max, it, next(it, None))
2989


I agree with Raymond that a default arg in max() looks out of place.
msg190514 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-06-03 03:43
Thanks Stephan.  I'm going to close this one.  The case for adding it is too weak and isn't worth making the API more complex.

If someone wants a default with an iterable of arbitrary size including zero, there are already a number of ways to do it (using itertools.chain for example or just catching the exception).
msg190515 - (view) Author: Julian Berman (Julian) * Date: 2013-06-03 04:06
I don't really care to push this much harder, but I'll just repeat that I've already made an argument against catching the exception. Calling this making the API too complex also seems quite silly to me. It's a thing that someone looking for would find and someone who wasn't wouldn't.
msg190538 - (view) Author: Thomas Wouters (twouters) * (Python committer) Date: 2013-06-03 11:48
For the record, Raymond, I think you're wrong about this. Itertools isn't always a solution to every problem, and it makes for a very awkward way around a silly limitation in min() and max(). Their API is already awkward -- because they already take a keyword argument as well as *args or an iterable -- and this does not make it worse in any way. It's trivial to add this, it's trivial to explain -- return a specific value instead of raising a particular exception -- and it's wasteful, complex, fragile or unreadable (except if you have itertools on the mind, I guess) to do the same thing in another way.
msg190542 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-06-03 12:51
+1 for adding this. It's simple to implement, simple to explain and the alternatives for dealing with the empty iterable case (or even the fact it may need to be handled at all) are definitely not obvious.

The relationship to next() is straightforward: the supplied value is effectively used as the default value for the first next call when iterating and then ignored thereafter.
msg190545 - (view) Author: Ned Batchelder (nedbat) * Date: 2013-06-03 14:13
I find the workarounds mentioned here to be baroque and confusing.  The concept of a default value to return in the case of an empty iterator is straightforward.  I'm +1 on adding this as well.
msg190548 - (view) Author: Doug Hellmann (doughellmann) * (Python committer) Date: 2013-06-03 15:27
+1 on adding this

I found today via @dabeaz's cookbook that iter() has a sentinel-detection use case. Having one in min/max seems *far* more obviously useful. It's also consistent with quite a few methods on builtin types where we provide a way to deal with unknown data safely by having a default instead of catching exceptions directly.
msg190569 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-06-03 22:25
As stated, I don't agree with the closure of this one. min/max deserve a more elegant mechanism for dealing with the empty iterable edge case.
msg190571 - (view) Author: David Beazley (dabeaz) Date: 2013-06-03 22:50
I could have used this feature myself somewhat recently.  It was in some code involving document matching where zero or more possible candidates were assigned a score and I was trying to find the max score.  The fact that an empty list was a possibility complicated everything because I had to add extra checks for it.   max(scores, default=0) would have been a lot simpler.
msg190575 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-06-04 01:29
I still think complicating the API isn't worth it.  Of late, we've gotten in the habit of a complexity to even the simplest of things.

In the case of sequences, we already have a reasonable solution:

    low = min(seq) if seq else default

In the rarer case of non-sequence iterables, catching a Value error is the traditional solution -- not pretty, but not difficult either.

In the past, Guido has rejected that notion of "add a default value to every function than can raise an exception".  For example, someone wanted to add a get() method to lists so they could avoid catching an IndexError, something like s.get(index, default).  IIRC, his motivation was that "avoiding API clutter" was more important than supporting an uncommon use case that already had viable solutions using plain Python.

My own principle is that it is okay to add an initial extra feature to function, but not multiple extra features.  This becomes more important when the function already has API issues.

The min() / max() functions started-out with an API problem because of their dual signature:  min(s)  versus  min(a, b, c).   That creates an ambiguity in that case of min(*t) where the result depends on the length of the input.   IMO, that issue would be exacerbated by the addition of a default argument (i.e. it would tend to hide the error from test cases):

    for t in [(10, 20, 30), (10, 20), (10,), ()]:
        print min(*t, default=100)

Also, I don't think we should forget the lessons learned about adding unnecessary arguments to functions.  For example, we let someone add start and end arguments to str.startswith() and str.endswith() because "it seemed more parallel with str.find()" and because "someone wanted it once a piece of code somewhere".  The downside only became apparent later when there was a legitimate real use case for another feature request:  searching multiple prefixes and suffixes.  Because we had wasted the positional arguments, we ended-up with the awkward str.startswith(str-or-tuple [,start [,end]]) signature.   That is why we have to write, filename.endswith(('.py', '.pyc')).  The double parens are part of the cost of API bloat.

Lastly, when it comes to common-place functions like min() and max(), we can assess needs easily by looking at what other languages have done.  Looking at Excel, SQL, Java, Haskell, C# etc, I don't see a precedent for this feature request.

Grepping for min/max in my own code and third-party libraries, I don't see any cases where the code had a need for this feature.  If the need arises, it certainly doesn't come of very often.

If you guys all think this is an good feature request, then by all means, go ahead and add it.  My recommendation is to show design restraint and refuse the temptation to grow this already awkward API when you don't have to.
msg190577 - (view) Author: Julian Berman (Julian) * Date: 2013-06-04 01:48
Raymond, I respect that in your opinion this seems to be overcomplexity, but you haven't addressed any of the arguments made, nor responded to any of the arguments against this being added complexity.

I really don't understand the parallels you're making to str.*with, but as for other languages, as David pointed out already, you are looking at things in a vacuum. This is needed because min and max are already silly. In languages like Ruby and Clojure, which were the quickest I had handy, of course you don't need this, because calling min and max *by default* returns None. I'd bet Python 2's silly type comparison history had something to do with the return value not defaulting to None, but what's done is done. I don't like hard-fast rules, but I don't think APIs should ever be penalized for their own mistakes. We should make sane things possible in pleasant ways.

If it's OK then (turning back to the patch), unless anyone has something additional to add I'm going to carve up some tests.
msg190579 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-06-04 02:53
To me, the Python-specific difference that makes this useful for us but not for others is *precisely* the fact that the simple idiom:

    x = min(seq) if seq else default

is broken for iterators that don't provide __len__ or __bool__, while the even simpler:

    x = min(seq)

is broken for the empty iterable.

However, I think we should explicitly disallow the combination of multiple positional arguments *and* the new default argument. If you don't know the length of the input iterable, you should *not* be using the multiple argument form.
msg190580 - (view) Author: Julian Berman (Julian) * Date: 2013-06-04 03:03
Personally I don't care either way, I basically never use the multiple positional arg form, but what are we trying to prevent exactly? It's bad code, but it (would) do what the person was expecting. Am I not getting the point that's being made about that case?
msg190582 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-06-04 03:34
Guido, this is your language.  What would you like to do?

The OP wants a default argument on min() and max() so he won't have to use an "except ValueError" for non-sequence iterables that are potentially empty.

At first, I thought the functions were already as complex as we would want to get, but several proponents have emerged, so I'm stepping aside.

The proposed patch would allow:
   max(iterable, key=somefunc, default=sentinel)
and would return sentinel_value when len(list(iterable))==0.

It would not allow:
   max(*iterable, key=somefunc, default=sentinel_value)
where s is an empty iterable.
msg190584 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-06-04 03:43
+1
msg190585 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-06-04 03:48
Just one final design philosophy comment from me (I know it isn't needed since Guido already ack'ed the suggestion):

As far as the lessons learned from the historical startswith() case go, avoiding taking up the positional slots with optional flags and configuration parameters is one of the big reasons we added keyword only arguments (with the other being readability at the call site).

I agree we need to be cautious with API complexity, I just think in this case clean handling of empty iterators is a net win (so long as we rule out the conceptually broken case of combining the new parameter with multiple positional arguments).
msg190586 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-06-04 04:02
Julian, when your tests are ready, I'll be happy to review and apply the patch.
msg190595 - (view) Author: David Beazley (dabeaz) Date: 2013-06-04 11:43
To me, the fact that m = max(s) if s else default doesn't work with iterators alone makes this worthy of consideration.   

I would also note that min/max are the only reduction functions that don't have the ability to work with a possibly empty sequence.  For example:

    >>> sum([])
    0
    >>> any([])
    False
    >>> all([])
    True
    >>> functools.reduce(lambda x,y: x+y, [], 0)
    0
    >>> math.fsum([])
    0.0
    >>>
msg190703 - (view) Author: Julian Berman (Julian) * Date: 2013-06-05 23:56
Thanks for the offer Raymond.

Here's a patch with some tests and doc updates, incorporating Nick's suggestion (but without a note in the docs, I didn't feel it's notable).

Please let me know if anything needs tidying.
msg190839 - (view) Author: Julian Berman (Julian) * Date: 2013-06-09 02:08
New patchset addressing review. Not sure why upload.py isn't working for me (I assume it should?)
msg191239 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-15 21:55
I find it interesting that I just came across a case where I want this functionality, involving a generator derived from a possibly-empty sql table.  I'm using Stefan's functional expression for now :)
msg191836 - (view) Author: Roundup Robot (python-dev) Date: 2013-06-25 05:43
New changeset 76196691b5d0 by Raymond Hettinger in branch 'default':
Issue 18111: Add a default argument to min() and max()
http://hg.python.org/cpython/rev/76196691b5d0
msg192003 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-06-28 16:32
Julian, could you please submit a contributor agreement?  (http://www.python.org/psf/contrib)
msg192006 - (view) Author: Roundup Robot (python-dev) Date: 2013-06-28 17:31
New changeset 34ff27b431d0 by R David Murray in branch 'default':
#18111: Add What's New entry for max/min default.
http://hg.python.org/cpython/rev/34ff27b431d0
msg211158 - (view) Author: √Čric Araujo (eric.araujo) * (Python committer) Date: 2014-02-13 14:20
Note that the docs were changed, but versionchanged directives are missing.
History
Date User Action Args
2014-02-13 14:20:55eric.araujosetnosy: + eric.araujo
messages: + msg211158
2013-06-28 17:31:31python-devsetmessages: + msg192006
2013-06-28 16:32:21r.david.murraysetmessages: + msg192003
2013-06-25 05:44:54rhettingersetstatus: open -> closed
resolution: fixed
2013-06-25 05:43:17python-devsetnosy: + python-dev
messages: + msg191836
2013-06-15 21:55:32r.david.murraysetmessages: + msg191239
2013-06-09 02:08:13Juliansetfiles: + minmax.patch

messages: + msg190839
2013-06-08 10:12:48serhiy.storchakasetstage: patch review
versions: - Python 3.5
2013-06-08 09:29:49pconnellsetnosy: + pconnell
2013-06-05 23:56:00Juliansetfiles: + minmax.patch

messages: + msg190703
2013-06-04 11:43:57dabeazsetmessages: + msg190595
2013-06-04 04:02:29rhettingersetassignee: rhettinger
messages: + msg190586
2013-06-04 03:48:20ncoghlansetmessages: + msg190585
2013-06-04 03:44:47gvanrossumsetassignee: gvanrossum -> (no value)
2013-06-04 03:43:12gvanrossumsetmessages: + msg190584
2013-06-04 03:34:18rhettingersetassignee: gvanrossum

messages: + msg190582
nosy: + gvanrossum
2013-06-04 03:03:55Juliansetmessages: + msg190580
2013-06-04 02:53:55ncoghlansetmessages: + msg190579
2013-06-04 01:48:01Juliansetmessages: + msg190577
2013-06-04 01:29:40rhettingersetmessages: + msg190575
2013-06-03 22:50:10dabeazsetnosy: + dabeaz
messages: + msg190571
2013-06-03 22:25:12ncoghlansetstatus: closed -> open
resolution: rejected -> (no value)
messages: + msg190569
2013-06-03 15:27:27doughellmannsetnosy: + doughellmann
messages: + msg190548
2013-06-03 14:13:20nedbatsetnosy: + nedbat
messages: + msg190545
2013-06-03 12:51:44ncoghlansetnosy: + ncoghlan
messages: + msg190542
2013-06-03 11:48:11twouterssetmessages: + msg190538
2013-06-03 04:06:05Juliansetmessages: + msg190515
2013-06-03 03:43:44rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg190514
2013-06-02 12:26:40skrahsetnosy: + skrah
messages: + msg190484
2013-06-02 10:55:41r.david.murraysetmessages: + msg190482
2013-06-02 10:52:18r.david.murraysetmessages: + msg190481
2013-06-02 10:04:16rhettingersetnosy: + rhettinger
messages: + msg190477
2013-06-02 03:37:12Juliansetmessages: + msg190470
2013-06-02 03:27:30r.david.murraysetmessages: + msg190469
2013-06-02 02:53:15Juliansetmessages: + msg190466
2013-06-02 02:43:29r.david.murraysetmessages: + msg190465
2013-06-02 01:42:05Juliansetmessages: + msg190464
2013-06-01 01:38:12r.david.murraysetnosy: + r.david.murray
messages: + msg190432
2013-05-31 23:29:12Juliansetfiles: + minmaxdefault.patch
keywords: + patch
2013-05-31 23:21:06Juliancreate