Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a default argument to min & max #62311

Closed
Julian mannequin opened this issue May 31, 2013 · 35 comments
Closed

Add a default argument to min & max #62311

Julian mannequin opened this issue May 31, 2013 · 35 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@Julian
Copy link
Mannequin

Julian mannequin commented May 31, 2013

BPO 18111
Nosy @gvanrossum, @Yhg1s, @rhettinger, @ncoghlan, @nedbat, @merwok, @dhellmann, @bitdancer, @skrah, @Julian, @phmc
Files
  • minmaxdefault.patch
  • minmax.patch
  • minmax.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2013-06-25.05:44:54.927>
    created_at = <Date 2013-05-31.23:21:05.977>
    labels = ['interpreter-core', 'type-feature']
    title = 'Add a default argument to min & max'
    updated_at = <Date 2014-02-13.14:20:55.638>
    user = 'https://github.com/Julian'

    bugs.python.org fields:

    activity = <Date 2014-02-13.14:20:55.638>
    actor = 'eric.araujo'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2013-06-25.05:44:54.927>
    closer = 'rhettinger'
    components = ['Interpreter Core']
    creation = <Date 2013-05-31.23:21:05.977>
    creator = 'Julian'
    dependencies = []
    files = ['30437', '30479', '30510']
    hgrepos = []
    issue_num = 18111
    keywords = ['patch']
    message_count = 35.0
    messages = ['190426', '190432', '190464', '190465', '190466', '190469', '190470', '190477', '190481', '190482', '190484', '190514', '190515', '190538', '190542', '190545', '190548', '190569', '190571', '190575', '190577', '190579', '190580', '190582', '190584', '190585', '190586', '190595', '190703', '190839', '191239', '191836', '192003', '192006', '211158']
    nosy_count = 13.0
    nosy_names = ['gvanrossum', 'twouters', 'rhettinger', 'ncoghlan', 'nedbat', 'eric.araujo', 'doughellmann', 'r.david.murray', 'skrah', 'dabeaz', 'Julian', 'python-dev', 'pconnell']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue18111'
    versions = ['Python 3.4']

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented May 31, 2013

    This has come up a number of times I'm aware, but no one has ever written a patch for it as far as a quick look uncovered.

    So here's one (written by Thomas Wouters but with permission to submit). Submitting without tests and docs, but those are incoming when I get a moment.

    The justification here is mostly related to being able to call min/max on an iterable of unknown length when there's a sensible default (which is usually going to be None, but that's not the default for backwards compat obviously).

    @Julian Julian mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels May 31, 2013
    @bitdancer
    Copy link
    Member

    bpo-7153 and the discussions linked therefrom is presumably relevant here.

    Do you have a concrete use case?

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 2, 2013

    Thanks for finding that, I thought there was an issue that came out of that p-i thread but couldn't find it.

    I'd like to be more concrete, but "calling max on an iterable" seems concrete enough to me. If you'd like to know more though, personally I've wanted this at least twice in the past 4 or 5 months. Currently, I have code that looks like:

        def best_match(stuff):
            first = next(stuff, None)
            if first is None:
                return
            return max(itertools.chain([first], stuff))

    which finds the best matching (error it happens to be) in the given stuff. A few months ago I had a similar need in a different application.

    The issues in that thread from 2009 revolved around a bunch of confusing and not so related things. And I definitely agree that start is a really bad name for this. But I don't find default to be at all confusing, and in fact this has come up in #python a few times and each time there hasn't really been a problem explaining to someone what default would do (or how it would interact with key for that matter, although if a precedent is desired, the default in argparse just returns the default, it doesn't call type or anything on it).

    @bitdancer
    Copy link
    Member

    So you aren't really asking for a default, you are asking for a version of max/min that returns a sentinel instead of raising an error on an empty list. You could just write utility function that has a try/except in it. I'm not sure it is worth complicating min and max for this use case. (I'm not sure it isn't, either.) I wonder if we have any other functions in python that have an argument that turns a ValueError into a sentinel-return instead?

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 2, 2013

    Yes that's a good description. I'm not sure the type of exception is the thing to look at as much as the behavior, I.e. I think next() is a good precedent.

    And it's not really pleasant to do this with a try/except. Of course everything's possible, but to catch the ValueError sanely requires checking the text of the exception so that the except isn't too broad.

    @bitdancer
    Copy link
    Member

    I don't think there's any other way to get a ValueError out of min/max, but I haven't actually looked at the code to check. Of course, if we want people to rely on that, we'd need to document it.

    'next's default is used to return a sentinel when the list is exhausted, not when it would otherwise raise a ValueError. It is a somewhat related case, but is not exactly parallel. The sentinel for next indicates the end of an ongoing process. The proposed argument to min/max would *replace* an error with a sentinel value.

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 2, 2013

    It's not exactly the same of course, but calling next on a thing that might be empty would be somewhat close, and also is replacing an exception with a sentinel (an exception that is much easier to differentiate).

    You can always get a ValueError out of min/max, they're going to be ultimately calling __lt__ on stuff which can do what it wants, but that's admittedly quite unlikely.

    It's not really that readable though on the other hand.

    @rhettinger
    Copy link
    Contributor

    I'm -1 on expanding this API further. It already is pushing the limits with the dual signature and with the key-function.

    Many languages have min/max functions. AFAICT, none of them have an API with a default argument. This suggests that this isn't an essential capability.

    @bitdancer
    Copy link
    Member

    That's a good point about the __lt__. It occurred to me as well just before I read your post :).

    Raymond, do any other languages have an iterator protocol as a core language feature? It's the fact that it is in Python, and that it is not simple to LBYL when dealing with iterators, that brings this issue up for min and max.

    @bitdancer
    Copy link
    Member

    Oh, and I don't think Haskell counts, since you'd expect them to stick strictly to the mathematical definition, with no consideration of practicality :)

    Note that I'm not saying I'm +1 on adding this (I haven't decided), I'm just trying to clarify the argument.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jun 2, 2013

    I'd use foldl() in functional languages, where the default is part
    of foldl() and not of max().

    Translated to Python, I'm thinking of:

    it = iter([328, 28, 2989, 22])
    functools.reduce(max, it, next(it, None))
    2989

    I agree with Raymond that a default arg in max() looks out of place.

    @rhettinger
    Copy link
    Contributor

    Thanks Stephan. I'm going to close this one. The case for adding it is too weak and isn't worth making the API more complex.

    If someone wants a default with an iterable of arbitrary size including zero, there are already a number of ways to do it (using itertools.chain for example or just catching the exception).

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 3, 2013

    I don't really care to push this much harder, but I'll just repeat that I've already made an argument against catching the exception. Calling this making the API too complex also seems quite silly to me. It's a thing that someone looking for would find and someone who wasn't wouldn't.

    @Yhg1s
    Copy link
    Member

    Yhg1s commented Jun 3, 2013

    For the record, Raymond, I think you're wrong about this. Itertools isn't always a solution to every problem, and it makes for a very awkward way around a silly limitation in min() and max(). Their API is already awkward -- because they already take a keyword argument as well as *args or an iterable -- and this does not make it worse in any way. It's trivial to add this, it's trivial to explain -- return a specific value instead of raising a particular exception -- and it's wasteful, complex, fragile or unreadable (except if you have itertools on the mind, I guess) to do the same thing in another way.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 3, 2013

    +1 for adding this. It's simple to implement, simple to explain and the alternatives for dealing with the empty iterable case (or even the fact it may need to be handled at all) are definitely not obvious.

    The relationship to next() is straightforward: the supplied value is effectively used as the default value for the first next call when iterating and then ignored thereafter.

    @nedbat
    Copy link
    Member

    nedbat commented Jun 3, 2013

    I find the workarounds mentioned here to be baroque and confusing. The concept of a default value to return in the case of an empty iterator is straightforward. I'm +1 on adding this as well.

    @dhellmann
    Copy link
    Member

    +1 on adding this

    I found today via @dabeaz's cookbook that iter() has a sentinel-detection use case. Having one in min/max seems *far* more obviously useful. It's also consistent with quite a few methods on builtin types where we provide a way to deal with unknown data safely by having a default instead of catching exceptions directly.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 3, 2013

    As stated, I don't agree with the closure of this one. min/max deserve a more elegant mechanism for dealing with the empty iterable edge case.

    @ncoghlan ncoghlan reopened this Jun 3, 2013
    @dabeaz
    Copy link
    Mannequin

    dabeaz mannequin commented Jun 3, 2013

    I could have used this feature myself somewhat recently. It was in some code involving document matching where zero or more possible candidates were assigned a score and I was trying to find the max score. The fact that an empty list was a possibility complicated everything because I had to add extra checks for it. max(scores, default=0) would have been a lot simpler.

    @rhettinger
    Copy link
    Contributor

    I still think complicating the API isn't worth it. Of late, we've gotten in the habit of a complexity to even the simplest of things.

    In the case of sequences, we already have a reasonable solution:

        low = min(seq) if seq else default

    In the rarer case of non-sequence iterables, catching a Value error is the traditional solution -- not pretty, but not difficult either.

    In the past, Guido has rejected that notion of "add a default value to every function than can raise an exception". For example, someone wanted to add a get() method to lists so they could avoid catching an IndexError, something like s.get(index, default). IIRC, his motivation was that "avoiding API clutter" was more important than supporting an uncommon use case that already had viable solutions using plain Python.

    My own principle is that it is okay to add an initial extra feature to function, but not multiple extra features. This becomes more important when the function already has API issues.

    The min() / max() functions started-out with an API problem because of their dual signature: min(s) versus min(a, b, c). That creates an ambiguity in that case of min(*t) where the result depends on the length of the input. IMO, that issue would be exacerbated by the addition of a default argument (i.e. it would tend to hide the error from test cases):

        for t in [(10, 20, 30), (10, 20), (10,), ()]:
            print min(*t, default=100)

    Also, I don't think we should forget the lessons learned about adding unnecessary arguments to functions. For example, we let someone add start and end arguments to str.startswith() and str.endswith() because "it seemed more parallel with str.find()" and because "someone wanted it once a piece of code somewhere". The downside only became apparent later when there was a legitimate real use case for another feature request: searching multiple prefixes and suffixes. Because we had wasted the positional arguments, we ended-up with the awkward str.startswith(str-or-tuple [,start [,end]]) signature. That is why we have to write, filename.endswith(('.py', '.pyc')). The double parens are part of the cost of API bloat.

    Lastly, when it comes to common-place functions like min() and max(), we can assess needs easily by looking at what other languages have done. Looking at Excel, SQL, Java, Haskell, C# etc, I don't see a precedent for this feature request.

    Grepping for min/max in my own code and third-party libraries, I don't see any cases where the code had a need for this feature. If the need arises, it certainly doesn't come of very often.

    If you guys all think this is an good feature request, then by all means, go ahead and add it. My recommendation is to show design restraint and refuse the temptation to grow this already awkward API when you don't have to.

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 4, 2013

    Raymond, I respect that in your opinion this seems to be overcomplexity, but you haven't addressed any of the arguments made, nor responded to any of the arguments against this being added complexity.

    I really don't understand the parallels you're making to str.*with, but as for other languages, as David pointed out already, you are looking at things in a vacuum. This is needed because min and max are already silly. In languages like Ruby and Clojure, which were the quickest I had handy, of course you don't need this, because calling min and max *by default* returns None. I'd bet Python 2's silly type comparison history had something to do with the return value not defaulting to None, but what's done is done. I don't like hard-fast rules, but I don't think APIs should ever be penalized for their own mistakes. We should make sane things possible in pleasant ways.

    If it's OK then (turning back to the patch), unless anyone has something additional to add I'm going to carve up some tests.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 4, 2013

    To me, the Python-specific difference that makes this useful for us but not for others is *precisely* the fact that the simple idiom:

        x = min(seq) if seq else default

    is broken for iterators that don't provide __len__ or __bool__, while the even simpler:

        x = min(seq)

    is broken for the empty iterable.

    However, I think we should explicitly disallow the combination of multiple positional arguments *and* the new default argument. If you don't know the length of the input iterable, you should *not* be using the multiple argument form.

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 4, 2013

    Personally I don't care either way, I basically never use the multiple positional arg form, but what are we trying to prevent exactly? It's bad code, but it (would) do what the person was expecting. Am I not getting the point that's being made about that case?

    @rhettinger
    Copy link
    Contributor

    Guido, this is your language. What would you like to do?

    The OP wants a default argument on min() and max() so he won't have to use an "except ValueError" for non-sequence iterables that are potentially empty.

    At first, I thought the functions were already as complex as we would want to get, but several proponents have emerged, so I'm stepping aside.

    The proposed patch would allow:
    max(iterable, key=somefunc, default=sentinel)
    and would return sentinel_value when len(list(iterable))==0.

    It would not allow:
    max(*iterable, key=somefunc, default=sentinel_value)
    where s is an empty iterable.

    @gvanrossum
    Copy link
    Member

    +1

    @gvanrossum gvanrossum removed their assignment Jun 4, 2013
    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Jun 4, 2013

    Just one final design philosophy comment from me (I know it isn't needed since Guido already ack'ed the suggestion):

    As far as the lessons learned from the historical startswith() case go, avoiding taking up the positional slots with optional flags and configuration parameters is one of the big reasons we added keyword only arguments (with the other being readability at the call site).

    I agree we need to be cautious with API complexity, I just think in this case clean handling of empty iterators is a net win (so long as we rule out the conceptually broken case of combining the new parameter with multiple positional arguments).

    @rhettinger
    Copy link
    Contributor

    Julian, when your tests are ready, I'll be happy to review and apply the patch.

    @rhettinger rhettinger self-assigned this Jun 4, 2013
    @dabeaz
    Copy link
    Mannequin

    dabeaz mannequin commented Jun 4, 2013

    To me, the fact that m = max(s) if s else default doesn't work with iterators alone makes this worthy of consideration.

    I would also note that min/max are the only reduction functions that don't have the ability to work with a possibly empty sequence. For example:

        >>> sum([])
        0
        >>> any([])
        False
        >>> all([])
        True
        >>> functools.reduce(lambda x,y: x+y, [], 0)
        0
        >>> math.fsum([])
        0.0
        >>>

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 5, 2013

    Thanks for the offer Raymond.

    Here's a patch with some tests and doc updates, incorporating Nick's suggestion (but without a note in the docs, I didn't feel it's notable).

    Please let me know if anything needs tidying.

    @Julian
    Copy link
    Mannequin Author

    Julian mannequin commented Jun 9, 2013

    New patchset addressing review. Not sure why upload.py isn't working for me (I assume it should?)

    @bitdancer
    Copy link
    Member

    I find it interesting that I just came across a case where I want this functionality, involving a generator derived from a possibly-empty sql table. I'm using Stefan's functional expression for now :)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 25, 2013

    New changeset 76196691b5d0 by Raymond Hettinger in branch 'default':
    bpo-18111: Add a default argument to min() and max()
    http://hg.python.org/cpython/rev/76196691b5d0

    @bitdancer
    Copy link
    Member

    Julian, could you please submit a contributor agreement? (http://www.python.org/psf/contrib)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 28, 2013

    New changeset 34ff27b431d0 by R David Murray in branch 'default':
    bpo-18111: Add What's New entry for max/min default.
    http://hg.python.org/cpython/rev/34ff27b431d0

    @merwok
    Copy link
    Member

    merwok commented Feb 13, 2014

    Note that the docs were changed, but versionchanged directives are missing.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants