classification
Title: Add "default" kw argument to operator.itemgetter and operator.attrgetter
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: gvanrossum, r.david.murray, rhettinger, tebeka
Priority: normal Keywords:

Created on 2012-03-22 01:09 by tebeka, last changed 2019-08-02 07:28 by rhettinger. This issue is now closed.

Messages (9)
msg156531 - (view) Author: Miki Tebeka (tebeka) * Date: 2012-03-22 01:09
This way they will behave more like getattr and the dictionary get.

If default is not specified and the item/attr is not found, an exception will be raised, which is the current behavior.

However, if default is specified, it will be returned when the item/attr is not found.

I wanted this when trying to get configuration from a list of objects. I'd like to do
    get = attrgetter('foo', None)
    return get(args) or get(config) or get(env)
msg156533 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-22 01:35
Thanks for the suggestion. 

It is an interesting idea, but there are some issues.  attrgetter and itemgetter can take more than one key.  It would probably make more sense to have the returned function be the one that takes the default, but in that case you might as well just use getattr.  If we did accept a keyword-only argument for a default, what does the default mean if there is more than one key?  And what if there are dotted names in the attrgetter keys?

I'm inclined to reject this as too complex for a marginal use case.  But perhaps a python-ideas discussion would be appropriate.
msg156592 - (view) Author: Miki Tebeka (tebeka) * Date: 2012-03-22 16:46
IMO, in the case of multiple items/attrs, you return default in their place:
    attrgetter('x', 'y', default=7)(None) => (7, 7)

In the case of a dotted attribute, it will again return the default value if any of the attributes is not found:
    attrgetter('x.y', default=7)(None) => 7
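
For illustration only, the semantics described in this message could be modelled in pure Python roughly as below. The helper name `attrgetter_default` and the `_MISSING` sentinel are hypothetical; nothing like this exists in the `operator` module, and this sketch makes no attempt at matching its speed.

```python
_MISSING = object()  # sentinel, so that None remains usable as a default


def attrgetter_default(*names, default):
    """Hypothetical model of the proposed attrgetter(..., default=...)."""

    def resolve(obj, dotted):
        # Walk each component of a dotted name; stop at the first miss.
        for part in dotted.split('.'):
            obj = getattr(obj, part, _MISSING)
            if obj is _MISSING:
                return default
        return obj

    def getter(obj):
        values = tuple(resolve(obj, name) for name in names)
        # Like operator.attrgetter, a single name yields a bare value.
        return values[0] if len(values) == 1 else values

    return getter


print(attrgetter_default('x', 'y', default=7)(None))  # (7, 7)
print(attrgetter_default('x.y', default=7)(None))     # 7
```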

BTW: This is inspired by Clojure's get-in (http://bit.ly/GGzqjh) function.

I'll bring this up in python-ideas.
msg156593 - (view) Author: Miki Tebeka (tebeka) * Date: 2012-03-22 16:50
python-ideas post at https://groups.google.com/d/msg/python-ideas/lc_hkpKNvAg/ledftgY0mFUJ
msg156595 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-22 16:57
Great.  I'm going to close the issue, but it's only "for now": if you get a good response and a design agreement on python-ideas please reopen the issue.  Of course, we'd need a patch, too; and I believe there would need to be close to zero impact on performance, since as I understand it a large part of the point of attrgetter and friends is speed.
msg316181 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-05-04 17:10
I think this represents a legitimate use case. There's another request for this on python-ideas: https://groups.google.com/d/msg/python-ideas/0jeftqQpm9c/yZ_uKO84BAAJ
msg316196 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-05-05 05:29
FWIW, Joe Jevnik worked hard to squeeze every possible nanosecond out of these calls because they were frequently used.  Extending this API will tend to undo his efforts.

The original intent of itemgetter() was to provide a minimal alternative to writing a simple lambda -- it did not have the goal of being a parameterized way to express all possible calling patterns -- it just aspired to be good at the common cases.  In particular, itemgetter/attrgetter/methodcaller aimed at building key-functions for sorted/min/max/nlargest/nsmallest applied to a homogeneous list of records.  In none of those cases would it have been common or normal to have missing fields with meaningful default values.
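
As a reminder of that primary use case, here is a minimal sketch of the getters used as key functions over homogeneous records. The `Employee` record type and its data are made up for the example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# Hypothetical record type; every record has every field.
Employee = namedtuple('Employee', ['name', 'dept', 'salary'])
records = [
    Employee('alice', 'eng', 120),
    Employee('bob', 'ops', 95),
    Employee('carol', 'eng', 110),
]

# Sort on a compound key without writing a lambda.
by_dept_then_name = sorted(records, key=attrgetter('dept', 'name'))

# Index-based lookup works too, since namedtuples are tuples.
highest_paid = max(records, key=itemgetter(2))

print([e.name for e in by_dept_then_name])  # ['alice', 'carol', 'bob']
print(highest_paid.name)                    # alice
```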

Over the years, we've had various requests to extend the functionality in all kinds of exotic ways (e.g. multi-level gets like s[5][1][3] or Tcl-like keyed-list chains).  Some of those went far beyond the original scope, were non-harmonious with the current API, or ended up needing more complex and complete solutions.

The requests were rarely accompanied by meaningful use cases. For example, the referenced python-ideas post only included toy examples and its sole expressed motivation was a half thought-out notion of consistency with getattr().  The consistency argument doesn't make much sense because the itemgetter() and attrgetter() API had already gone down a different road. The getattr() function only looks up one attribute, while the itemgetter() and attrgetter() callable factories do multiple lookups such as attrgetter('lastname', 'firstname') or itemgetter(3, 8, 2).  It isn't clear that a default argument would make sense for those cases, nor would it handle cases where only one field had a default value but the others did not (I expect this would likely be more common than having a meaningful default value for every field).  Also, since multiple positional arguments are allowed, the default parameter would have to be a keyword argument (which isn't parallel with getattr()).
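
The difference in shape between the two APIs can be seen with today's behavior. This is a small sketch; the `person` object and its attribute values are made up.

```python
from operator import attrgetter, itemgetter
from types import SimpleNamespace

person = SimpleNamespace(lastname='Lovelace', firstname='Ada')

# getattr() looks up exactly one attribute; the third positional
# argument is the default.
nickname = getattr(person, 'nickname', None)

# attrgetter()/itemgetter() accept several positional keys and return a
# tuple, so there is no positional slot left over for a default.
full_name = attrgetter('lastname', 'firstname')(person)
picked = itemgetter(3, 8, 2)('abcdefghij')

print(nickname)   # None
print(full_name)  # ('Lovelace', 'Ada')
print(picked)     # ('d', 'i', 'c')
```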

IIRC, GvR at one time rejected a request for a list.get(index, default) method on the basis that it rarely made sense for indexed lookups; however, that seems very much like what is being proposed here for itemgetter().

Lastly, I'm concerned that every bit of extra functionality makes this API more complex (meaning harder to learn and remember) and makes it slower (undoing previous optimization efforts made to speed up its primary use cases for namedtuple() and as a key function).

Even now, these APIs are complex enough that very few developers know what functionality is already provided.  Even Python experts won't immediately have correct interpretations for the likes of methodcaller('name', 'foo', bar=1) or itemgetter(slice(2,None))('ABCDEFG').  In both cases (both of which are documented), the code would be better off with a plain lambda or def.  ISTM that pushing the itemgetter/attrgetter/methodcaller API further would be a readability anti-pattern.
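
For reference, this is what those two expressions do today, next to their plain equivalents. The `Greeter` class is a hypothetical stand-in for any object with a `name` method.

```python
from operator import itemgetter, methodcaller


class Greeter:
    """Hypothetical class, used only to give methodcaller a target."""

    def name(self, prefix, bar=0):
        return f'{prefix}-{bar}'


g = Greeter()

# methodcaller('name', 'foo', bar=1)(g) calls g.name('foo', bar=1).
via_methodcaller = methodcaller('name', 'foo', bar=1)(g)
via_lambda = (lambda obj: obj.name('foo', bar=1))(g)

# itemgetter(slice(2, None))(s) is the same as s[2:].
via_itemgetter = itemgetter(slice(2, None))('ABCDEFG')
via_slice = 'ABCDEFG'[2:]

print(via_methodcaller, via_lambda)  # foo-1 foo-1
print(via_itemgetter, via_slice)     # CDEFG CDEFG
```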

If this does go forward, I think we should look for some actual use cases in real code to help inform the decision of whether this would be a net win or whether it would result in code that wouldn't pass a code review.
msg316222 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-05-05 21:35
Here are a few more thoughts and a concrete example for clarity.

Itemgetter/Attrgetter were designed to make multiple lookups (possibly multi-level) and to return a tuple that need not be homogeneous.  A single default value likely doesn't make sense in any of those contexts.

Consider:

    attrgetter('programmer.lang', 'team.dept')

Various behaviors you might want:

    (o.programmer.lang if hasattr(o, 'programmer') else o.avg_joe.lang, o.team.dept)

    (o.programmer.lang if hasattr(o.programmer, 'lang') else 'python', o.team.dept)

    (o.programmer.lang, o.team.dept if hasattr(o.team, 'dept') else 'accounting')

    try:
        return (o.programmer.lang, o.team.dept)
    except AttributeError:
        return ('python', 'accounting')

Other than the OP's example with a universal default value of None (which would likely require more tests downstream), I can't think of any meaningful default value that would apply to both programmer.lang and team.dept.

If this proposal is to go forward, it needs a more informative use case than the toy example provided on python-ideas:
                
    p1 = {'x': 43, 'y': 55}
    x, y, z = itemgetter('x', 'y', 'z', default=0)(p1)
    print(x, y, z)
    43 55 0

And if it does go forward, I would suggest that the semantics of default be an entire result tuple rather than a scalar applied on an attribute by attribute basis:

    attrgetter('programmer.lang', 'team.dept', default=('python', 'accounting'))
    attrgetter('temperature', 'humidity', 'altitude', default=('22°C', '50%', "0 meters MSL"))
    attrgetter('foo', default=None)

That would still meet the OP's use case, but would be more broadly applicable and not make the assumption that every field has the same default value.  
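
A rough sketch of those whole-tuple semantics, written as a wrapper around the existing `attrgetter`. The name `attrgetter_or` is hypothetical, and the wrapper is only meant to show the behavior, not how it would be implemented in `operator`.

```python
from operator import attrgetter
from types import SimpleNamespace


def attrgetter_or(*names, default):
    """Hypothetical: `default` replaces the entire result if any lookup fails."""
    getter = attrgetter(*names)

    def wrapped(obj):
        try:
            return getter(obj)
        except AttributeError:
            # Note: this also hides AttributeErrors raised inside properties.
            return default

    return wrapped


get = attrgetter_or('programmer.lang', 'team.dept',
                    default=('python', 'accounting'))

complete = SimpleNamespace(programmer=SimpleNamespace(lang='c'),
                           team=SimpleNamespace(dept='infra'))
partial = SimpleNamespace(programmer=SimpleNamespace(lang='c'))

print(get(complete))  # ('c', 'infra')
print(get(partial))   # ('python', 'accounting')
```

With these semantics a partially populated object yields the full default tuple, even for fields that were present.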

Even with that improvement, users would still likely be better-off writing a function that explicitly says what they want to do.  For example, the OP's use case can already be written like this:

    get = lambda obj:  getattr(obj, 'foo', None)
    return get(args) or get(config) or get(env)

That is easy and unambiguous.  Writing a separate function allows all of the above cases to be written cleanly.  There isn't any need to trick-out itemgetter() for this.
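
As one example of such a separate function, here is a sketch that gives each field its own fallback. The function name and the particular choice of fallbacks are assumptions made for illustration; the other variants listed earlier would be written in the same explicit way.

```python
from types import SimpleNamespace


def lang_and_dept(o):
    """Hypothetical helper: per-field defaults, spelled out explicitly."""
    programmer = getattr(o, 'programmer', None)
    lang = getattr(programmer, 'lang', 'python')
    team = getattr(o, 'team', None)
    dept = getattr(team, 'dept', 'accounting')
    return lang, dept


full = SimpleNamespace(programmer=SimpleNamespace(lang='c'),
                       team=SimpleNamespace(dept='infra'))
no_team = SimpleNamespace(programmer=SimpleNamespace(lang='c'))

print(lang_and_dept(full))     # ('c', 'infra')
print(lang_and_dept(no_team))  # ('c', 'accounting')
```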
msg348890 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-02 07:28
Marking as closed for the reasons listed above.  Also, there have been no further comments for over a year.
History
Date                 User            Action  Args
2019-08-02 07:28:15  rhettinger      set     status: open -> closed
                                             resolution: rejected
                                             messages: + msg348890
2018-05-05 21:35:58  rhettinger      set     messages: + msg316222
2018-05-05 05:29:31  rhettinger      set     assignee: rhettinger ->
                                             messages: + msg316196
2018-05-04 17:10:23  gvanrossum      set     status: closed -> open
                                             nosy: + gvanrossum
                                             messages: + msg316181
                                             resolution: later -> (no value)
2012-03-22 19:12:55  rhettinger      set     assignee: rhettinger
2012-03-22 16:57:27  r.david.murray  set     status: open -> closed
                                             nosy: + rhettinger
                                             messages: + msg156595
                                             resolution: later
                                             stage: resolved
2012-03-22 16:50:25  tebeka          set     messages: + msg156593
2012-03-22 16:46:27  tebeka          set     messages: + msg156592
2012-03-22 01:35:32  r.david.murray  set     versions: - Python 3.4
                                             nosy: + r.david.murray
                                             messages: + msg156533
                                             type: enhancement
2012-03-22 01:09:18  tebeka          create