classification
Title: Add namedattrgetter function which acts like attrgetter but uses namedtuple
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Isaac Morland, rhettinger
Priority: normal Keywords:

Created on 2017-07-31 01:19 by Isaac Morland, last changed 2017-08-01 06:01 by rhettinger. This issue is now closed.

Messages (5)
msg299534 - Author: Isaac Morland (Isaac Morland) Date: 2017-07-31 01:19
This is meant to replace my proposal in #30020 to change attrgetter to use namedtuple. By creating a new function implemented in Python, I avoid changing the existing attrgetter, which eliminates both the need to implement a C version and the risk of altering the performance or other characteristics of the existing function.

My suggestion is to put this in the collections module next to namedtuple.  This eliminates the circular import problem and is a natural fit as it is an application of namedtuple.
msg299535 - Author: Isaac Morland (Isaac Morland) Date: 2017-07-31 01:21
Here is the diff.  Note that I assume implementation of #31085, which allows me to push determination of a name for the namedtuple down into namedtuple itself:

diff --git a/Lib/collections/__init__.py b/Lib/collections/__init__.py
index 62cf708..d507d23 100644
--- a/Lib/collections/__init__.py
+++ b/Lib/collections/__init__.py
@@ -14,7 +14,8 @@ list, set, and tuple.
 
 '''
 
-__all__ = ['deque', 'defaultdict', 'namedtuple', 'UserDict', 'UserList',
+__all__ = ['deque', 'defaultdict', 'namedtuple', 'namedattrgetter',
+            'UserDict', 'UserList',
             'UserString', 'Counter', 'OrderedDict', 'ChainMap']
 
 # For backwards compatibility, continue to make the collections ABCs
@@ -23,7 +24,7 @@ from _collections_abc import *
 import _collections_abc
 __all__ += _collections_abc.__all__
 
-from operator import itemgetter as _itemgetter, eq as _eq
+from operator import itemgetter as _itemgetter, attrgetter as _attrgetter, eq as _eq
 from keyword import iskeyword as _iskeyword
 import sys as _sys
 import heapq as _heapq
@@ -451,6 +452,14 @@ def namedtuple(typename, field_names, *, verbose=False, rename=False, module=Non
 
     return result
 
+def namedattrgetter (attr, *attrs):
+    ag = _attrgetter (attr, *attrs)
+
+    if attrs:
+        nt = namedtuple (None, (attr,) + attrs, rename=True)
+        return lambda obj: nt._make (ag (obj))
+    else:
+        return ag
 
 ########################################################################
 ###  Counter
msg299547 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-07-31 09:23
The principal use case for attrgetter() was to work with key-functions for sorted/min/max/groupby/nsmallest/nlargest.  Secondarily, it worked nicely with map() and filter() as a field extractor in a chain of iterators.  Neither of these use cases would benefit from creating a namedtuple.

What are your use cases that are creating a need for a variant of attrgetter that returns namedtuples?

Also, how would this be useful with rename=True?  The user of the result wouldn't know the fields names in advance and hence wouldn't be able to access them.

Do you know of any cases where someone has used this recipe in real code?  Has it been tried out on users other than yourself?
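
The rename=True question can be made concrete with a small sketch. The field names below are illustrative assumptions, not taken from the proposal: 'a.b' is a dotted path that attrgetter accepts but that is not a valid identifier, and 'class' is a keyword.

```python
from collections import namedtuple

# Without rename=True, an invalid field name is rejected outright.
try:
    namedtuple('Row', ('a.b', 'class', 'c'))
    strict_error = None
except ValueError as exc:
    strict_error = exc

# With rename=True, invalid names are replaced by positional names,
# so the caller has to know the position to predict the field name.
Row = namedtuple('Row', ('a.b', 'class', 'c'), rename=True)
fields = Row._fields
print(fields)  # ('_0', '_1', 'c')
```

The renamed fields are reachable only as `_0` and `_1` (or by index), which is the concern raised above.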
msg299553 - Author: Isaac Morland (Isaac Morland) Date: 2017-07-31 12:39
Maybe the issue is that I work with SQL constantly.  In SQL, if I say "SELECT a, b, c FROM t" and table t has columns a, b, c, d, e, f, I can still select a, b, and c from the result.  So to me it is natural that getting a bunch of attributes returns something (row or object, depending on the context), where the attributes are still labelled.

I understand why this was rejected as a universal change to attrgetter - in particular, I didn't re-evaluate the appropriateness of the change once I realized that attrgetter has a C implementation - but I don't understand why this isn't considered a natural option to provide.

Using rename=True is just a way of having it not blow up if an attribute name requiring renaming is supplied.  I agree that actually using such an attribute requires either guessing the name generated by the rename logic in namedtuple or using numeric indexing.  If namedtuple didn't have rename=True then I wouldn't try to re-implement it but since it does I figure it's worth typing ", rename=True" once - it's hardly going to hurt anything.

Finally as to use cases, I agree that if the only thing one is doing is sorting it doesn't matter.  But with groupby it can be very useful.  Say I have an iterator providing objects with fields (heading_id, heading_text, item_id, item_text).  I want to display each heading, followed by its items.

So, I groupby attrgetter ('heading_id', 'heading_text'), and write a loop something like this:

for heading, items in groupby (source, attrgetter ('heading_id', 'heading_text')):
    # display heading
    # refer to heading.heading_id and heading.heading_text
    for item in items:
        # display item
        # refer to item.item_id and item.item_text

Except I can't, because heading doesn't have attribute names.  If I replace attrgetter with namedattrgetter then I'm fine.  How would you write this?  In the past I've used items[0] but that is (a) ugly and (b) requires "items = list(items)" which is just noise.

I feel like depending on what is being done with map and filter you could have a similar situation where you want to refer to the specific fields of the tuple coming back from the function returned by attrgetter.
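
A minimal runnable sketch of the groupby use case described above. It assumes a local copy of namedattrgetter that uses '_' as a stand-in typename (the typename=None form in the diff depends on #31085), and the Record type and sample rows are hypothetical.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

def namedattrgetter(attr, *attrs):
    """Sketch of the proposal; '_' stands in for an auto-generated typename."""
    getter = attrgetter(attr, *attrs)
    if not attrs:
        return getter
    nt = namedtuple('_', (attr,) + attrs, rename=True)
    return lambda obj: nt._make(getter(obj))

# Hypothetical data, already sorted by heading as groupby requires.
Record = namedtuple('Record', 'heading_id heading_text item_id item_text')
source = [
    Record(1, 'Fruit', 10, 'apple'),
    Record(1, 'Fruit', 11, 'pear'),
    Record(2, 'Tools', 20, 'saw'),
]

lines = []
for heading, items in groupby(source, namedattrgetter('heading_id', 'heading_text')):
    # The group key keeps its attribute names.
    lines.append(f'{heading.heading_id}: {heading.heading_text}')
    for item in items:
        lines.append(f'  {item.item_id}: {item.item_text}')

print('\n'.join(lines))
```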
msg299602 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-01 06:01
Sorry Isaac, but I'm going to decline this feature request.  I know you're enthusiastic about this or some other variation but I don't think it is worthy of becoming part of the standard library.  I do encourage you to post this somewhere as a recipe (personally, I've used the ASPN cookbook to post my ideas) or as an offering on PyPI.

Reasons:

* The use cases are thin and likely to be uncommon.

* The recipe is short and doesn't add much value.

* The anonymous or autogenerated typename is unhelpful
  and the output doesn't look nice.

* It is already possible to combine a namedtuple with field
  extraction using a simple lambda.

* List comprehensions are clearer, easier, and more flexible
  for the task of extracting fields into a new named tuple.

* The combination of an anonymous or autogenerated typename
  along with automatic field renaming will likely cause
  more problems than it is worth.

* I don't expect this to mesh well with typing.NamedTuple
  and the needs of static typing tools.

* Debugging may become more challenging with implicitly
  created named tuples that have autogenerated type names.
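
The points about typenames and debugging can be illustrated by comparing reprs. This is only a sketch: '_' is the placeholder typename used in the experiments in this message, and the field values are made up.

```python
from collections import namedtuple

# A placeholder typename versus a descriptive one, same fields.
Anonymous = namedtuple('_', ['lname', 'fname'])
FullName = namedtuple('FullName', ['lname', 'fname'])

anon_repr = repr(Anonymous('smith', 'tom'))
named_repr = repr(FullName('smith', 'tom'))

print(anon_repr)   # _(lname='smith', fname='tom')
print(named_repr)  # FullName(lname='smith', fname='tom')
```

In a traceback or log, only the second form says what kind of record is being shown.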


-- My experiments with the API ------------------------------

from collections import namedtuple
from operator import attrgetter
from pprint import pprint

def namedattrgetter (attr, *attrs):
    ag = attrgetter (attr, *attrs)
    if attrs:
        nt = namedtuple ('_', (attr,) + attrs, rename=True)
        return lambda obj: nt._make (ag (obj))
    else:
        return ag

Person = namedtuple('Person', ['fname', 'lname', 'age', 'email'])
FullName = namedtuple('FullName', ['lname', 'fname'])

people = [
    Person('tom', 'smith', 50, 'tsmith@example.com'),
    Person('sue', 'henry', 40, 'shenry@example.com'),
    Person('hank', 'jones', 30, 'hjones@example.com'),
    Person('meg', 'davis', 20, 'mdavis@example.com'),
]

# Proposed way
pprint(list(map(namedattrgetter('lname', 'fname'), people)))

# Existing way with two-steps (attrgetter followed by nt._make)
pprint(list(map(FullName._make, map(attrgetter(*FullName._fields), people))))

# Existing way using a lambda
pprint(list(map(lambda p: FullName(p.lname, p.fname), people)))

# Best way with a plain list comprehension
pprint([FullName(p.lname, p.fname) for p in people])
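
For comparison, the groupby case raised earlier in the thread could be handled with an explicitly named tuple and a lambda key, along the lines of the "existing way" variants above. This is a sketch under assumptions: the Record and Heading types and the sample rows are hypothetical.

```python
from collections import namedtuple
from itertools import groupby

Record = namedtuple('Record', 'heading_id heading_text item_id item_text')
Heading = namedtuple('Heading', ['heading_id', 'heading_text'])

# Hypothetical data, already sorted by heading as groupby requires.
source = [
    Record(1, 'Fruit', 10, 'apple'),
    Record(1, 'Fruit', 11, 'pear'),
    Record(2, 'Tools', 20, 'saw'),
]

# The key function builds a named, documented type for each group key.
groups = [
    (heading, [item.item_text for item in items])
    for heading, items in groupby(
        source, key=lambda r: Heading(r.heading_id, r.heading_text))
]

for heading, texts in groups:
    print(heading.heading_text, texts)
```

The trade-off is one extra namedtuple definition in exchange for a stable typename and field names chosen by the caller.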
History
Date User Action Args
2017-08-01 06:01:15 rhettinger set status: open -> closed; resolution: rejected; messages: + msg299602; stage: resolved
2017-07-31 12:39:16 Isaac Morland set messages: + msg299553
2017-07-31 09:23:02 rhettinger set assignee: rhettinger; messages: + msg299547; nosy: + rhettinger
2017-07-31 01:21:23 Isaac Morland set messages: + msg299535
2017-07-31 01:19:07 Isaac Morland create