Issue42765
Created on 2020-12-28 15:26 by conqp, last changed 2022-04-11 14:59 by admin.
Messages (5) | |||
---|---|---|---|
msg383897 - (view) | Author: Richard Neumann (conqp) * | Date: 2020-12-28 15:26 | |
I have use cases in which I use named tuples to represent data sets, e.g:

    class BasicStats(NamedTuple):
        """Basic statistics response packet."""

        type: Type
        session_id: BigEndianSignedInt32
        motd: str
        game_type: str
        map: str
        num_players: int
        max_players: int
        host_port: int
        host_ip: IPAddressOrHostname

I want them to behave as intended, i.e. that unpacking them should behave as expected from a tuple:

    type, session_id, motd, … = BasicStats(…)

I also want to be able to serialize them to a JSON-ish dict. The NamedTuple has an _asdict method, that I could use.

    json = BasicStats(…)._asdict()

But for the dict to be passed to JSON, I need customization of the dict representation, e.g. set host_ip to str(self.host_ip), since it might be a non-serializable ipaddress.IPv{4,6}Address. Doing this in an object hook of json.dumps() is a non-starter, since I cannot force the user to remember, which types need to be converted on the several data structures. Also, using _asdict() seems strange as an exposed API, since it's an underscore method and users hence might not be inclined to use it.

So what I did is to add a method to_json() to convert the named tuple into a JSON-ish dict:

    def to_json(self) -> dict:
        """Returns a JSON-ish dict."""
        return {
            'type': self.type.value,
            'session_id': self.session_id,
            'motd': self.motd,
            'game_type': self.game_type,
            'map': self.map,
            'num_players': self.num_players,
            'max_players': self.max_players,
            'host_port': self.host_port,
            'host_ip': str(self.host_ip)
        }

It would be nicer to have my type just return this appropriate dict when invoking dict(BasicStats(…)). This would require me to override the __iter__() method to yield key / value tuples for the dict. However, this would break the natural behaviour of tuple unpacking as described above.

Hence, I propose to add a method __iter_items__(self) to the python data model with the following properties:

1) __iter_items__ is expected to return an iterator of 2-tuples representing key / value pairs.
2) the built-in function dict(), when called on an object, will attempt to create the object from __iter_items__ first and fall back to __iter__.

Alternative names could also be __items__ or __iter_dict__. |
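The two proposed properties can be emulated today with a helper function. The following is only a sketch of the proposed lookup order: `__iter_items__` is the hypothetical method name from the proposal and is not recognized by Python's built-in dict(), and `HostInfo` and `make_dict` are names invented for this illustration.

```python
from ipaddress import ip_address
from typing import Any, Iterator, NamedTuple, Tuple


class HostInfo(NamedTuple):
    """Hypothetical, shortened stand-in for the BasicStats tuple."""

    host_port: int
    host_ip: Any

    def __iter_items__(self) -> Iterator[Tuple[str, Any]]:
        # Hypothetical protocol method: the built-in dict() ignores it.
        yield 'host_port', self.host_port
        yield 'host_ip', str(self.host_ip)


def make_dict(obj: Any) -> dict:
    """Emulate the proposed dict() behaviour in user code."""
    iter_items = getattr(type(obj), '__iter_items__', None)
    if iter_items is not None:
        # Property 1: the method yields key / value 2-tuples.
        return dict(iter_items(obj))
    # Property 2: otherwise fall back to what dict() already does.
    return dict(obj)


info = HostInfo(25565, ip_address('127.0.0.1'))
port, ip = info          # tuple unpacking is unaffected
print(make_dict(info))
```

Because `__iter__` is left alone, unpacking keeps its tuple semantics while the helper produces the customized mapping.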
|||
msg383909 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2020-12-28 18:01 | |
The core of this idea is plausible. It is a common problem for people to want to teach a class how to convert itself to and from JSON. Altering the API for dicts is a major step, so you would need to take this to python-ideas to start getting buy-in. A much smaller API change would be to just teach the JSON module to recognize a __json__ method. Presumably if a robust serialization solution is created, people will need a way to deserialize back into a named tuple, data class, or custom class. Offhand, the only way I can think of to do this would be to add a field that could be recognized by json.load(). Some care would be needed to not create a pickle-like risk of arbitrary code execution. |
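As an illustration of the `__json__` idea, a custom encoder can approximate it in user code. This is a minimal sketch, assuming a hypothetical `__json__` method name that the standard json module does not recognize; `ProtocolEncoder` and `Server` are names made up for the example.

```python
import json
from ipaddress import ip_address


class ProtocolEncoder(json.JSONEncoder):
    """Encoder that consults a hypothetical __json__ method."""

    def default(self, o):
        # default() is only called for objects json cannot encode natively.
        to_json = getattr(o, '__json__', None)
        if callable(to_json):
            return to_json()
        return super().default(o)


class Server:
    """Hypothetical example class (deliberately not a tuple)."""

    def __init__(self, host_ip, host_port):
        self.host_ip = host_ip
        self.host_port = host_port

    def __json__(self):
        # Convert the non-serializable address to a string.
        return {'host_ip': str(self.host_ip), 'host_port': self.host_port}


text = json.dumps(Server(ip_address('::1'), 25565), cls=ProtocolEncoder)
print(text)
```

One limitation of this approach: named tuples are tuple subclasses, which json encodes as arrays before `default()` is ever reached, so the hook would not fire for them without support in the json module itself.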
|||
msg383933 - (view) | Author: Steven D'Aprano (steven.daprano) * | Date: 2020-12-28 21:49 | |
Hi Richard,

> Also, using _asdict() seems strange as an exposed API, since it's an underscore method and users hence might not be inclined to use it.

I don't consider this a strong argument. Named tuple in general has to use a naming convention for public methods that cannot clash with field names, hence the single underscore, but your concrete named tuple class can offer any methods you like since you know which field names are used and which are not. Just add a public method "asdict" or any name you prefer, and delegate to the single underscore method.

> It would be nicer to have my type just return this appropriate dict when invoking dict(BasicStats(…)).

As we speak there is a discussion on Python-Ideas about this.

https://mail.python.org/archives/list/python-ideas@python.org/thread/2HMRGJ672NDZJZ5PVLMNVW6KP7OHMQDI/#UYDIPMY2HXGL4OLEEFXBTZ2T4CK6TSVU

Your input would be appreciated.

> This would require me to override the __iter__() method to yield key / value tuples for the dict.

The dict constructor does not require that. See discussion on the thread above.

If you search the Python-Ideas archives, I am sure you will find past proposals for a `__json__` protocol. If I recall correctly, there was some concern about opening the flood-gates for dunder protocols (will this be followed with demands for __yaml__, __xml__, __cson__, __toml__, etc?) but perhaps the time is right to revisit this idea. |
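The delegation suggested above might look like the following sketch. `Stats` and its fields are hypothetical, shortened stand-ins for the reporter's class; only `_asdict()` is the real named tuple API.

```python
from typing import NamedTuple


class Stats(NamedTuple):
    """Hypothetical, shortened stand-in for BasicStats."""

    motd: str
    num_players: int

    def asdict(self) -> dict:
        """Public alias that delegates to the underscore method."""
        return self._asdict()


stats = Stats('hello', 3)
print(stats.asdict())
```

This is safe here because the class author knows that no field is named `asdict`.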
|||
msg384486 - (view) | Author: Richard Neumann (conqp) * | Date: 2021-01-06 10:53 | |
Thank you all for your input. I had a look at the aforementioned discussion and learned something new. So I tried to implement the dict data model by implementing keys() and __getitem__() accordingly:

    from typing import NamedTuple


    class Spamm(NamedTuple):

        foo: int
        bar: str

        def __getitem__(self, item):
            if isinstance(item, str):
                try:
                    return getattr(self, item)
                except AttributeError:
                    raise KeyError(item) from None

            return super().__getitem__(item)

        def keys(self):
            yield 'foo'
            yield 'bar'


    def main():

        spamm = Spamm(12, 'hello')
        print(spamm.__getitem__)
        print(spamm.__getitem__(1))
        d = dict(spamm)


    if __name__ == '__main__':
        main()

Unfortunately this will result in an error:

    Traceback (most recent call last):
      File "/home/neumann/test.py", line 4, in <module>
        class Spamm(NamedTuple):
    RuntimeError: __class__ not set defining 'Spamm' as <class '__main__.Spamm'>. Was __classcell__ propagated to type.__new__?

This seems to be caused by the use of super() in the __getitem__ implementation. I found a corresponding issue here: https://bugs.python.org/issue41629

Can I assume that this is a pending bug, and thus that I cannot implement the desired behaviour until it is fixed? |
|||
msg384488 - (view) | Author: Richard Neumann (conqp) * | Date: 2021-01-06 10:58 | |
Okay, I found the solution. Not using super() works:

    from typing import NamedTuple


    class Spamm(NamedTuple):

        foo: int
        bar: str

        def __getitem__(self, index_or_key):
            if isinstance(index_or_key, str):
                try:
                    return getattr(self, index_or_key)
                except AttributeError:
                    raise KeyError(index_or_key) from None

            return tuple.__getitem__(self, index_or_key)

        def keys(self):
            yield 'foo'
            yield 'bar'


    def main():

        spamm = Spamm(12, 'hello')
        print(spamm.__getitem__)
        print(spamm.__getitem__(1))
        d = dict(spamm)
        print(d)


    if __name__ == '__main__':
        main()

Result:

    <bound method Spamm.__getitem__ of Spamm(foo=12, bar='hello')>
    hello
    {'foo': 12, 'bar': 'hello'}
 |
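A possible variation on the workaround, offered only as a sketch: derive the keys from the named tuple's own `_fields` attribute instead of listing them by hand, and accept only field names as string keys. `Point` is a hypothetical example class.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """Hypothetical example class."""

    x: int
    y: int

    def __getitem__(self, index_or_key):
        if isinstance(index_or_key, str):
            # Accept field names only, so point['keys'] raises KeyError
            # instead of returning the bound method.
            if index_or_key in self._fields:
                return getattr(self, index_or_key)
            raise KeyError(index_or_key)
        # Integer and slice access keep plain tuple behaviour.
        return tuple.__getitem__(self, index_or_key)

    def keys(self):
        # dict() looks for keys() and then calls __getitem__ per key.
        return self._fields


point = Point(1, 2)
x, y = point             # unpacking still iterates over the values
print(dict(point))
```

Checking against `_fields` avoids a quirk of the getattr() approach, where any attribute name, including method names, would be treated as a valid key.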
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:39 | admin | set | github: 86931 |
2021-01-06 10:58:49 | conqp | set | messages: + msg384488 |
2021-01-06 10:53:10 | conqp | set | messages: + msg384486 |
2020-12-29 00:56:45 | eric.smith | set | nosy: + eric.smith |
2020-12-28 21:49:50 | steven.daprano | set | nosy: + steven.daprano; messages: + msg383933 |
2020-12-28 18:02:53 | rhettinger | set | nosy: + bob.ippolito |
2020-12-28 18:01:33 | rhettinger | set | messages: + msg383909 |
2020-12-28 15:37:30 | xtreak | set | nosy: + rhettinger |
2020-12-28 15:26:40 | conqp | create |