Issue36662
Created on 2019-04-18 20:31 by gsakkis, last changed 2022-04-11 14:59 by admin.
Messages (4) | |||
---|---|---|---|
msg340511 - (view) | Author: George Sakkis (gsakkis) | Date: 2019-04-18 20:31 | |
I'd like to propose two new optional boolean parameters to the @dataclass() decorator, `asdict` and `astuple`, that if true, the respective methods are generated as equivalent to the module-level namesake functions.

In addition to saving an extra imported name, the main benefit is performance. By having access to the specific fields of the decorated class, it should be possible to generate a more efficient implementation than the one in the respective function.

To illustrate the difference in performance, the asdict method is 28 times faster than the function in the following PEP 557 example:

    @dataclass
    class InventoryItem:
        '''Class for keeping track of an item in inventory.'''
        name: str
        unit_price: float
        quantity_on_hand: int = 0

        def asdict(self):
            return {
                'name': self.name,
                'unit_price': self.unit_price,
                'quantity_on_hand': self.quantity_on_hand,
            }

    In [4]: i = InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)

    In [5]: asdict(i) == i.asdict()
    Out[5]: True

    In [6]: %timeit asdict(i)
    5.45 µs ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

    In [7]: %timeit i.asdict()
    193 ns ± 0.443 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Thoughts? |
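As a rough illustration of the idea in the message above, the sketch below generates a flat `asdict` method from the field names known at decoration time. The decorator name `with_flat_asdict` is hypothetical and not part of the `dataclasses` module; the generated method does no recursion or copying, so it matches `dataclasses.asdict` only when every field holds a plain value.

```python
from dataclasses import dataclass, fields, asdict


def with_flat_asdict(cls):
    """Hypothetical decorator: attach a generated, non-recursive asdict()."""
    names = [f.name for f in fields(cls)]
    # Build source such as: {'name': self.name, 'unit_price': self.unit_price}
    items = ", ".join(f"{n!r}: self.{n}" for n in names)
    source = f"def asdict(self):\n    return {{{items}}}\n"
    namespace = {}
    exec(source, namespace)  # compile the method with hard-coded field names
    cls.asdict = namespace["asdict"]
    return cls


@with_flat_asdict
@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0


item = InventoryItem(name="widget", unit_price=3.0, quantity_on_hand=10)
# Equal here only because every field holds a plain (non-dataclass) value.
assert item.asdict() == asdict(item)
```

Generating source and compiling it once avoids the per-call loop over `fields()`, which is presumably where part of the measured speedup comes from.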
|||
msg340523 - (view) | Author: Karthikeyan Singaravelan (xtreak) * | Date: 2019-04-19 05:01 | |
asdict method in the benchmark does a direct dictionary construction. Meanwhile dataclasses.asdict does more work in https://github.com/python/cpython/blob/e8113f51a8bdf33188ee30a1c038a298329e7bfa/Lib/dataclasses.py#L1023 . Hence in the example i.asdict() and asdict(i) are not equivalent.

    import timeit
    from dataclasses import dataclass, asdict

    @dataclass
    class InventoryItem:
        '''Class for keeping track of an item in inventory.'''
        name: str
        unit_price: float
        quantity_on_hand: int = 0

        def asdict(self):
            data = {'name': self.name,
                    'unit_price': self.unit_price,
                    'quantity_on_hand': self.quantity_on_hand,
            }
            return data

    i = InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)

    setup = """from dataclasses import dataclass, asdict;

    @dataclass
    class InventoryItem:
        '''Class for keeping track of an item in inventory.'''
        name: str
        unit_price: float
        quantity_on_hand: int = 0

        def asdict(self):
            data = {'name': self.name,
                    'unit_price': self.unit_price,
                    'quantity_on_hand': self.quantity_on_hand,
            }
            return data

    i = InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)"""

    print("asdict(i)")
    print(timeit.Timer("asdict(i)", setup=f"{setup}").timeit(number=1_000_000))
    print("i.asdict()")
    print(timeit.Timer("i.asdict()", setup=f"{setup}").timeit(number=1_000_000))
    print("i.inlined_asdict()")
    print(timeit.Timer("i.inlined_asdict(i)", setup=f"{setup}; i.inlined_asdict = asdict").timeit(number=1_000_000))

    i.inlined_asdict = asdict
    assert asdict(i) == i.asdict() == i.inlined_asdict(i)

    ./python.exe ../backups/bpo36662.py
    asdict(i)
    11.585838756000001
    i.asdict()
    0.44129350699999925
    i.inlined_asdict()
    11.858042807999999 |
|||
msg340532 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2019-04-19 08:55 | |
I think the best thing to do is write another decorator that adds this method. I've often thought that having a dataclasses_tools third-party module would be a good idea. It could include my add_slots decorator in https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py

Such a decorator could then deal with all the complications that I don't want to add to @dataclass. For example, choosing a method name. @dataclass doesn't inject any non-dunder names in the class, but the new decorator could, or it could provide a way to customize the member name.

Also, note that your example asdict method doesn't do the same thing as dataclasses.asdict. While you get some speedup by knowing the field names in advance, you also don't do the recursive generation that dataclasses.asdict does. In order to skip the recursive dict generation, you'd either have to test the type of each member (using some heuristic about what doesn't need recursion), or assume the member type matches the type defined in the class. I don't want dataclasses.asdict to make the assumption that the member type matches the declared type. There's nowhere else it does this.

I'm not sure how much of the speedup you're seeing is the result of hard-coding the member names, and how much is avoiding recursion. If all of the improvement is by eliminating recursion, then it's not worth doing.

I'm not saying the existing dataclasses.asdict can't be sped up: surely it can. But I don't want to remove features or add complexity to do so. |
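The difference between a hand-written flat method and the recursive `dataclasses.asdict` can be seen with a nested dataclass. This is only a small sketch; the classes `Point` and `Segment` and the method name `flat_asdict` are made up for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Segment:
    start: Point
    end: Point

    def flat_asdict(self):
        # Hand-written, non-recursive version (hypothetical name).
        return {"start": self.start, "end": self.end}


seg = Segment(Point(0, 0), Point(3, 4))

recursive = asdict(seg)
flat = seg.flat_asdict()

# dataclasses.asdict converts nested dataclass instances into dicts as well.
assert recursive == {"start": {"x": 0, "y": 0}, "end": {"x": 3, "y": 4}}
# The flat version leaves the nested Point instances untouched.
assert flat["start"] is seg.start
assert recursive != flat
```

For classes whose fields are all plain values the two results compare equal, which is why the benchmark in the original report did not expose the difference.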
|||
msg340537 - (view) | Author: George Sakkis (gsakkis) | Date: 2019-04-19 10:53 | |
> I think the best thing to do is write another decorator that adds this method. I've often thought that having a dataclasses_tools third-party module would be a good idea.

I'd be happy with a separate decorator in the standard library for adding these methods. Not so sure about a third-party module, the added value is probably not high enough to justify an extra dependency (assuming one is aware it exists in the first place).

> or assume the member type matches the type defined in the class.

This doesn't seem an unreasonable assumption to me. If I'm using a dataclass, I probably care enough about its member types to bother declaring them and I wouldn't mind if a particular method expects that the members actually match the types. This behaviour would be clearly documented.

Alternatively, if we go with a separate decorator, whether this assumption holds could be a parameter, something like:

    def add_asdict(cls, name='asdict', strict=True) |
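One possible reading of the `add_asdict` signature above is sketched below. No such decorator exists in the standard library, and the meaning given to `strict` here is an assumption: when `strict=True` and every field is declared with one of a few plain types, the method trusts the declarations and copies values directly; otherwise it falls back to `dataclasses.asdict`. The `_ATOMIC` tuple is likewise an illustrative choice, and the signature is adapted so the decorator can also be used with arguments.

```python
from dataclasses import dataclass, fields, asdict

# Declared types assumed to need no recursion or copying (an illustrative choice).
_ATOMIC = (str, int, float, bool, bytes, type(None))


def add_asdict(cls=None, *, name="asdict", strict=True):
    """Hypothetical decorator sketch; not part of the dataclasses module."""
    def wrap(cls):
        names = [f.name for f in fields(cls)]
        all_atomic = all(f.type in _ATOMIC for f in fields(cls))

        if strict and all_atomic:
            # Fast path: trust the declared types and copy values directly.
            def method(self):
                return {n: getattr(self, n) for n in names}
        else:
            # Safe path: defer to the recursive standard implementation.
            def method(self):
                return asdict(self)

        method.__name__ = name
        setattr(cls, name, method)
        return cls

    return wrap if cls is None else wrap(cls)


@add_asdict(name="to_dict")
@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0


item = InventoryItem("widget", 3.0, 10)
assert item.to_dict() == asdict(item)

# If a value does not match its declared type, the strict fast path shares
# the object, whereas dataclasses.asdict builds a new list.
odd = InventoryItem(name=["not", "a", "str"], unit_price=3.0)
assert odd.to_dict()["name"] is odd.name
assert asdict(odd)["name"] is not odd.name
```

The last two assertions show the trade-off discussed in the thread: trusting the declared types is faster but silently changes behaviour when a member holds something else.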
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:14 | admin | set | github: 80843 |
2019-04-19 10:53:51 | gsakkis | set | messages: + msg340537 |
2019-04-19 08:55:08 | eric.smith | set | assignee: eric.smith messages: + msg340532 |
2019-04-19 08:37:53 | matrixise | set | nosy: + matrixise |
2019-04-19 05:01:37 | xtreak | set | nosy: + xtreak messages: + msg340523 |
2019-04-19 02:35:44 | xtreak | set | nosy: + rhettinger, eric.smith |
2019-04-18 20:31:10 | gsakkis | create |