classification
Title: dataclasses.astuple does deepcopy on all fields
Type: Stage: patch review
Components: Library (Lib) Versions: Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: andrei.avk, eric.smith, mandolaerik, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-04-21 12:19 by mandolaerik, last changed 2021-06-21 19:42 by andrei.avk.

Pull Requests
URL Status Linked Edit
PR 26154 open andrei.avk, 2021-05-15 22:19
Messages (7)
msg391516 - (view) Author: Erik Carstensen (mandolaerik) Date: 2021-04-21 12:19
It seems that the 'dataclasses.astuple' function does a deepcopy of all fields. This is not documented. Two problems:

1. Dictionary keys that rely on object identity are ruined:
    import dataclasses
    @dataclasses.dataclass
    class Foo:
        key: object
    key = object()
    lut = {key: 5}
    (y,) = dataclasses.astuple(Foo(key))
    # KeyError: y is a deep copy of key, not the same object
    lut[y]

2. dataclasses can only be converted to a tuple if all fields are serializable:

    import dataclasses
    @dataclasses.dataclass
    class Foo:
        f: object
    foo = Foo(open('test.py'))
    dataclasses.astuple(foo)

->

TypeError: cannot pickle '_io.TextIOWrapper' object


In my use case, I just want a list of all fields. I can do the following as a workaround:
  (getattr(foo, field.name) for field in dataclasses.fields(foo))

Tested on Python 3.8.7 and 3.7.9.
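
The workaround above can be wrapped in a small helper. This is a minimal sketch, assuming a hypothetical function name `shallow_astuple` that is not part of the `dataclasses` module. It returns the field values themselves, without copying or recursing, so object identity is kept:

```python
import dataclasses


def shallow_astuple(obj):
    """Return the field values of a dataclass instance as a tuple, uncopied.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if not dataclasses.is_dataclass(obj) or isinstance(obj, type):
        raise TypeError("shallow_astuple() expects a dataclass instance")
    return tuple(getattr(obj, f.name) for f in dataclasses.fields(obj))


@dataclasses.dataclass
class Foo:
    key: object


key = object()
lut = {key: 5}
(y,) = shallow_astuple(Foo(key))
print(y is key)  # the original object is returned, not a copy
print(lut[y])    # the identity-based lookup succeeds
```

Unlike `dataclasses.astuple`, this does not descend into nested dataclasses or containers, which may or may not be what a given caller wants.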
msg391518 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-21 12:25
Unfortunately this can't be changed, although I suppose it should be documented.

In general I think this API was a mistake, and should not have been added. There are just too many cases where it doesn't do what you want, or where it fails.

I'd like to deprecate it and remove it (along with asdict), but I fear that would be too disruptive.

Your approach seems reasonable to me for your use case.
msg391519 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-21 12:52
Why is deepcopy used at all? It is a very specific feature which should not be used by default. If you want to make a deep copy of fields, you can call copy.deepcopy() explicitly.

    copy.deepcopy(dataclasses.astuple(obj))

or

    dataclasses.astuple(copy.deepcopy(obj))
msg391520 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-21 13:20
The reason for the deep copying was to support changing a hierarchy of dataclasses into something that could be JSON serialized. But it didn't really work out. It recurses into dataclasses, namedtuples, lists, tuples, and dicts, and deep copies everything else.
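
The behavior described above can be observed with a short experiment. This is only an illustrative sketch; the class names `Inner` and `Outer` are made up for the example:

```python
import dataclasses


@dataclasses.dataclass
class Inner:
    value: object


@dataclasses.dataclass
class Outer:
    inner: Inner
    items: list
    marker: object


marker = object()
outer = Outer(Inner(marker), [marker], marker)

inner_tuple, items_copy, marker_copy = dataclasses.astuple(outer)

# The nested dataclass is recursed into and converted to a tuple.
print(isinstance(inner_tuple, tuple))
# The list is rebuilt as a new list rather than returned as-is.
print(items_copy is outer.items)
# A plain object that is none of the recursed types is deep copied.
print(marker_copy is marker)
```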

As I said, it's a design flaw.
msg394015 - (view) Author: Erik Carstensen (mandolaerik) Date: 2021-05-20 11:43
Would it make sense to make dataclasses iterable, like so?

    def __iter__(self):
        return (getattr(self, field.name) for field in fields(self))

With that in place, deprecating astuple would maybe be less disruptive?
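
For reference, a single class can already opt in to this behavior on its own. A minimal sketch, assuming a hypothetical `Point` dataclass:

```python
import dataclasses


@dataclasses.dataclass
class Point:
    x: int
    y: int

    def __iter__(self):
        # Yield the field values themselves, in definition order, uncopied.
        return (getattr(self, f.name) for f in dataclasses.fields(self))


p = Point(1, 2)
x, y = p          # unpacking works through __iter__
print(tuple(p))
```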
msg394019 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-05-20 12:10
No, iteration is explicitly a non-goal of PEP 557. See the section on namedtuple for why: https://www.python.org/dev/peps/pep-0557/#why-not-just-use-namedtuple
msg396284 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-06-21 19:42
I've added a PR here: https://github.com/python/cpython/pull/26154
History
Date User Action Args
2021-06-21 19:42:03 andrei.avk set messages: + msg396284
2021-05-20 12:10:46 eric.smith set messages: + msg394019
2021-05-20 11:43:00 mandolaerik set messages: + msg394015
2021-05-15 22:19:51 andrei.avk set keywords: + patch; nosy: + andrei.avk; pull_requests: + pull_request24788; stage: patch review
2021-04-21 13:20:47 eric.smith set messages: + msg391520
2021-04-21 12:52:41 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg391519
2021-04-21 12:25:03 eric.smith set assignee: eric.smith; messages: + msg391518; nosy: + eric.smith
2021-04-21 12:19:17 mandolaerik create