This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Frozen dataclass deep copy doesn't work with __slots__
Type: behavior Stage: patch review
Components: Versions: Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, jfuruness, josh.r, sobolevn
Priority: normal Keywords: patch

Created on 2021-10-18 23:18 by jfuruness, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 29147 open sobolevn, 2021-10-22 08:08
Messages (5)
msg404245 - (view) Author: Justin Furuness (jfuruness) Date: 2021-10-18 23:18
If you define a frozen dataclass with slots and deep copy it, an error will occur. If you run the same code and remove the slots, the error will not occur. I assume this behavior is not intentional? Apologies if I'm submitting this wrong, this is the first time I've submitted an issue here so I'm not quite sure how to do it properly.

Example below:

```
from dataclasses import dataclass
from copy import deepcopy

@dataclass(frozen=True)
class FrozenData:
    # Without slots no errors occur?
    __slots__ = "my_string",

    my_string: str

deepcopy(FrozenData(my_string="initial"))
```

Error that occurs:
```
dataclasses.FrozenInstanceError: cannot assign to field 'my_string'
```
msg404260 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2021-10-19 01:25
When I define this with the new-in-3.10 slots=True argument to dataclass rather than manually defining __slots__ it works just fine. Looks like the pickle format changes rather dramatically to accommodate it.

>>> @dataclass(frozen=True, slots=True)
... class FrozenData:
...     my_string: str
...
>>> deepcopy(FrozenData('initial'))
FrozenData(my_string='initial')

Is there a strong motivation to support manually defined __slots__ on top of slots=True that warrants fixing it for 3.10 onward?
msg404273 - (view) Author: Justin Furuness (jfuruness) Date: 2021-10-19 07:56
I didn't realize it was possible to add slots in that way (slots=True). I
think for normal classes manually defining __slots__ is the common way of
adding __slots__ (at least for me), so it feels weird to me that manually
defining __slots__ instead of adding slots=True produces different
behaviour. That being said, no strong motivation to change it, this fixes
my issue, feel free to close this.

Thanks,
Justin

On Mon, Oct 18, 2021 at 9:25 PM Josh Rosenberg <report@bugs.python.org>
wrote:

>
> Josh Rosenberg <shadowranger+python@gmail.com> added the comment:
>
> When I define this with the new-in-3.10 slots=True argument to dataclass
> rather than manually defining __slots__ it works just fine. Looks like the
> pickle format changes rather dramatically to accommodate it.
>
> >>> @dataclass(frozen=True, slots=True)
> ... class FrozenData:
> ...     my_string: str
> ...
> >>> deepcopy(FrozenData('initial'))
> FrozenData(my_string='initial')
>
> Is there a strong motivation to support manually defined __slots__ on top
> of slots=True that warrants fixing it for 3.10 onward?
>
> ----------
> nosy: +josh.r
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue45520>
> _______________________________________
>
msg404354 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2021-10-19 21:28
You're right that in non-dataclass scenarios, you'd just use __slots__.

The slots=True thing was necessary for any case where any of the dataclass's attributes have default values (my_int: int = 0), or are defined with fields (my_list: list = field(default_factory=list)). The problem is that __slots__ is implemented by, after the class definition ends, creating descriptors on the class to access the data stored at known offsets in the underlying PyObject structure. Those descriptors themselves being class attributes means that when the type definition machinery tries to use __slots__ to create them, it finds conflicting class attributes (the defaults/fields) that already exist and explodes.

Adding support for slots=True means it does two things:

1. It completely defines the class without slots, extracts the stuff it needs to make the dataclass separately, then deletes it from the class definition namespace and makes a *new* class with __slots__ defined (so no conflict occurs)
2. It checks if the dataclass is also frozen, and applies alternate __getstate__/__setstate__ methods that are compatible with a frozen, slotted dataclass

#2 is what fixes this bug (while #1 makes it possible to use the full range of dataclass features without sacrificing the ability to use __slots__). If you need this to work in 3.9, you could borrow the 3.10 implementations that make this work for frozen dataclasses to explicitly define __getstate__/__setstate__ for your frozen slotted dataclasses:

def __getstate__(self):
    return [getattr(self, f.name) for f in fields(self)]


def __setstate__(self, state):
    for field, value in zip(fields(self), state):
        # use setattr because dataclass may be frozen
        object.__setattr__(self, field.name, value)

I'm not closing this since backporting just the fix for frozen slotted dataclasses (without backporting the full slots=True functionality that's a new feature) is possibly within scope for a bugfix release of 3.9 (it wouldn't change the behavior of working code, and fixes broken code that might reasonably be expected to work).
msg404525 - (view) Author: Justin Furuness (jfuruness) Date: 2021-10-20 21:01
Thank you for the in-depth explanation. That all makes sense to me, I have
run into the __slots__ with defaults issues before, I'll be sure to try out
these fixes. I appreciate you taking the time.

Thanks,
Justin

On Tue, Oct 19, 2021 at 5:28 PM Josh Rosenberg <report@bugs.python.org>
wrote:

>
> Josh Rosenberg <shadowranger+python@gmail.com> added the comment:
>
> You're right that in non-dataclass scenarios, you'd just use __slots__.
>
> The slots=True thing was necessary for any case where any of the
> dataclass's attributes have default values (my_int: int = 0), or are
> defined with fields (my_list: list = field(default_factory=list)). The
> problem is that __slots__ is implemented by, after the class definition
> ends, creating descriptors on the class to access the data stored at known
> offsets in the underlying PyObject structure. Those descriptors themselves
> being class attributes means that when the type definition machinery tries
> to use __slots__ to create them, it finds conflicting class attributes (the
> defaults/fields) that already exist and explodes.
>
> Adding support for slots=True means it does two things:
>
> 1. It completely defines the class without slots, extracts the stuff it
> needs to make the dataclass separately, then deletes it from the class
> definition namespace and makes a *new* class with __slots__ defined (so no
> conflict occurs)
> 2. It checks if the dataclass is also frozen, and applies alternate
> __getstate__/__setstate__ methods that are compatible with a frozen,
> slotted dataclass
>
> #2 is what fixes this bug (while #1 makes it possible to use the full
> range of dataclass features without sacrificing the ability to use
> __slots__). If you need this to work in 3.9, you could borrow the 3.10
> implementations that make this work for frozen dataclasses to explicitly
> define __getstate__/__setstate__ for your frozen slotted dataclasses:
>
> def __getstate__(self):
>     return [getattr(self, f.name) for f in fields(self)]
>
>
> def __setstate__(self, state):
>     for field, value in zip(fields(self), state):
>         # use setattr because dataclass may be frozen
>         object.__setattr__(self, field.name, value)
>
> I'm not closing this since backporting just the fix for frozen slotted
> dataclasses (without backporting the full slots=True functionality that's a
> new feature) is possibly within scope for a bugfix release of 3.9 (it
> wouldn't change the behavior of working code, and fixes broken code that
> might reasonably be expected to work).
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue45520>
> _______________________________________
>
History
Date User Action Args
2022-04-11 14:59:51adminsetgithub: 89683
2021-10-22 08:08:50sobolevnsetkeywords: + patch
nosy: + sobolevn

pull_requests: + pull_request27423
stage: patch review
2021-10-20 21:01:44jfurunesssetmessages: + msg404525
2021-10-19 21:28:37josh.rsetmessages: + msg404354
2021-10-19 07:56:45jfurunesssetmessages: + msg404273
2021-10-19 01:25:42josh.rsetnosy: + josh.r
messages: + msg404260
2021-10-19 00:52:45FFY00setnosy: + eric.smith
2021-10-18 23:18:22jfurunesscreate