classification
Title: dataclass(slots=True) does not account for slots in base classes
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.11, Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: AlexWaygood, Spencer Brown, ariebovenberg, eric.smith, hynek, sobolevn
Priority: normal Keywords:

Created on 2022-01-14 19:31 by ariebovenberg, last changed 2022-01-20 08:05 by ariebovenberg.

Messages (9)
msg410591 - Author: Arie Bovenberg (ariebovenberg) * Date: 2022-01-14 19:31
@dataclass(slots=True) adds slots to dataclasses. It adds a slot per field. 
However, it doesn't account for slots already present in base classes:

>>> class Base:
...     __slots__ = ('a', )
...
>>> @dataclass(slots=True)
... class Foo(Base):
...     a: int
...     b: float
...
>>> Foo.__slots__
('a', 'b')  # should be: ('b', )


The __slots__ documentation says:

    If a class defines a slot also defined in a base class, the instance variable 
    defined by the base class slot is inaccessible (except by retrieving its descriptor 
    directly from the base class). This renders the meaning of the program undefined. 
    In the future, a check may be added to prevent this.

Proposed solution: don't add slots that are already defined in a base class. The generated class should then behave like this hand-written equivalent:

>>> @dataclass
... class Bla(Base):
...     __slots__ = ('b', )
...     a: int
...     b: float
...
>>> Bla(4, 5.65)
Bla(a=4, b=5.65)

If you agree, I'd like to submit a PR to fix this. I already have a prototype working.
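
A minimal sketch of the filtering described above, not the actual prototype or the `dataclasses` implementation. The helper names `inherited_slots` and `slots_for_fields` are hypothetical, and the sketch assumes every base declares `__slots__` as a string or a reusable iterable (not an already-consumed iterator):

```python
def inherited_slots(bases):
    """Return the slot names declared anywhere in the MRO of the given bases."""
    names = set()
    for base in bases:
        for klass in base.__mro__:
            slots = klass.__dict__.get("__slots__", ())
            if isinstance(slots, str):
                # A bare string declares a single slot.
                slots = (slots,)
            names.update(slots)
    return names


def slots_for_fields(bases, field_names):
    """Keep only the field names that no base class already has a slot for."""
    taken = inherited_slots(bases)
    return tuple(name for name in field_names if name not in taken)


class Base:
    __slots__ = ("a",)


# Fields 'a' and 'b' on a subclass of Base: only 'b' needs a new slot.
new_slots = slots_for_fields((Base,), ("a", "b"))
print(new_slots)
```

Walking the full MRO rather than only the direct bases matters because a slot may be declared several levels up the hierarchy.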
msg410601 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-01-14 21:32
I'll have to do some more research. But your analysis looks correct to me, so far.
msg410628 - Author: Arie Bovenberg (ariebovenberg) * Date: 2022-01-15 07:33
There are already two complications I can think of:

1. This change may break code that uses __slots__ to iterate over the fields
   of a dataclass. Solution: explicitly mention in the docs that
   not every field may get a slot on the new class, and advise users to call
   `fields()` to iterate over the fields instead.
2. It's technically allowed for __slots__ to be an iterator (which will then be 
   exhausted at class creation). Finding the __slots__ of such a class
   may require more elaborate introspection.
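
Both points can be illustrated with a short sketch. It assumes CPython, where each slot is exposed as a `types.MemberDescriptorType` object in the class namespace; `slot_descriptors` is a hypothetical helper, not an existing API:

```python
import types
from dataclasses import dataclass, fields


class Base:
    __slots__ = ("a",)


@dataclass
class Foo(Base):
    __slots__ = ("b",)  # only the slot that Base does not already provide
    a: int
    b: float


# Point 1: fields() reports every field, __slots__ only the newly added slot.
field_names = [f.name for f in fields(Foo)]
own_slots = Foo.__slots__


class IterSlots:
    # Point 2: an iterator is consumed while the class is being created.
    __slots__ = iter(["x", "y"])


leftover = list(IterSlots.__slots__)


def slot_descriptors(cls):
    """Names of the slot descriptors defined directly on cls (CPython detail)."""
    return sorted(
        name
        for name, value in vars(cls).items()
        if isinstance(value, types.MemberDescriptorType)
    )


recovered = slot_descriptors(IterSlots)
print(field_names, own_slots, leftover, recovered)
```

Inspecting descriptors recovers the slot names even when `__slots__` itself is exhausted, but it relies on an implementation detail, which is why it counts as the more elaborate introspection mentioned in point 2.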
msg410630 - Author: Nikita Sobolev (sobolevn) * (Python triager) Date: 2022-01-15 08:23
Arie, can you please explain what the technical difference is between these two cases:

```python
class A:
    __slots__ = ('a', )
    # fields

class B(A):
    __slots__ = ('a', 'b')
    # fields
```

And:

```python
class C:
    __slots__ = ('a', )
    # fields

class D(C):
    __slots__ = ('b', )
    # fields
```

?
msg410641 - Author: Spencer Brown (Spencer Brown) * Date: 2022-01-15 11:05
Both will function, but class B will add its slots after A's, leaving an extra unused slot in the object that you can only access by directly using the A.a descriptor. So every slotted dataclass that inherits from a slotted base and redeclares the base's fields makes its instances use more memory than necessary.
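
A small sketch of that difference, using the class names from the question above. The size comparison relies on `__basicsize__`, which is a CPython detail, so the exact number of extra bytes is left unstated:

```python
class A:
    __slots__ = ("a",)


class B(A):
    __slots__ = ("a", "b")  # redeclares 'a', shadowing A's slot


class C:
    __slots__ = ("a",)


class D(C):
    __slots__ = ("b",)  # adds only the new slot


b = B()
b.a = 1  # stored in B's own 'a' slot

# The slot inherited from A still exists, but is only reachable via A.a.
A.a.__set__(b, "hidden")
shadowed = A.a.__get__(b)

# B instances reserve room for three slots, D instances for two.
extra_bytes = B.__basicsize__ - D.__basicsize__
print(b.a, shadowed, extra_bytes > 0)
```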
msg410642 - Author: Arie Bovenberg (ariebovenberg) * Date: 2022-01-15 11:58
Spencer is correct.

The documentation even adds: "This renders the meaning of the program undefined."

It's clear it doesn't break anything users would often encounter (we would have heard about it), but it's still undefined behavior.
msg410820 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-01-17 20:13
It would also be interesting to see what attrs does in this case.
msg410841 - Author: Hynek Schlawack (hynek) * (Python committer) Date: 2022-01-18 06:11
>>> @attrs.define
... class C(Base):
...   a: int
...   b: int
...
>>> C.__slots__
('b', '__weakref__')

We've got a test specifically for this use case: https://github.com/python-attrs/attrs/blob/5f36ba9b89d4d196f80147d4f2961fb2f97ae2e5/tests/test_slots.py#L309-L334
msg411010 - Author: Arie Bovenberg (ariebovenberg) * Date: 2022-01-20 08:05
@hynek interesting! 

The discussion in https://github.com/python-attrs/attrs/pull/420 on the weakref slot is very interesting as well.

Considering __weakref__ is something we don't want to make impossible in dataclasses, @eric.smith what would be your preferred solution?
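
For context on why `__weakref__` needs care here, a minimal sketch of how the slot behaves on plain slotted classes in CPython. The class and helper names are illustrative only, and nothing below reflects what `dataclasses` or attrs actually do internally:

```python
import weakref


class Plain:
    __slots__ = ("a",)


class Weakable:
    __slots__ = ("a", "__weakref__")


def can_weakref(obj):
    """Report whether a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


# Without a '__weakref__' slot, instances cannot be weakly referenced.
plain_ok = can_weakref(Plain())
weakable_ok = can_weakref(Weakable())


def redeclare_weakref():
    """Try to add '__weakref__' again in a subclass of a class that has it."""
    try:
        class Sub(Weakable):
            __slots__ = ("__weakref__",)
    except TypeError:
        return False
    return True


redeclare_ok = redeclare_weakref()
print(plain_ok, weakable_ok, redeclare_ok)
```

So a `__weakref__` slot is one more case where the slots already present in base classes have to be taken into account.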
History
Date                 User           Action  Args
2022-01-20 08:05:06  ariebovenberg  set     messages: + msg411010
2022-01-18 06:11:28  hynek          set     messages: + msg410841
2022-01-17 22:04:25  AlexWaygood    set     nosy: + hynek
2022-01-17 20:13:11  eric.smith     set     messages: + msg410820
2022-01-15 11:58:23  ariebovenberg  set     messages: + msg410642
2022-01-15 11:13:08  AlexWaygood    set     nosy: + AlexWaygood
2022-01-15 11:05:34  Spencer Brown  set     nosy: + Spencer Brown; messages: + msg410641
2022-01-15 08:23:19  sobolevn       set     nosy: + sobolevn; messages: + msg410630
2022-01-15 07:33:17  ariebovenberg  set     messages: + msg410628
2022-01-14 21:32:28  eric.smith     set     assignee: eric.smith; messages: + msg410601; nosy: + eric.smith
2022-01-14 19:31:43  ariebovenberg  create