classification
Title: Bad dataclass post-init example
Type: behavior
Stage:
Components: Documentation
Versions: Python 3.11, Python 3.10, Python 3.9, Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: eric.smith
Nosy List: MicaelJarniac, docs@python, eric.smith
Priority: normal
Keywords:

Created on 2021-06-09 15:33 by MicaelJarniac, last changed 2021-06-11 02:17 by MicaelJarniac.

Messages (6)
msg395427 - Author: Micael Jarniac (MicaelJarniac) Date: 2021-06-09 15:33
https://docs.python.org/3/library/dataclasses.html#post-init-processing

https://github.com/python/cpython/blob/3.9/Doc/library/dataclasses.rst#post-init-processing

In the example, a base class "Rectangle" is defined, and then a "Square" class inherits from it.

On reading the example, it seems like the Square class is meant to be used like:

>>> square = Square(5)

The Square class appears to be intended as a "shortcut" for creating a Rectangle with equal sides.

However, the Rectangle class has two required init arguments, and when Square inherits from it, those arguments are still required. Calling Square with a single argument, as above, therefore raises an error:

>>> square = Square(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 2 required positional arguments: 'width' and 'side'

To "properly" use the Square class, it'd need to be instantiated like so:

>>> square = Square(0, 0, 5)
>>> square
Square(height=5, width=5, side=5)

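The first two arguments are only placeholders that `__post_init__` overwrites with the value of `side`. In my opinion, this is completely counter-intuitive and essentially invalidates the example.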
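For reference, here is a sketch of the documented example as I understand it, reconstructed from the behavior described above (the exact text in the docs may differ slightly, so treat the class bodies as an approximation). It reproduces both the failing single-argument call and the three-argument workaround:

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    height: float
    width: float


@dataclass
class Square(Rectangle):
    side: float

    def __post_init__(self) -> None:
        # Re-runs Rectangle's generated __init__ to overwrite height and width.
        super().__init__(self.side, self.side)


# The generated signature is __init__(self, height, width, side),
# so a single positional argument only fills in height.
single_arg_error = None
try:
    Square(5)
except TypeError as exc:
    single_arg_error = exc
    print(exc)

# The placeholder values for height and width are discarded by __post_init__.
square = Square(0, 0, 5)
print(square)  # Square(height=5, width=5, side=5)
```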
msg395449 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-06-09 17:42
Agreed that that's not a good (or even workable) example. Thanks for pointing it out.

I'll come up with something better.
msg395537 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-06-10 13:31
The example was added in https://github.com/python/cpython/pull/25967

When reviewing it, I think I missed the fact that the base class is a dataclass. The example and text make more sense if Rectangle isn't a dataclass. Still, I don't like the example at all. I think deleting it might be the best thing to do. Or maybe come up with a case where the base class is some existing class in the stdlib that isn't a dataclass.
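As a rough sketch of the "Rectangle isn't a dataclass" variant (the class bodies here are assumed for illustration, not taken from the docs): because a plain base class contributes no fields, the generated `__init__` of Square only takes `side`, and calling the base `__init__` from `__post_init__` behaves as the example intended.

```python
from dataclasses import dataclass


class Rectangle:
    """A plain (non-dataclass) base class; assumed here for illustration."""

    def __init__(self, height: float, width: float) -> None:
        self.height = height
        self.width = width


@dataclass
class Square(Rectangle):
    side: float

    def __post_init__(self) -> None:
        # The generated __init__ does not call Rectangle.__init__,
        # so it has to be called explicitly.
        super().__init__(self.side, self.side)


square = Square(5)
print(square)                       # Square(side=5)
print(square.height, square.width)  # 5 5
```

Note that `height` and `width` are ordinary attributes in this variant, not dataclass fields, so they do not appear in the repr or take part in comparisons.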
msg395541 - Author: Micael Jarniac (MicaelJarniac) Date: 2021-06-10 15:11
I'm trying to think of an example, and what I've thought of so far is having a base dataclass that has a `__post_init__` method, and another dataclass that inherits from it and also has a `__post_init__` method.

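In that case, the subclass might need to call `super().__post_init__()` inside its own `__post_init__` method. Otherwise the base class's `__post_init__` would not run, because the generated `__init__` only calls `self.__post_init__()`, which resolves to the subclass's method.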

Something along those lines:

>>> from dataclasses import dataclass, field
>>>
>>> @dataclass
... class A:
...     x: int
...     y: int
...     xy: int = field(init=False)
...
...     def __post_init__(self) -> None:
...         self.xy = self.x * self.y
...
>>> @dataclass
... class B(A):
...     m: int
...     n: int
...     mn: int = field(init=False)
...
...     def __post_init__(self) -> None:
...         super().__post_init__()
...         self.mn = self.m * self.n
...
>>> b = B(x=2, y=4, m=3, n=6)
>>> b
B(x=2, y=4, xy=8, m=3, n=6, mn=18)

In this example, if not for the `super().__post_init__()` call inside B's `__post_init__`, `xy` would never be set, and accessing it (for example when printing `b`) would raise `AttributeError: 'B' object has no attribute 'xy'`.

I believe this could be an actual pattern that could be used when dealing with dataclasses.
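To illustrate the failure mode, here is a minimal sketch of the same two classes with the `super().__post_init__()` call left out. Construction itself succeeds; the error only shows up once `xy` is read, for instance by the generated `__repr__`:

```python
from dataclasses import dataclass, field


@dataclass
class A:
    x: int
    y: int
    xy: int = field(init=False)

    def __post_init__(self) -> None:
        self.xy = self.x * self.y


@dataclass
class B(A):
    m: int
    n: int
    mn: int = field(init=False)

    def __post_init__(self) -> None:
        # super().__post_init__() is deliberately omitted, so xy is never set.
        self.mn = self.m * self.n


b = B(x=2, y=4, m=3, n=6)  # no error yet
print(b.mn)                # 18

repr_error = None
try:
    repr(b)  # the generated __repr__ reads self.xy
except AttributeError as exc:
    repr_error = exc
    print(exc)
```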
msg395573 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-06-10 20:54
I'm not sure that calling __post_init__ directly is a good pattern. Why wouldn't calling __init__, as you would with any other class, be the preferred thing to do?
msg395596 - Author: Micael Jarniac (MicaelJarniac) Date: 2021-06-11 02:17
Well, at least for this example, calling `super().__init__()` would require passing the two arguments it expects, `x` and `y`; otherwise it raises an error:

> TypeError: __init__() missing 2 required positional arguments: 'x' and 'y'

If I try calling it as `super().__init__(self.x, self.y)`, I get an infinite recursion error:

> RecursionError: maximum recursion depth exceeded while calling a Python object

That's mostly why I've chosen to call `__post_init__` instead.
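Both failures can be reproduced with a sketch like the following. The subclass names `NoArgs` and `WithArgs` and the helper `try_build` are made up for this illustration. The recursion happens because A's generated `__init__` ends by calling `self.__post_init__()`, which resolves to the subclass's method again:

```python
from dataclasses import dataclass, field


@dataclass
class A:
    x: int
    y: int
    xy: int = field(init=False)

    def __post_init__(self) -> None:
        self.xy = self.x * self.y


@dataclass
class NoArgs(A):  # hypothetical name
    m: int
    n: int
    mn: int = field(init=False)

    def __post_init__(self) -> None:
        super().__init__()  # A.__init__ still requires x and y
        self.mn = self.m * self.n


@dataclass
class WithArgs(A):  # hypothetical name
    m: int
    n: int
    mn: int = field(init=False)

    def __post_init__(self) -> None:
        # A.__init__ calls self.__post_init__(), i.e. this method, again.
        super().__init__(self.x, self.y)
        self.mn = self.m * self.n


def try_build(cls):
    """Return the new instance, or the exception raised while building it."""
    try:
        return cls(x=2, y=4, m=3, n=6)
    except (TypeError, RecursionError) as exc:
        return exc


no_args_result = try_build(NoArgs)
with_args_result = try_build(WithArgs)
print(type(no_args_result).__name__)    # TypeError
print(type(with_args_result).__name__)  # RecursionError
```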

And if we're dealing with `InitVar`s, they can nicely be chained like so:

>>> from dataclasses import dataclass, field, InitVar
>>>
>>> @dataclass
... class A:
...     x: int
...     y: InitVar[int]
...     xy: int = field(init=False)
...
...     def __post_init__(self, y: int) -> None:
...         self.xy = self.x * y
...
>>> @dataclass
... class B(A):
...     m: int
...     n: InitVar[int]
...     mn: int = field(init=False)
...
...     def __post_init__(self, y: int, n: int) -> None:
...         super().__post_init__(y)
...         self.mn = self.m * n
...
>>> b = B(x=2, y=4, m=3, n=6)
>>> b
B(x=2, xy=8, m=3, mn=18)
History
Date                 User           Action  Args
2021-06-11 02:17:40  MicaelJarniac  set     messages: + msg395596
2021-06-10 20:54:18  eric.smith     set     messages: + msg395573
2021-06-10 15:11:05  MicaelJarniac  set     messages: + msg395541
2021-06-10 13:31:36  eric.smith     set     messages: + msg395537
2021-06-09 17:42:49  eric.smith     set     assignee: docs@python -> eric.smith
                                            messages: + msg395449
                                            versions: + Python 3.9, Python 3.10, Python 3.11
2021-06-09 15:44:02  xtreak         set     nosy: + eric.smith
2021-06-09 15:33:08  MicaelJarniac  create