This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Dataclasses don't call base class __init__
Type: Stage:
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: Paul Pinterits, eric.smith
Priority: normal Keywords:

Created on 2021-04-13 22:21 by Paul Pinterits, last changed 2022-04-11 14:59 by admin.

Messages (9)
msg391006 - Author: Paul Pinterits (Paul Pinterits) Date: 2021-04-13 22:21
It's documented behavior that @dataclass won't generate an __init__ method if the class already defines one. It's also documented that a dataclass may inherit from another dataclass.

But what happens if you inherit from a dataclass that implements a custom __init__? Well, that custom __init__ is never called:

```
import dataclasses

@dataclasses.dataclass
class Foo:
    foo: int
    
    def __init__(self, *args, **kwargs):
        print('Foo.__init__')  # This is never printed

@dataclasses.dataclass
class Bar(Foo):
    bar: int

obj = Bar(1, 2)
print(vars(obj))  # {'foo': 1, 'bar': 2}
```

So if a dataclass uses a custom __init__, all its child classes must also use a custom __init__. This is 1) incredibly inconvenient, and 2) bad OOP. A child class should (almost) always chain-call its base class's __init__.
msg391014 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-13 23:29
dataclasses doesn't know the signature of the base class's __init__, so it can't know how to call it. I realize you've given an example that would accept any parameters, but that isn't typical.

What if the base class was:

@dataclasses.dataclass
class Foo:
    foo: int
    
    def __init__(self, baz):
        ...

What would the generated Bar.__init__() look like if it were calling the base class __init__()? What would get passed to baz?

> So if a dataclass uses a custom __init__, all its child classes must also use a custom __init__

This isn't true. The typical way to handle this is for the derived class to add a __post_init__() that calls into the base class's __init__(). This way, you can use the normally generated __init__() in the derived class, yet still call the base class's __init__() (which presumably you have some knowledge of). If that doesn't work (for example, because you strictly require that the base class is initialized before the derived class), then yes, you'd need to write a custom __init__() in the derived class. dataclasses isn't designed to handle every case: just the most common ones.

In your case, you could do:

@dataclasses.dataclass
class Bar(Foo):
    bar: int

    def __post_init__(self):
        Foo.__init__(self, baz=self.bar) # or whatever
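
A runnable sketch of the __post_init__() approach described above. The body of Foo.__init__ (doubling baz) is an assumption made purely for illustration; the thread leaves it elided. Note that the generated Bar.__init__ assigns foo first, and the base class __init__ then overwrites it:

```python
import dataclasses


@dataclasses.dataclass
class Foo:
    foo: int

    # Hypothetical body: the thread does not say what Foo.__init__ does.
    def __init__(self, baz):
        self.foo = baz * 2


@dataclasses.dataclass
class Bar(Foo):
    bar: int

    def __post_init__(self):
        # Runs at the end of the generated Bar.__init__(self, foo, bar).
        Foo.__init__(self, baz=self.bar)


obj = Bar(1, 2)
print(obj)  # Bar(foo=4, bar=2)
```

The value passed for foo (1) is discarded here, which illustrates why the base class can only be initialized after the derived class's fields with this approach.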
msg391022 - Author: Paul Pinterits (Paul Pinterits) Date: 2021-04-14 00:17
> dataclasses doesn't know the signature of the base class's __init__, so it can't know how to call it.

The dataclass doesn't need to know what arguments the parent __init__ accepts. It should consume the arguments it needs to initialize its instance attributes, and forward the rest to the parent __init__.

> The typical way to handle this is for the derived class to add a __post_init__() that calls into the base class's __init__().

How is that supposed to work? Like you said, the class doesn't know what arguments its parent constructor requires. If the derived class doesn't implement a custom __init__, @dataclass will generate one with a (probably) incorrect signature. Changing the signature is pretty much the only reason why you would need to implement a custom __init__, after all. For example:

```
import dataclasses

@dataclasses.dataclass
class Foo:
    foo: int
    
    def __init__(self):
        self.foo = 5
    
@dataclasses.dataclass
class Bar(Foo):
    bar: int
    
    def __post_init__(self):
        super().__init__()

bar = Bar(1)  # TypeError: __init__() missing 1 required positional argument: 'bar'
```

And even if this workaround actually worked, it would only be applicable if you *know* you're dealing with a dataclass and you'll have this problem. If you're writing something like a class decorator (similar to @dataclass), you don't know if the decorated class will be a dataclass or not. If it's a regular class, you can rely on the __init__s being chain-called. If it's a dataclass, you can't. Therefore, a class decorator that generates/replaces the __init__ method would need to take special care to be compatible with dataclasses, just because dataclasses don't follow basic OOP design principles.

Consider this trivial class decorator that generates an __init__ method:

```
import dataclasses

def log_init(cls):
    try:
        original_init = vars(cls)['__init__']
    except KeyError:
        def original_init(self, *args, **kwargs):
            super(cls, self).__init__(*args, **kwargs)
    
    def __init__(self, *args, **kwargs):
        print(f'{cls.__name__}.__init__ was called')
        original_init(self, *args, **kwargs)
    
    cls.__init__ = __init__
    return cls

@log_init
@dataclasses.dataclass
class Foo:
    foo: int
    
@dataclasses.dataclass
class Bar(Foo):
    bar: int
    
Foo(1)  # Prints "Foo.__init__ was called"
Bar(1, 2)  # Prints nothing
```

How do you implement this in a way that is compatible with @dataclass?
msg391023 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-14 00:26
> The dataclass doesn't need to know what arguments the parent __init__ accepts. It should consume the arguments it needs to initialize its instance attributes, and forward the rest to the parent __init__.

The generated __init__() uses every parameter to initialize instance attributes (if we ignore InitVar). So are you saying it should call the base class with no parameters?
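
One way to check this is to inspect the generated method. A minimal sketch, assuming plain dataclasses Foo and Bar with no custom __init__ and no InitVar fields:

```python
import dataclasses
import inspect


@dataclasses.dataclass
class Foo:
    foo: int


@dataclasses.dataclass
class Bar(Foo):
    bar: int


# Parameters of the generated Bar.__init__, in order.
params = list(inspect.signature(Bar.__init__).parameters)
# Fields that Bar.__init__ assigns on the instance.
field_names = [f.name for f in dataclasses.fields(Bar)]

print(params)       # ['self', 'foo', 'bar']
print(field_names)  # ['foo', 'bar']
```

Every parameter other than self corresponds to a field, so nothing is left over to forward to a base class __init__.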
msg391040 - Author: Paul Pinterits (Paul Pinterits) Date: 2021-04-14 07:42
No, I'm saying Bar should initialize the 'bar' attribute, and then call Foo.__init__ to let it initialize the 'foo' attribute.
msg391046 - Author: Paul Pinterits (Paul Pinterits) Date: 2021-04-14 08:36
Admittedly, with the way dataclasses accept their __init__ arguments, figuring out which arguments to consume and which to pass on isn't a trivial task.

If a dataclass Bar inherits from a dataclass Foo, then Bar.__init__ is (for all intents and purposes) defined as

    def __init__(self, foo, bar):

Because the arguments for the parents *precede* the arguments for Bar, it's not easy to create an equivalent __init__ without knowing anything about the base class(es)'s constructor arguments. But that doesn't mean it's impossible:

```
class Foo:
    foo: int
    
    def __init__(self):
        self.foo = 5
    
class Bar(Foo):
    bar: int
    
    def __init__(self, *args, **kwargs):
        if 'bar' in kwargs:
            self.bar = kwargs.pop('bar')
        else:
            *args, self.bar = args
        
        super().__init__(*args, **kwargs)

print([Bar(1), Bar(bar=1)])
```

Essentially, Bar.__init__ looks for a keyword argument named 'bar', and if that doesn't exist, it uses the last positional argument as the value for 'bar'.

This is backwards compatible with "normal" dataclasses, and improves support for dataclasses with custom __init__s.
msg391047 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-14 08:55
But Bar(1, 2), Bar(1, bar=2), Bar(bar=1, foo=2) all give errors. These are all valid if both Foo and Bar are decorated with @dataclass.

Calling base class __init__() functions is an incompatible change, and I don't think we'll make any change to do so.
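
The incompatibility can be sketched by comparing two plain dataclasses with the hand-written forwarding __init__ from the previous message. The names PlainFoo, ForwardingBar and accepts are hypothetical, introduced only for this comparison:

```python
import dataclasses


@dataclasses.dataclass
class Foo:
    foo: int


@dataclasses.dataclass
class Bar(Foo):
    bar: int


class PlainFoo:
    def __init__(self):
        self.foo = 5


class ForwardingBar(PlainFoo):
    # Same logic as the forwarding Bar.__init__ proposed earlier in the thread.
    def __init__(self, *args, **kwargs):
        if 'bar' in kwargs:
            self.bar = kwargs.pop('bar')
        else:
            *args, self.bar = args
        super().__init__(*args, **kwargs)


def accepts(cls, *args, **kwargs):
    """Return True if cls(*args, **kwargs) succeeds, False on TypeError."""
    try:
        cls(*args, **kwargs)
    except TypeError:
        return False
    return True


for args, kwargs in [((1, 2), {}), ((), {'bar': 1, 'foo': 2})]:
    print(args, kwargs,
          'dataclass:', accepts(Bar, *args, **kwargs),
          'forwarding:', accepts(ForwardingBar, *args, **kwargs))
```

Both calls succeed with the generated dataclass __init__ and raise TypeError with the forwarding version, because the leftover arguments are passed to a base __init__ that takes none.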
msg391048 - Author: Paul Pinterits (Paul Pinterits) Date: 2021-04-14 09:09
You're telling me that some people out there rely on their custom __init__ *not* being called? O.o
msg391049 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-14 09:11
Yes, I'm sure it happens.

from dataclasses import dataclass

@dataclass
class Foo:
    foo: int
    
    def __init__(self, a, b, c):
        self.foo = a * b * c

@dataclass
class Bar(Foo):
    bar: int
    

print(Bar(1, 2))     # Bar(foo=1, bar=2); Foo.__init__ is not called
print(Foo(1, 2, 3))  # Foo(foo=6)
History
Date                 User            Action  Args
2022-04-11 14:59:44  admin           set     github: 88001
2021-04-16 20:35:14  terry.reedy     set     versions: - Python 3.7
2021-04-14 23:43:40  eric.smith      set     assignee: eric.smith
2021-04-14 09:11:54  eric.smith      set     messages: + msg391049
2021-04-14 09:09:53  Paul Pinterits  set     messages: + msg391048
2021-04-14 08:55:47  eric.smith      set     messages: + msg391047
2021-04-14 08:36:52  Paul Pinterits  set     messages: + msg391046
2021-04-14 07:42:03  Paul Pinterits  set     messages: + msg391040
2021-04-14 00:26:11  eric.smith      set     messages: + msg391023
2021-04-14 00:17:12  Paul Pinterits  set     messages: + msg391022
2021-04-13 23:29:32  eric.smith      set     nosy: + eric.smith; messages: + msg391014
2021-04-13 22:21:40  Paul Pinterits  create