Issue43835
Created on 2021-04-13 22:21 by Paul Pinterits, last changed 2022-04-11 14:59 by admin.
Messages (9) | |||
---|---|---|---|
msg391006 - (view) | Author: Paul Pinterits (Paul Pinterits) | Date: 2021-04-13 22:21 | |
It's documented behavior that @dataclass won't generate an __init__ method if the class already defines one. It's also documented that a dataclass may inherit from another dataclass.

But what happens if you inherit from a dataclass that implements a custom __init__? Well, that custom __init__ is never called:

```
import dataclasses

@dataclasses.dataclass
class Foo:
    foo: int

    def __init__(self, *args, **kwargs):
        print('Foo.__init__')  # This is never printed

@dataclasses.dataclass
class Bar(Foo):
    bar: int

obj = Bar(1, 2)
print(vars(obj))  # {'foo': 1, 'bar': 2}
```

So if a dataclass uses a custom __init__, all its child classes must also use a custom __init__. This is 1) incredibly inconvenient, and 2) bad OOP. A child class should (almost) always chain-call its base class's __init__.
|||
msg391014 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-04-13 23:29 | |
dataclasses doesn't know the signature of the base class's __init__, so it can't know how to call it. I realize you've given an example that would accept any parameters, but that isn't typical. What if the base class was:

    @dataclasses.dataclass
    class Foo:
        foo: int
        def __init__(self, baz):
            ...
            pass

What would the generated Bar.__init__() look like if it were calling the base class __init__()? What would get passed to baz?

> So if a dataclass uses a custom __init__, all its child classes must also use a custom __init__

This isn't true. The typical way to handle this is for the derived class to add a __post_init__() that calls into the base class's __init__(). This way, you can use the normally generated __init__() in the derived class, yet still call the base class's __init__() (which presumably you have some knowledge of). If that doesn't work for some reason (for example, you strictly require that the base class is initialized before the derived class, for some reason), then yes, you'd need to write a custom __init__() in the derived class. dataclasses isn't designed to handle every case: just the most common ones.

In your case, you could do:

    @dataclasses.dataclass
    class Bar(Foo):
        bar: int
        def __post_init__(self):
            Foo.__init__(self, baz=self.bar)  # or whatever
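Editorial illustration: the __post_init__() approach described in this message can be sketched as a complete, runnable program. The body of Foo.__init__ (doubling baz) is an assumption invented for the demo; the thread never says what the base initializer actually does.

```python
import dataclasses


@dataclasses.dataclass
class Foo:
    foo: int

    # @dataclass keeps this method because Foo defines it itself.
    # The doubling is a made-up body, purely for illustration.
    def __init__(self, baz):
        self.foo = baz * 2


@dataclasses.dataclass
class Bar(Foo):
    bar: int

    def __post_init__(self):
        # Runs at the end of the generated Bar.__init__(self, foo, bar),
        # so Foo.__init__ overwrites whatever was passed as `foo`.
        Foo.__init__(self, baz=self.bar)


obj = Bar(foo=0, bar=21)
print(obj)
```

In this sketch the base initializer runs only after the derived class's fields have been assigned, which is the ordering caveat mentioned in the message.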
|||
msg391022 - (view) | Author: Paul Pinterits (Paul Pinterits) | Date: 2021-04-14 00:17 | |
> dataclasses doesn't know the signature of the base class's __init__, so it can't know how to call it.

The dataclass doesn't need to know what arguments the parent __init__ accepts. It should consume the arguments it needs to initialize its instance attributes, and forward the rest to the parent __init__.

> The typical way to handle this is for the derived class to add a __post_init__() that calls into the base class's __init__().

How is that supposed to work? Like you said, the class doesn't know what arguments its parent constructor requires. If the derived class doesn't implement a custom __init__, @dataclass will generate one with a (probably) incorrect signature. Changing the signature is pretty much the only reason why you would need to implement a custom __init__, after all. For example:

```
@dataclasses.dataclass
class Foo:
    foo: int

    def __init__(self):
        self.foo = 5

@dataclasses.dataclass
class Bar(Foo):
    bar: int

    def __post_init__(self):
        super().__init__()

bar = Bar(1)  # TypeError: __init__() missing 1 required positional argument: 'bar'
```

And even if this workaround actually worked, it would only be applicable if you *know* you're dealing with a dataclass and you'll have this problem. If you're writing something like a class decorator (similar to @dataclass), you don't know if the decorated class will be a dataclass or not. If it's a regular class, you can rely on the __init__s being chain-called. If it's a dataclass, you can't. Therefore, a class decorator that generates/replaces the __init__ method would need to take special care to be compatible with dataclasses, just because dataclasses don't follow basic OOP design principles.

Consider this trivial class decorator that generates an __init__ method:

```
def log_init(cls):
    try:
        original_init = vars(cls)['__init__']
    except KeyError:
        def original_init(self, *args, **kwargs):
            super(cls, self).__init__(*args, **kwargs)

    def __init__(self, *args, **kwargs):
        print(f'{cls.__name__}.__init__ was called')
        original_init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls

@log_init
@dataclasses.dataclass
class Foo:
    foo: int

@dataclasses.dataclass
class Bar(Foo):
    bar: int

Foo(1)     # Prints "Foo.__init__ was called"
Bar(1, 2)  # Prints nothing
```

How do you implement this in a way that is compatible with @dataclass?
|||
msg391023 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-04-14 00:26 | |
> The dataclass doesn't need to know what arguments the parent __init__ accepts. It should consume the arguments it needs to initialize its instance attributes, and forward the rest to the parent __init__.

The generated __init__() uses every parameter to initialize instance attributes (if we ignore InitVar). So are you saying it should call the base class with no parameters?
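Editorial illustration: the InitVar exception mentioned in passing can be sketched as follows. The class name Scaled and its fields are hypothetical and not taken from the thread. An InitVar parameter is accepted by the generated __init__() and handed to __post_init__(), but it is not stored as an instance attribute.

```python
import dataclasses


@dataclasses.dataclass
class Scaled:  # hypothetical class, for illustration only
    value: int
    factor: dataclasses.InitVar[int]  # init-only parameter, not a field

    def __post_init__(self, factor):
        # `factor` arrives here as an argument; only `value` is stored.
        self.value *= factor


s = Scaled(3, 4)
print(vars(s))
```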
|||
msg391040 - (view) | Author: Paul Pinterits (Paul Pinterits) | Date: 2021-04-14 07:42 | |
No, I'm saying Bar should initialize the 'bar' attribute, and then call Foo.__init__ to let it initialize the 'foo' attribute. |
|||
msg391046 - (view) | Author: Paul Pinterits (Paul Pinterits) | Date: 2021-04-14 08:36 | |
Admittedly, with the way dataclasses accept their __init__ arguments, figuring out which arguments to consume and which to pass on isn't a trivial task.

If a dataclass Bar inherits from a dataclass Foo, then Bar.__init__ is (for all intents and purposes) defined as

    def __init__(self, foo, bar):

Because the arguments for the parents *precede* the arguments for Bar, it's not easy to create an equivalent __init__ without knowing anything about the base class(es)'s constructor arguments. But that doesn't mean it's impossible:

```
class Foo:
    foo: int

    def __init__(self):
        self.foo = 5

class Bar(Foo):
    bar: int

    def __init__(self, *args, **kwargs):
        if 'bar' in kwargs:
            self.bar = kwargs.pop('bar')
        else:
            *args, self.bar = args

        super().__init__(*args, **kwargs)

print([Bar(1), Bar(bar=1)])
```

Essentially, Bar.__init__ looks for a keyword argument named 'bar', and if that doesn't exist, it uses the last positional argument as the value for 'bar'. This is backwards compatible with "normal" dataclasses, and improves support for dataclasses with custom __init__s.
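Editorial illustration: the claim about the generated signature can be checked with inspect.signature. This is a small sketch using two plain dataclasses without custom initializers. It also shows that Bar carries its own __init__ in its class dict, so attribute lookup never falls through to the one on Foo.

```python
import dataclasses
import inspect


@dataclasses.dataclass
class Foo:
    foo: int


@dataclasses.dataclass
class Bar(Foo):
    bar: int


# Parameter names of the generated initializer, base-class fields first.
params = list(inspect.signature(Bar.__init__).parameters)
print(params)

# Bar has its own generated __init__, distinct from the one on Foo.
has_own_init = '__init__' in vars(Bar)
shares_init = Bar.__init__ is Foo.__init__
print(has_own_init, shares_init)
```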
|||
msg391047 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-04-14 08:55 | |
But Bar(1, 2), Bar(1, foo=2), Bar(bar=1, foo=2) all give errors. These are all valid if both Foo and Bar are decorated with @dataclass. Calling base class __init__() functions is an incompatible change, and I don't think we'll make any change to do so. |
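Editorial illustration: the compatibility concern can be checked by running the three calls against both the proposed hand-written initializer and a pair of plain dataclasses. PlainFoo, PlainBar, and the accepts helper are hypothetical names introduced for this sketch.

```python
import dataclasses


class Foo:
    foo: int

    def __init__(self):
        self.foo = 5


class Bar(Foo):
    bar: int

    # The initializer proposed earlier in the thread.
    def __init__(self, *args, **kwargs):
        if 'bar' in kwargs:
            self.bar = kwargs.pop('bar')
        else:
            *args, self.bar = args
        super().__init__(*args, **kwargs)


@dataclasses.dataclass
class PlainFoo:  # hypothetical: the same fields, no custom __init__
    foo: int


@dataclasses.dataclass
class PlainBar(PlainFoo):
    bar: int


def accepts(cls, *args, **kwargs):
    """Return True if cls(*args, **kwargs) constructs without TypeError."""
    try:
        cls(*args, **kwargs)
    except TypeError:
        return False
    return True


# Bar(1, 2), Bar(1, foo=2), Bar(bar=1, foo=2)
calls = [((1, 2), {}), ((1,), {'foo': 2}), ((), {'bar': 1, 'foo': 2})]
proposed = [accepts(Bar, *a, **k) for a, k in calls]
plain = [accepts(PlainBar, *a, **k) for a, k in calls]
print(proposed)
print(plain)
```

In this sketch the proposed initializer rejects all three calls, because Foo.__init__ takes no arguments. The plain dataclass pair accepts the first and third call; the second, Bar(1, foo=2), is rejected there as well, since foo would be bound both positionally and by keyword.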
|||
msg391048 - (view) | Author: Paul Pinterits (Paul Pinterits) | Date: 2021-04-14 09:09 | |
You're telling me that some people out there rely on their custom __init__ *not* being called? O.o |
|||
msg391049 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-04-14 09:11 | |
Yes, I'm sure it happens.

    from dataclasses import dataclass

    @dataclass
    class Foo:
        foo: int
        def __init__(self, a, b, c):
            self.foo = a * b * c

    @dataclass
    class Bar(Foo):
        bar: int

    print(Bar(1, 2))
    print(Foo(1, 2, 3))
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:44 | admin | set | github: 88001 |
2021-04-16 20:35:14 | terry.reedy | set | versions: - Python 3.7 |
2021-04-14 23:43:40 | eric.smith | set | assignee: eric.smith |
2021-04-14 09:11:54 | eric.smith | set | messages: + msg391049 |
2021-04-14 09:09:53 | Paul Pinterits | set | messages: + msg391048 |
2021-04-14 08:55:47 | eric.smith | set | messages: + msg391047 |
2021-04-14 08:36:52 | Paul Pinterits | set | messages: + msg391046 |
2021-04-14 07:42:03 | Paul Pinterits | set | messages: + msg391040 |
2021-04-14 00:26:11 | eric.smith | set | messages: + msg391023 |
2021-04-14 00:17:12 | Paul Pinterits | set | messages: + msg391022 |
2021-04-13 23:29:32 | eric.smith | set | nosy: + eric.smith; messages: + msg391014 |
2021-04-13 22:21:40 | Paul Pinterits | create |