classification
Title: Turning off generation of __match_args__ for dataclasses
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: brandtbucher, eric.smith, freundTech, gvanrossum
Priority: normal Keywords: patch

Created on 2021-04-07 13:07 by freundTech, last changed 2021-04-11 01:29 by eric.smith. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25284 merged brandtbucher, 2021-04-08 17:13
PR 25337 merged eric.smith, 2021-04-10 22:29
Messages (22)
msg390429 - (view) Author: Adrian Freund (freundTech) * Date: 2021-04-07 13:07
The dataclass decorator can take multiple parameters to enable or disable the generation of certain methods.

PEP 634 Structural Pattern Matching extends dataclasses to also generate a __match_args__ attribute.

I think adding a parameter to enable and disable generation of __match_args__ should be to dataclass should also be considered for consistency.

Note that setting compare=False on a dataclass.field already excludes that field from __match_args__, but there is no way to disable generation of __match_args__ completely.
msg390434 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-07 13:54
What's the situation where having __match_args__ is actually harmful in some way? I understand that if the generated version is wrong, you'd want to specify it yourself. But what's the use case for not having __match_args__ at all?
msg390453 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-04-07 17:41
I agree with Eric. You can already disable the automatic creation of __match_args__ by setting it yourself on the class being decorated, and "__match_args__ = ()" will make the class behave the exact same as if __match_args__ is not defined at all.

>>> from dataclasses import dataclass
>>> @dataclass 
... class X:
...     __match_args__ = ()
...     a: int
...     b: int
...     c: int
... 
>>> X.__match_args__
()

I too have trouble imagining a case where the actual *presence* of the attribute matters. But assuming those cases exist, wouldn't a simple "del X.__match_args__" suffice?
msg390454 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-04-07 17:48
> Note that setting compare=False on a dataclass.field already excludes that field from __match_args__...

It appears you did find a genuine bug though! I was surprised by this comment, and after digging a bit deeper into _process_class found that we should be generating these from "field_list", not "flds":

>>> @dataclass(repr=False, eq=False, init=False)
... class X:
...     a: int
...     b: int
...     c: int
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1042, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
  File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1020, in _process_class
    cls.__match_args__ = tuple(f.name for f in flds if f.init)
UnboundLocalError: local variable 'flds' referenced before assignment
msg390465 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-07 18:35
I assume the OP wants to have a class that doesn't allow positional patterns. The right way to spell that is indeed to add

    __match_args__ = ()

to the class, there's no need to add another flag to @dataclass.
msg390546 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-04-08 19:54
New changeset d92c59f48680122ce0e4d1ccf69d92b983e8db01 by Brandt Bucher in branch 'master':
bpo-43764: Fix `__match_args__` generation logic for dataclasses (GH-25284)
https://github.com/python/cpython/commit/d92c59f48680122ce0e4d1ccf69d92b983e8db01
msg390674 - (view) Author: Adrian Freund (freundTech) * Date: 2021-04-09 22:55
> I assume the OP wants to have a class that doesn't allow positional patterns. The right way to spell that is indeed to add
>
>    __match_args__ = ()
>
>to the class, there's no need to add another flag to @dataclass.

The same however is also true for all the other stuff generated by @dataclass. You can for example disable generation of the init method using

    def __init__(self): pass

and dataclass still has a parameter to disable it.

I agree that a new parameter isn't strictly required to achieve functionality, however I would still argue that it should be added for consistency with the rest of the dataclass api.
msg390678 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-09 23:28
init=False is used to make sure there's no __init__ defined, because there's a difference between a class with an __init__ and one without. If there was a difference between __match_args__ being not present and __match_args__=(), then I'd support a matchargs=False argument.
msg390680 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-04-09 23:55
> init=False is used to make sure there's no __init__ defined, because there's a difference between a class with an __init__ and one without. If there was a difference between __match_args__ being not present and __match_args__=(), then I'd support a matchargs=False argument.

Ah, I see now how this might possibly be useful.

If you want to inherit a parent's __match_args__ in a dataclass, it currently must be as spelled something like:

@dataclass
class Child(Parent):
    __match_args__ = Parent.__match_args__
    ...

It's even uglier when you're unsure if Parent defines __match_args__ at all, or if multiple-inheritance is involved:

@dataclass
class Child(Parent, Mixin):
    __match_args__ = ()
    ...

del Child.__match_args__

I'm not sure how likely it is that code out in the wild may need to look like this. As I understand it, though, the fact that dataclasses allow for "normal" inheritance is one of their big selling-points. So it could be a valid reason to include this option.

If it seems like the above code might become reasonably common, I agree that the proposed solution is much cleaner:

@dataclass(match_args=False)
class Child(Parent):
    ...
msg390681 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2021-04-10 00:05
I just realized that in my first two examples, all that's really needed is a "del Child.__match_args__" following each class definition.

The main point still stands though: this may be a more common issue than I previously thought. If it is, then a dedicated kwarg to the dataclass decorator probably makes more sense than requiring users to "del" the generated attribute.
msg390682 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 00:20
Hmm, good point on the inheritance case. I'd forgotten that this is where init=False and others would come in handy. Although I'm having a hard time figuring out why you'd want a derived class that adds fields but wants to use the parent's __match_args__ (or __init__, for that matter), but I guess it's possible.

I don't like del Child.__match_args__. The same pattern could be used for init=False, after all, but there's a parameter for that.
msg390726 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-10 17:34
So, should we reopen this ad add the flag to suppress __match_args__ generation after all?
msg390734 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 19:17
> So, should we reopen this ad add the flag to suppress __match_args__ generation after all?

I think so, although I'm still contemplating it.

I can't decide if we should also add match_arg=False to field(). Or is it good enough to just tell the user to manually set __match_args__ if they want that level of control? I think this is different enough from repr, init, etc. that we don't need to allow the per-field control, although maybe doing so would make the code more robust in terms of re-ordering or adding fields at a later date.
msg390735 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-10 19:22
I don't think we need control at the field level here. Omitting one field
is just going to cause confusion.
msg390736 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 19:33
Okay, I'll re-open this to just add @dataclass(match_args=False).
msg390744 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 21:32
Here's a question. If __init__ is not being generated, either because the user supplied one to this class, or if init=False is specified, should __match_args__ be generated? I think the answer should be no, since the code has no idea what the parameters to __init__ will be. But I'd like to hear from people more familiar with pattern matching.

I'm working on a patch, and this is my last issue.
msg390745 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-10 21:36
Are there other cases where suppressing one thing affects others?
msg390746 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 21:42
> Are there other cases where suppressing one thing affects others?

Only the complex interactions among the unsafe_hash, eq, and frozen parameters.

It feels like if __init__ is not being generated, then the @dataclass code would have no idea what it should set __match_args__ to.

Not that this problem isn't strictly related to the match_args=False case, it just occurred to me that it's related while I was writing the documentation. Perhaps this should be a separate issue.
msg390747 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-10 22:08
There may be other reasons why `__init__` is not desired, and there's no
absolute requirement that `__match_args__` must match the constructor --
only a convention. I'd say don't tie them together.
msg390748 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-10 22:15
I can go either way. It's easy enough for the user to add their own __match_args__, so I won't link them.

Here's what I have for the documentation, which is why the issue came up:

   - ``match_args``: If true (the default is ``True``), the
     ``__match_args__`` tuple will be created from the list of
     parameters to the generated :meth:`__init__` method (even if
     :meth:`__init__` is not generated, see above).  If false, or if
     ``__match_args__`` is already defined in the class, then
     ``__match_args__`` will not be generated.
msg390749 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-04-10 22:17
LGTM
msg390760 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-11 01:28
New changeset 750f484752763fe9ac1d6455780aabcb67f25508 by Eric V. Smith in branch 'master':
bpo-43764: Add match_args=False parameter to dataclass decorator and to make_dataclasses function. (GH-25337)
https://github.com/python/cpython/commit/750f484752763fe9ac1d6455780aabcb67f25508
History
Date User Action Args
2021-04-11 01:29:17eric.smithsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-04-11 01:28:45eric.smithsetmessages: + msg390760
2021-04-10 22:29:30eric.smithsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request24072
2021-04-10 22:17:39gvanrossumsetmessages: + msg390749
2021-04-10 22:15:05eric.smithsetmessages: + msg390748
2021-04-10 22:08:39gvanrossumsetmessages: + msg390747
2021-04-10 21:42:51eric.smithsetmessages: + msg390746
2021-04-10 21:36:27gvanrossumsetmessages: + msg390745
2021-04-10 21:32:01eric.smithsetmessages: + msg390744
2021-04-10 19:33:44eric.smithsetstatus: closed -> open
resolution: rejected -> (no value)
messages: + msg390736

stage: resolved -> needs patch
2021-04-10 19:22:58gvanrossumsetmessages: + msg390735
2021-04-10 19:17:45eric.smithsetmessages: + msg390734
2021-04-10 17:34:26gvanrossumsetmessages: + msg390726
2021-04-10 00:20:55eric.smithsetmessages: + msg390682
2021-04-10 00:05:48brandtbuchersetmessages: + msg390681
2021-04-09 23:55:19brandtbuchersetmessages: + msg390680
2021-04-09 23:28:13eric.smithsetmessages: + msg390678
2021-04-09 22:55:52freundTechsetmessages: + msg390674
2021-04-08 19:54:45brandtbuchersetmessages: + msg390546
2021-04-08 17:13:26brandtbuchersetpull_requests: + pull_request24020
2021-04-07 18:35:59gvanrossumsetstatus: open -> closed
resolution: rejected
messages: + msg390465

stage: resolved
2021-04-07 17:48:56brandtbuchersetmessages: + msg390454
2021-04-07 17:41:48brandtbuchersetmessages: + msg390453
2021-04-07 13:54:17eric.smithsetassignee: eric.smith
type: behavior
messages: + msg390434
2021-04-07 13:07:21freundTechcreate