Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turning off generation of __match_args__ for dataclasses #87930

Closed
freundTech mannequin opened this issue Apr 7, 2021 · 22 comments
Closed

Turning off generation of __match_args__ for dataclasses #87930

freundTech mannequin opened this issue Apr 7, 2021 · 22 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@freundTech
Copy link
Mannequin

freundTech mannequin commented Apr 7, 2021

BPO 43764
Nosy @gvanrossum, @ericvsmith, @brandtbucher, @freundTech
PRs
  • bpo-43764: Fix __match_args__ generation logic for dataclasses #25284
  • bpo-43764: Add match_args=False parameter to dataclass decorator and to make_dataclasses function. #25337
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericvsmith'
    closed_at = <Date 2021-04-11.01:29:17.974>
    created_at = <Date 2021-04-07.13:07:21.686>
    labels = ['type-bug', 'library', '3.10']
    title = 'Turning off generation of __match_args__ for dataclasses'
    updated_at = <Date 2021-04-11.01:29:17.974>
    user = 'https://github.com/freundTech'

    bugs.python.org fields:

    activity = <Date 2021-04-11.01:29:17.974>
    actor = 'eric.smith'
    assignee = 'eric.smith'
    closed = True
    closed_date = <Date 2021-04-11.01:29:17.974>
    closer = 'eric.smith'
    components = ['Library (Lib)']
    creation = <Date 2021-04-07.13:07:21.686>
    creator = 'freundTech'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43764
    keywords = ['patch']
    message_count = 22.0
    messages = ['390429', '390434', '390453', '390454', '390465', '390546', '390674', '390678', '390680', '390681', '390682', '390726', '390734', '390735', '390736', '390744', '390745', '390746', '390747', '390748', '390749', '390760']
    nosy_count = 4.0
    nosy_names = ['gvanrossum', 'eric.smith', 'brandtbucher', 'freundTech']
    pr_nums = ['25284', '25337']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue43764'
    versions = ['Python 3.10']

    @freundTech
    Copy link
    Mannequin Author

    freundTech mannequin commented Apr 7, 2021

    The dataclass decorator can take multiple parameters to enable or disable the generation of certain methods.

    PEP-634 Structural Pattern Matching extends dataclasses to also generate a __match_args__ attribute.

    I think adding a parameter to enable and disable generation of __match_args__ should be to dataclass should also be considered for consistency.

    Note that setting compare=False on a dataclass.field already excludes that field from __match_args__, but there is no way to disable generation of __match_args__ completely.

    @freundTech freundTech mannequin added 3.10 only security fixes stdlib Python modules in the Lib dir labels Apr 7, 2021
    @ericvsmith
    Copy link
    Member

    What's the situation where having __match_args__ is actually harmful in some way? I understand that if the generated version is wrong, you'd want to specify it yourself. But what's the use case for not having __match_args__ at all?

    @ericvsmith ericvsmith self-assigned this Apr 7, 2021
    @ericvsmith ericvsmith added the type-bug An unexpected behavior, bug, or error label Apr 7, 2021
    @ericvsmith ericvsmith self-assigned this Apr 7, 2021
    @ericvsmith ericvsmith added the type-bug An unexpected behavior, bug, or error label Apr 7, 2021
    @brandtbucher
    Copy link
    Member

    I agree with Eric. You can already disable the automatic creation of __match_args__ by setting it yourself on the class being decorated, and "__match_args__ = ()" will make the class behave the exact same as if __match_args__ is not defined at all.

    >>> from dataclasses import dataclass
    >>> @dataclass 
    ... class X:
    ...     __match_args__ = ()
    ...     a: int
    ...     b: int
    ...     c: int
    ... 
    >>> X.__match_args__
    ()

    I too have trouble imagining a case where the actual *presence* of the attribute matters. But assuming those cases exist, wouldn't a simple "del X.__match_args__" suffice?

    @brandtbucher
    Copy link
    Member

    Note that setting compare=False on a dataclass.field already excludes that field from __match_args__...

    It appears you did find a genuine bug though! I was surprised by this comment, and after digging a bit deeper into _process_class found that we should be generating these from "field_list", not "flds":

    >>> @dataclass(repr=False, eq=False, init=False)
    ... class X:
    ...     a: int
    ...     b: int
    ...     c: int
    ... 
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1042, in wrap
        return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
      File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1020, in _process_class
        cls.__match_args__ = tuple(f.name for f in flds if f.init)
    UnboundLocalError: local variable 'flds' referenced before assignment

    @gvanrossum
    Copy link
    Member

    I assume the OP wants to have a class that doesn't allow positional patterns. The right way to spell that is indeed to add

        __match_args__ = ()

    to the class, there's no need to add another flag to @DataClass.

    @brandtbucher
    Copy link
    Member

    New changeset d92c59f by Brandt Bucher in branch 'master':
    bpo-43764: Fix __match_args__ generation logic for dataclasses (GH-25284)
    d92c59f

    @freundTech
    Copy link
    Mannequin Author

    freundTech mannequin commented Apr 9, 2021

    I assume the OP wants to have a class that doesn't allow positional patterns. The right way to spell that is indeed to add

    __match_args__ = ()

    to the class, there's no need to add another flag to @DataClass.

    The same however is also true for all the other stuff generated by @DataClass. You can for example disable generation of the init method using

        def __init__(self): pass

    and dataclass still has a parameter to disable it.

    I agree that a new parameter isn't strictly required to achieve functionality, however I would still argue that it should be added for consistency with the rest of the dataclass api.

    @ericvsmith
    Copy link
    Member

    init=False is used to make sure there's no __init__ defined, because there's a difference between a class with an __init__ and one without. If there was a difference between __match_args__ being not present and __match_args__=(), then I'd support a matchargs=False argument.

    @brandtbucher
    Copy link
    Member

    init=False is used to make sure there's no __init__ defined, because there's a difference between a class with an __init__ and one without. If there was a difference between __match_args__ being not present and __match_args__=(), then I'd support a matchargs=False argument.

    Ah, I see now how this might possibly be useful.

    If you want to inherit a parent's __match_args__ in a dataclass, it currently must be as spelled something like:

    @dataclass
    class Child(Parent):
        __match_args__ = Parent.__match_args__
        ...

    It's even uglier when you're unsure if Parent defines __match_args__ at all, or if multiple-inheritance is involved:

    @dataclass
    class Child(Parent, Mixin):
        __match_args__ = ()
        ...

    del Child.__match_args__

    I'm not sure how likely it is that code out in the wild may need to look like this. As I understand it, though, the fact that dataclasses allow for "normal" inheritance is one of their big selling-points. So it could be a valid reason to include this option.

    If it seems like the above code might become reasonably common, I agree that the proposed solution is much cleaner:

    @dataclass(match_args=False)
    class Child(Parent):
        ...

    @brandtbucher
    Copy link
    Member

    I just realized that in my first two examples, all that's really needed is a "del Child.__match_args__" following each class definition.

    The main point still stands though: this may be a more common issue than I previously thought. If it is, then a dedicated kwarg to the dataclass decorator probably makes more sense than requiring users to "del" the generated attribute.

    @ericvsmith
    Copy link
    Member

    Hmm, good point on the inheritance case. I'd forgotten that this is where init=False and others would come in handy. Although I'm having a hard time figuring out why you'd want a derived class that adds fields but wants to use the parent's __match_args__ (or __init__, for that matter), but I guess it's possible.

    I don't like del Child.__match_args__. The same pattern could be used for init=False, after all, but there's a parameter for that.

    @gvanrossum
    Copy link
    Member

    So, should we reopen this ad add the flag to suppress __match_args__ generation after all?

    @ericvsmith
    Copy link
    Member

    So, should we reopen this ad add the flag to suppress __match_args__ generation after all?

    I think so, although I'm still contemplating it.

    I can't decide if we should also add match_arg=False to field(). Or is it good enough to just tell the user to manually set __match_args__ if they want that level of control? I think this is different enough from repr, init, etc. that we don't need to allow the per-field control, although maybe doing so would make the code more robust in terms of re-ordering or adding fields at a later date.

    @gvanrossum
    Copy link
    Member

    I don't think we need control at the field level here. Omitting one field
    is just going to cause confusion.

    @ericvsmith
    Copy link
    Member

    Okay, I'll re-open this to just add @DataClass(match_args=False).

    @ericvsmith ericvsmith reopened this Apr 10, 2021
    @ericvsmith ericvsmith reopened this Apr 10, 2021
    @ericvsmith
    Copy link
    Member

    Here's a question. If __init__ is not being generated, either because the user supplied one to this class, or if init=False is specified, should __match_args__ be generated? I think the answer should be no, since the code has no idea what the parameters to __init__ will be. But I'd like to hear from people more familiar with pattern matching.

    I'm working on a patch, and this is my last issue.

    @gvanrossum
    Copy link
    Member

    Are there other cases where suppressing one thing affects others?

    @ericvsmith
    Copy link
    Member

    Are there other cases where suppressing one thing affects others?

    Only the complex interactions among the unsafe_hash, eq, and frozen parameters.

    It feels like if __init__ is not being generated, then the @DataClass code would have no idea what it should set __match_args__ to.

    Not that this problem isn't strictly related to the match_args=False case, it just occurred to me that it's related while I was writing the documentation. Perhaps this should be a separate issue.

    @gvanrossum
    Copy link
    Member

    There may be other reasons why __init__ is not desired, and there's no
    absolute requirement that __match_args__ must match the constructor --
    only a convention. I'd say don't tie them together.

    @ericvsmith
    Copy link
    Member

    I can go either way. It's easy enough for the user to add their own __match_args__, so I won't link them.

    Here's what I have for the documentation, which is why the issue came up:

    • ``match_args``: If true (the default is ``True``), the
      ``__match_args__`` tuple will be created from the list of
      parameters to the generated :meth:`__init__` method (even if
      :meth:`__init__` is not generated, see above). If false, or if
      ``__match_args__`` is already defined in the class, then
      ``__match_args__`` will not be generated.

    @gvanrossum
    Copy link
    Member

    LGTM

    @ericvsmith
    Copy link
    Member

    New changeset 750f484 by Eric V. Smith in branch 'master':
    bpo-43764: Add match_args=False parameter to dataclass decorator and to make_dataclasses function. (GH-25337)
    750f484

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants