This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: dataclasses: make it easier to use user-supplied special methods
Type: behavior Stage: resolved
Components: Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: eric.smith, gvanrossum, larry, levkivskyi, ncoghlan
Priority: normal Keywords: patch

Created on 2018-01-07 17:06 by eric.smith, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4726 rhettinger, 2018-01-21 20:34
PR 5366 merged eric.smith, 2018-01-27 20:59
PR 10398 merged xtreak, 2018-11-09 02:30
Messages (25)
msg309628 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-07 17:06
Modify dataclasses to make it easier to specify special methods.

For example: currently, if you want to override __repr__, you need to specify @dataclass(repr=False), and also provide your own __repr__. Also, it's current an error to not specify repr=False and to also provide your own __repr__.  @dataclass should be able to determine that if you provide __repr__, you don't want it to provide it's own __repr__.

The methods that @dataclass will provide for you are:
__init__
__repr__
__eq__
__ne__
__lt__
__le__
__gt__
__ge__
_hash__
and if using frozen=True, also:
__setattr__
__delattr__

Per the discussion that started at https://mail.python.org/pipermail/python-dev/2017-December/151487.html, the change will be to override these methods only if:
1. the method would have been generated based on @dataclass params, and
2. the method is not present in the class's __dict__.

The first item is to allow the user to continue to suppress method generation, and the second item is so that base-class methods are not considered in the decision.

I'm still thinking through how __eq__ and __ne__ should be handled (as well as the ordering comparisons, too). I'll raise an issue on python-dev and point it here.
msg309976 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-15 13:13
Another case to consider is that if a class defines __eq__, Python will automatically set hash=False before the decorator gets involved. NOte to self: be sure to add a test case for this.
msg309979 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-15 13:19
See #32546 for the hash=None problem.
msg310392 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-22 02:41
Here is my proposal for making it easier for the user to supply dunder methods that dataclasses would otherwise automatically add to the class.

For all of these cases, when I talk about init=, repr=, eq=, order=, hash=, or frozen=, I'm referring to the parameters to the dataclass decorator.

When checking if a dunder method already exists, I mean check for an entry in the class's __dict__. I never check to see if a member is defined in a base class.

Let's get the easy ones out of the way first.

__init__
If __init__ exists or init=False, don't generate __init__.

__repr__
If __repr__ exists or repr=False, don't generate __repr__.

__setattr__
__delattr__
If frozen=True and either of these methods exist, raise a TypeError. These methods are needed for the "frozen-ness" of the class to be implemented, and if they're already set then that behavior can't be enforced.

__eq__
If __eq__ exists or eq=False, don't generate __eq__.

And now the harder ones:

__ne__
I propose to never generate a __ne__ method. Python will call __eq__ and negate the result for you. If you feel you really need a non-matching __ne__, then you can write it yourself without interference from dataclasses.

__lt__
__le__
__gt__
__ge__
I propose to treat each of these individually, but for each of them using the value of the order= parameter. So:
If __lt__ exists or order=False, don't generate __lt__.
If __le__ exists or order=False, don't generate __le__.
If __gt__ exists or order=False, don't generate __gt__.
If __ge__ exists or order=False, don't generate __ge__.
If for some crazy reason you want to define some of these but not others, then set order=False and write your desired methods.

__hash__
Whether dataclasses might generate __hash__ depends on the values of hash=, eq=, and frozen=. Note that the default value of hash= is None. See below for an explanation.

If hash=False, never generate __hash__. If hash=True, generate __hash__ unless it already exists.

If hash=None (the default), then use this table to decide whether and how to generate __hash__:
eq=?	frozen=?	__hash__
False	False		do not generate __hash__
False	True		do not generate __hash__
True	False		set __hash__ to None unless it already exists
True	True		generate __hash__ unless it already exists
                         and is None

Note that it's almost always a bad idea to specify hash=True or hash=False. The action based on the above table (where hash=None, the default), is usually the correct behavior.

One special case to recognize is if the class defines a __eq__. In this case, Python will assign __hash__=None before the dataclass decorator is called. The decorator cannot distinguish between these two cases (except possibly by using the order of __dict__ keys, but that seems overly fragile):

@dataclass
class A:
    def __eq__(self, other): pass

@dataclass
class B:
    def __eq__(self, other): pass
    __hash__ = None

This is the source of the last line in the above table: for a dataclass where eq=True, frozen=True, and hash=None, if __hash__ is None it will still be overwritten. The assumption is that this is what the user wants, but it's a tricky corner case. It also occurs if setting hash=True and defining __eq__. Again, it's not expected to come up in normal usage.
msg310394 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2018-01-22 03:08
Do all the parameters to the decorator take a default value of "None", so that you can differentiate between explicit True, explicit False, and did-not-specify?

Is this the complete list of these dunder-method-generation-control parameters? init repr eq order hash
msg310399 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-22 07:37
Only hash has the tri-state True/False/None behavior, defaulting to None. It's this way because None is the "do what's rational, based on eq and frozen" behavior. None of the other parameters work this way. 

There's a long issue in the attrs repo that describes how they came to this conclusion: https://github.com/python-attrs/attrs/issues/136. I think it makes sense.

None of the dataclass parameters have a sentinel that would let me detect if the user provided a value or not. In the case of hash, I can't detect if they explicitly passed hash=None or just didn't provide a value. I've given this some thought, and couldn't come up with a use case for knowing this. For example, if init had a sentinel value of MISSING, what would the code ever do except start with:
if init is sentinel:
    init = True
?

In addition to your list of dunder-control-parameters, there's frozen. It determines if instances are immutable (to the extent that they can be made immutable).
msg310402 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-22 07:55
Oops, that should have been:

if init is MISSING:
    init = True
msg310404 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2018-01-22 08:58
So if I understand correctly: the default value of the "repr" parameter is True.


Decision matrix: does dataclass insert a __repr__ into the class?

  +-- row: argument passed in to decorator's repr parameter
  |
  |
  v    | yes     | no      |  <--- columns: does the class have a __repr__?
-------+---------+---------+
 True  | ???     | yes     |
-------+---------+---------+
 False | no      | no      |
-------+---------+---------+

If the user specifies "repr=True", and also specifies their own __repr__ in the class, does dataclasses stomp on their __repr__ with its own?  Does it throw an exception?


I still prefer the tri-state value here.  In this version, the default value of the "repr" parameter would be None, and the decision matrix for inserting a __repr__ looks like this:

  +-- row: argument passed in to decorator's repr parameter
  |
  |
  v    | yes     | no      |  <--- columns: does the class have a __repr__?
-------+---------+---------+
 True  | raise   | yes     |
-------+---------+---------+
 None  | no      | yes     |
-------+---------+---------+
 False | no      | no      |
-------+---------+---------+

But we've talked about using the tri-state default for all of these parameters before, and clearly you weren't swayed then, so I guess I've said my peace and I'll stop suggesting it.
msg310417 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-22 11:02
The discussion on python-dev was that your ??? box would be "no": if the user supplied __repr__, they obviously meant for dataclass() to not provide one.

I can't see why the user would say repr=True ("I want dataclass() to add __repr__"), but then provide a __repr__ and get an exception. That looks like the only functionality added by your repr=True row over the proposal. Where your proposal uses repr=None for the "no", "yes" row, mine uses repr=True.

It's not like there's action at a distance here: the user is writing the class. Especially since base classes are ignored. 

I'm ignoring make_dataclasses(), where the user is dynamically creating a class and maybe a __repr__ snuck in. But I don't care so much about that case.

I do think your ascii tables are a good way of explaining this. Thanks! (Now I need a 3D version for eq, frozen, hash!)
msg310462 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-23 02:36
For the ordering operators, my only question would be whether or not I can rely on them to act like functools.total_ordering: if I supply __eq__ and one of the ordering operators (e.g. __lt__), will dataclasses make sure the other three ordering operators are consistent with those base methods? Or will it bypass them and act directly on the underlying fields?

My suggestion would be to say that if any of __lt__, __le__, __gt__ or __ge__ are defined, then data classes will implicitly generate the other methods based on functools.total_ordering semantics, and will only reference the underlying fields directly if *none* of them are defined. Otherwise I can see folks defining a single method like "__lt__", and being surprised when they end up with inconsistent comparison behaviour.
msg310463 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-23 02:51
Rather than re-implementing (and maintaining) functools.total_ordering semantics, I'd rather advise them to specify order=False and just use functools.total_ordering. It's an odd use case for dataclasses, anyway.
msg310464 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-23 02:57
I'd be fine with that recommendation (since `@dataclass(order=False)` and `@total_ordering` will compose without any problems), but in that case I'd suggest having "order=True" + any of the ordering methods result in an exception (as you've proposed for frozen=True and the methods that it needs to override).
msg310556 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-24 03:27
Are we clear on the proposal now? It would be good to have this implemented in beta 1. Eric's long message looks good to me (though I have to admit the logic around __hash__ confuses me, but I've spent two long days at PyCascades, and I trust Eric's judgment).
msg310569 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-24 09:12
I'm working on a patch before beta 1. The only issue is whether to raise in the case that order=True and the user specifies one of the 4 ordering dunders. I think raising is okay, especially since order=False is the default: you'd have to go out of your way to cause the exception. I'll mention functools.total_ordering in the exception message.

I do think that the discussion in my long message about __hash__ is correct, although I'm not done thinking it through. I think it's already pretty well tested, but if not I'll make sure there are enough test cases for objects that I think should or shouldn't be hashable that I'll have all of the kinks worked about before beta 1.
msg310653 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-25 01:33
Raising for order=True if one of the ordering dunders exists sounds fine.

I am confused by the corner case for hash. Your table:

"""
eq=?	frozen=?	__hash__
False	False		do not generate __hash__
False	True		do not generate __hash__
True	False		set __hash__ to None unless it already exists
True	True		generate __hash__ unless it already exists
                         and is None
"""

Then you write at the end of that message:

"""
One special case to recognize is if the class defines a __eq__. In this case, Python will assign __hash__=None before the dataclass decorator is called. The decorator cannot distinguish between these two cases (except possibly by using the order of __dict__ keys, but that seems overly fragile):

@dataclass
class A:
    def __eq__(self, other): pass

@dataclass
class B:
    def __eq__(self, other): pass
    __hash__ = None

This is the source of the last line in the above table: for a dataclass where eq=True, frozen=True, and hash=None, if __hash__ is None it will still be overwritten. The assumption is that this is what the user wants, but it's a tricky corner case. It also occurs if setting hash=True and defining __eq__. Again, it's not expected to come up in normal usage.
"""

I think I understand what you are saying there -- the two cases are treated the same, and a __hash__ is created (assuming the decorator is really "@dataclass(eq=True, frozen=True)"), overwriting the "__hash__ = None" for class B.

However the table's last line says "generate __hash__ unless it already exists and is None". Perhaps that was a typo and you meant to write "and is *not* None"?
msg310668 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-25 09:40
I'll look at the hash question later today. It's sufficiently complex that I really want to sit down and work through all of the cases.
msg310760 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-26 13:24
Here's where I am on what methods are added under which circumstances. Hopefully these tables don't get mangled.

The more I think about what to do with hash and raising exceptions, the more I think that the right answer is to not raise any exceptions, and just follow the rule of "don't overwrite an attribute that the user specified". The uses of hash=True or hash=None are so specialized that users can be expected to understand the implications. The interaction among hash=, eq=, and frozen= are sufficiently complicated (see the last table below) that we don't need to add any more rules.

There are only 2 places below where exceptions are raised:

1. frozen=True and the user supplied __setattr__ or __delattr__. These methods are crucial for frozen-ness, so we must be able to add them.

2. order=True and the user supplied at least one of __lt__, __le__, __ge__, or __gt__. This is at Nick's request (see msg310464 in this issue). I'm not so sure about this one: if the user wants to explicitly ask that these methods be added (by specifying order=True, which is not the default), and they also specify some of the ordering function, then why not let them, and have dataclass add the missing ones as we normally would? But I'm not heavily invested in the outcome, as I don't see any reason to write such code, anyway. The only reason I'd suggest leaving it out is to avoid having to explain it in the documentation.

I have this all coded up and ready to create a PR, modulo a few more tests. I'll get it committed before Beta 1. Hopefully these tables answers the questions about the new proposal.

# Conditions for adding methods:
# added = method is added
#       = no action: method is not added
# raise = TypeError is raised
# None  = attribute is set to None

# __init__
#
#   +--- init= parameter
#   |
#   v    |   yes   |   no    |  <--- class has __init__?
# -------+---------+---------+
#  False |         |         |
# -------+---------+---------+
#  True  |         | added   |  <- the default
# -------+---------+---------+

# __repr__
#
#   +--- repr= parameter
#   |
#   v    |   yes   |   no    |  <--- class has __repr__?
# -------+---------+---------+
#  False |         |         |
# -------+---------+---------+
#  True  |         | added   |  <- the default
# -------+---------+---------+

# __setattr__
# __delattr__
#
#   +--- frozen= parameter
#   |
#   v    |   yes   |   no    |  <--- class has __setattr__ or __delattr__?
# -------+---------+---------+
#  False |         |         |  <- the default
# -------+---------+---------+
#  True  | raise * | added   |
# -------+---------+---------+
#
# * Raise because not adding these methods
#    would break the "frozen-ness" of the class.

# __eq__
#
#   +--- eq= parameter
#   |
#   v    |   yes   |   no    |  <--- class has __eq__?
# -------+---------+---------+
#  False |         |         |
# -------+---------+---------+
#  True  |         | added   |  <- the default
# -------+---------+---------+

# __lt__
# __le__
# __gt__
# __ge__
#
#   +--- order= parameter
#   |
#   v    |   yes   |   no    |  <--- class has any comparison method?
# -------+---------+---------+
#  False |         |         |  <- the default
# -------+---------+---------+
#  True  | raise * | added   |
# -------+---------+---------+
#
# * Raise because to allow this case would interfere with using
#    functools.total_ordering.

# __hash__

#   +------------------- hash= parameter
#   |       +----------- eq= parameter
#   |       |       +--- frozen= parameter
#   |       |       |
#   v    |  v    |  v    |  yes  |  no   |  <--- class has __hash__?
# -------+-------+-------+-------+-------+
#  False | False | False |       |       |
# -------+-------+-------+-------+-------+
#  False | False | True  |       |       |
# -------+-------+-------+-------+-------+
#  False | True  | False |       |       |
# -------+-------+-------+-------+-------+
#  False | True  | True  |       |       |
# -------+-------+-------+-------+-------+
#  True  | False | False |       | added | Has no __eq__, but hashable
# -------+-------+-------+-------+-------+
#  True  | False | True  |       | added | Has no __eq__, but hashable
# -------+-------+-------+-------+-------+
#  True  | True  | False |       | added | Not frozen, but hashable
# -------+-------+-------+-------+-------+
#  True  | True  | True  |       | added | Frozen, so hashable
# -------+-------+-------+-------+-------+
#  None  | False | False |       |       | No __eq__, use the base class __hash__
# -------+-------+-------+-------+-------+
#  None  | False | True  |       |       | No __eq__, use the base class __hash__
# -------+-------+-------+-------+-------+
#  None  | True  | False |       | None  | <-- the default, not hashable
# -------+-------+-------+-------+-------+
#  None  | True  | True  |       | added | Frozen, so hashable
# -------+-------+-------+-------+-------+
#
# For boxes that are blank, __hash__ is untouched and therefore
#  inherited from the base class.  If the base is object, then
#  id-based hashing is used.
# Note that a class may have already __hash__=None if it specified an
#  __eq__ method in the class body (not one that was created by
#  dataclass).
msg310775 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-26 16:24
I'm working on a better formulation of the __hash__ issue, so don't invest much time in this until I post my next message. I'm going to re-organize the various tables to make them easier to read.
msg310776 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-26 16:30
Can you also clarify that this *never* looks at whether the method is
implemented in a superclass?

And clarify what happens with the __hash__ = None that's automatically
created when you define __eq__.

Please also repeat the default value of the flag in each case.
msg310830 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-27 01:52
I apologize for the length of this, but I want to be as precise as
possible.  I've no doubt made some mistakes, so corrections and
discussion are welcomed.

I'm adding the commented text at the end of this message to
dataclasses.py. I'm repeating it here for discussion.  These tables
are slightly different from previous versions on this issue, and I've
added line numbers to the __hash__ table to make it easier to discuss.
So any further comments and discussion should reference the tables in
this message.

I think the __init__, __repr__, __set/delattr__, __eq__, and ordering
tables are not controversial.

** hash

For __hash__, the 99.9% use case is that the default value of
hash=None is sufficient.  This is rows 1-4 of that table.  It's
unfortunate that I have to spend so much of the following text
describing cases that I think will rarely be used, but I think it's
important that in these special cases we don't do anything surprising.

**** hash=None

First, let's discuss hash=None, which is the default.  This is lines
1-4 of the __hash__ table.

Here's an example of line 1:

@dataclass(hash=None, eq=False, frozen=False)
class A:
    i: int

The user doesn't want an __eq__, and the class is not frozen.  Whether
or not the user supplied a __hash__, no hash will be added by
@dataclass.  In the absense of __hash__, the base class (in this case,
object) __hash__.  object.__hash__ and object.__eq__ are based on
object identity.

Here's an example of line 2:

@dataclass(hash=None, eq=False, frozen=True)
class A:
    i: int

This is a frozen class, where the user doesn't want an __eq__.  The
same logic is used as for line 1: no __hash__ is added.

Here's an example of line 3, "no" column (no __hash__).  Note that
this line shows the default values for hash=, eq=, frozen=:

@dataclass(hash=None, eq=True, frozen=False)
class A:
    i: int

In this case, if the user doesn't provide __hash__ (the "no" column),
we'll set __hash__=None.  That's because it's a non-frozen class with
an __eq__, it should not be hashable.

Here's an example of the line 3, "yes" column (__hash__ defined):

@dataclass(hash=None, eq=True, frozen=False)
class A:
    i: int
    def __hash__(self): pass

Since a __hash__ exists, it will not be overwritten.  Note that this
is also true if __hash__ were set to None.  We're just checking that
__hash__ exists, not it's value.

Here's an example of line 4, "no" column:

@dataclass(hash=None, eq=True, frozen=True)
class A:
    i: int

In this case the action is "add", and a __hash__ method will be
created.

And finally, consider line 4, "yes" column.  This case is hash=None,
eq=True, frozen=True.  Here we also apply the auto-hash test, just
like we do in the hash=True case (line 12, "yes" column).  We want to
make the frozen instances hashable, so we add our __hash__ method,
unless the user has directly implemented __hash__.

**** hash=False

Next, consider hash=False (rows 5-8).  In this case, @dataclass will
never add a __hash__ method, and if one exists it is not modified.

**** hash=True

Lastly, consider hash=True.  This is rows 9-12 of the __hash__ table.
The user is saying they always want a __hash__ method on the class.
If __hash__ does not already exist (the "no" column), then @dataclass
will add a __hash__ method.  If __hash__ does already exist in the
class's __dict__, @dataclass will overwrite it if and only if the
current value of __hash__ is None, and if the class has an __eq__ in
the class definition.  Let's call this condition the "auto-hash test".
The assumption is that the only reason a class has a __hash__ of None
and an __eq__ method is that the __hash__ was automatically added by
Python's class creation machinery.  And if the hash was auto-added,
then by the user specifying hash=True, they're saying they want the
__hash__ = None overridden by a generated __hash__ method.

I'll give examples from line 9 of the __hash__ table, but this
behavior is the same for rows 9-12.  Consider:

@dataclass(hash=True, eq=False, frozen=False)
class A:
    i: int
    __hash__ = None

This is line 9, "yes" column, action="add*", which says to add a
__hash__ method if the class passes the auto-hash test.  In this case
the class fails the test because it doesn't have an __eq__ in the
class definition.  __hash__ will not be overwritten.

Now consider:

@dataclass(hash=True, eq=False, frozen=False)
class A:
    i: int
    def __eq__(self, other): ...

This again is line 9, "yes" column, action="add*".  In this case, the
user is saying to add a __hash__, but it already exists because Python
automatically added __hash__=None when it created the class.  So, the
class passes the auto-hash test.  So even though there is a __hash__,
we'll overwrite it with a generated method.

Now consider:

@dataclass(hash=True, eq=False, frozen=False)
class A:
    i: int
    def __eq__(self, other): ...
    def __hash__(self): ...

Again, this is line 9, "yes" column, action="add*". The existing
__hash__ is not None, so we don't overwrite the user's __hash__
method.

Note that a class can pass the auto-hash test but not have an
auto-generated __hash__=None.  There's no way for @dataclass to
actually know that __hash__=False was auto-generated, it just assumes
that that's the case.  For example, this class passes the auth-hash
test and __hash__ will be overwritten:

@dataclass(hash=True, eq=False, frozen=False)
class A:
    i: int
    def __eq__(self, other): ...
    __hash__=None

A special case to consider is lines 11 and 12 from the table.  Here's an
example of line 11, but line 12 is the same for the purposes of this
discussion:

@dataclass(hash=True, eq=True, frozen=False)
class A:
    i: int
    __hash__=None

The class will have a generated __eq__, because eq=True.  However, the
class still fails the auto-hash test, because the class's __dict__ did
not have an __eq__ that was added by the class definition.  Instead,
it was added by @dataclass.  So this class fails the auto-hash test
and the __hash__ value will not be overwritten.


Tables follow:

# Conditions for adding methods.  The boxes indicate what action the
#  dataclass decorator takes.  For all of these tables, when I talk
#  about init=, repr=, eq=, order=, hash=, or frozen=, I'm referring
#  to the arguments to the @dataclass decorator.  When checking if a
#  dunder method already exists, I mean check for an entry in the
#  class's __dict__.  I never check to see if an attribute is defined
#  in a base class.

# Key:
# +=========+=========================================+
# + Value   | Meaning                                 |
# +=========+=========================================+
# | <blank> | No action: no method is added.          |
# +---------+-----------------------------------------+
# | add     | Generated method is added.              |
# +---------+-----------------------------------------+
# | add*    | Generated method is added only if the   |
# |         |  existing attribute is None and if the  |
# |         |  user supplied a __eq__ method in the   |
# |         |  class definition.                      |
# +---------+-----------------------------------------+
# | raise   | TypeError is raised.                    |
# +---------+-----------------------------------------+
# | None    | Attribute is set to None.               |
# +=========+=========================================+

# __init__
#
#   +--- init= parameter
#   |
#   v     |       |       |
#         |  no   |  yes  |  <--- class has __init__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+

# __repr__
#
#    +--- repr= parameter
#    |
#    v    |       |       |
#         |  no   |  yes  |  <--- class has __repr__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+


# __setattr__
# __delattr__
#
#    +--- frozen= parameter
#    |
#    v    |       |       |
#         |  no   |  yes  |  <--- class has __setattr__ or __delattr__ in __dict__?
# +=======+=======+=======+
# | False |       |       |  <- the default
# +-------+-------+-------+
# | True  | add   | raise |
# +=======+=======+=======+
# Raise because not adding these methods would break the "frozen-ness"
#  of the class.

# __eq__
#
#    +--- eq= parameter
#    |
#    v    |       |       |
#         |  no   |  yes  |  <--- class has __eq__ in __dict__?
# +=======+=======+=======+
# | False |       |       |
# +-------+-------+-------+
# | True  | add   |       |  <- the default
# +=======+=======+=======+

# __lt__
# __le__
# __gt__
# __ge__
#
#    +--- order= parameter
#    |
#    v    |       |       |
#         |  no   |  yes  |  <--- class has any comparison method in __dict__?
# +=======+=======+=======+
# | False |       |       |  <- the default
# +-------+-------+-------+
# | True  | add   | raise |
# +=======+=======+=======+
# Raise because to allow this case would interfere with using
#  functools.total_ordering.

# __hash__

#      +------------------- hash= parameter
#      |       +----------- eq= parameter
#      |       |       +--- frozen= parameter
#      |       |       |
#      v       v       v    |        |        |
#                           |   no   |  yes   |  <--- class has __hash__ in __dict__?
# +=========+=======+=======+========+========+
# | 1 None  | False | False |        |        | No __eq__, use the base class __hash__
# +---------+-------+-------+--------+--------+
# | 2 None  | False | True  |        |        | No __eq__, use the base class __hash__
# +---------+-------+-------+--------+--------+
# | 3 None  | True  | False | None   |        | <-- the default, not hashable
# +---------+-------+-------+--------+--------+
# | 4 None  | True  | True  | add    | add*   | Frozen, so hashable
# +---------+-------+-------+--------+--------+
# | 5 False | False | False |        |        |
# +---------+-------+-------+--------+--------+
# | 6 False | False | True  |        |        |
# +---------+-------+-------+--------+--------+
# | 7 False | True  | False |        |        |
# +---------+-------+-------+--------+--------+
# | 8 False | True  | True  |        |        |
# +---------+-------+-------+--------+--------+
# | 9 True  | False | False | add    | add*   | Has no __eq__, but hashable
# +---------+-------+-------+--------+--------+
# |10 True  | False | True  | add    | add*   | Has no __eq__, but hashable
# +---------+-------+-------+--------+--------+
# |11 True  | True  | False | add    | add*   | Not frozen, but hashable
# +---------+-------+-------+--------+--------+
# |12 True  | True  | True  | add    | add*   | Frozen, so hashable
# +=========+=======+=======+========+========+
# For boxes that are blank, __hash__ is untouched and therefore
#  inherited from the base class.  If the base is object, then
#  id-based hashing is used.
# Note that a class may have already __hash__=None if it specified an
#  __eq__ method in the class body (not one that was created by
#  @dataclass).
msg310836 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-01-27 02:58
Eric's current proposal sounds sensible to me, but I'll note that if we deem it necessary, the code that implicitly sets `__hash__ = None` to override object.__hash__ when __eq__ is defined could also set some other marker to make it more explicit that that happened.

I think Eric's proposed heuristic will be sufficient, though - for subclasses that inherit both __hash__ and __eq__ from a base class the dataclass decorator won't pay any attention to either of them, and if a base class has set __hash__ to something, then the type creation machinery won't inject "__hash__ = None" into the class being defined.
msg310870 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-27 16:32
If Nick says it's okay it is okay. I think it's okay too but my brain
doesn't want to start up today -- I suggest not waiting for me.
msg310871 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-27 16:36
I'm completely burned out on it, too. I'll get the code checked in today, then update the PEP later. I don't really want to add all of this reasoning to the PEP, but I guess I should.

I think the gist of it is correct, in any event. We can fix some corner cases during the betas, if need be.

Thanks Guido and Nick for looking at it.
msg310901 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-01-28 00:07
New changeset ea8fc52e75363276db23c6a8d7a689f79efce4f9 by Eric V. Smith in branch 'master':
bpo-32513: Make it easier to override dunders in dataclasses. (GH-5366)
https://github.com/python/cpython/commit/ea8fc52e75363276db23c6a8d7a689f79efce4f9
msg311149 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-29 16:49
Nit: I think the presentation of the 12-row table in msg310830 can be improved, e.g. by putting the legend for add* closer and/or giving add* a clearer name, and by splitting it into three 4-row tables. Currently the legend comes before all the other tables, and those aren't very useful (they're all the same, except for 'order', and the exception there is better explained in text).

But that's a matter for whoever writes the docs. As a spec it is unambiguous. On python-dev I've just posted that I support this version.
History
Date User Action Args
2022-04-11 14:58:56adminsetgithub: 76694
2018-11-09 02:30:23xtreaksetpull_requests: + pull_request9702
2018-11-07 19:33:29xtreaksetpull_requests: - pull_request9682
2018-11-07 19:32:31xtreaksetpull_requests: + pull_request9682
2018-01-29 16:49:14gvanrossumsetmessages: + msg311149
2018-01-28 00:09:26eric.smithsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-01-28 00:07:42eric.smithsetmessages: + msg310901
2018-01-27 20:59:22eric.smithsetkeywords: + patch
pull_requests: + pull_request5209
2018-01-27 16:36:15eric.smithsetmessages: + msg310871
2018-01-27 16:32:34gvanrossumsetmessages: + msg310870
2018-01-27 02:58:34ncoghlansetmessages: + msg310836
2018-01-27 01:52:18eric.smithsetmessages: + msg310830
2018-01-26 16:30:13gvanrossumsetmessages: + msg310776
2018-01-26 16:24:32eric.smithsetmessages: + msg310775
2018-01-26 13:24:50eric.smithsetmessages: + msg310760
2018-01-25 09:40:05eric.smithsetmessages: + msg310668
2018-01-25 01:33:14gvanrossumsetmessages: + msg310653
2018-01-24 09:12:12eric.smithsetmessages: + msg310569
2018-01-24 03:27:32gvanrossumsetnosy: + gvanrossum
messages: + msg310556
2018-01-23 02:57:51ncoghlansetmessages: + msg310464
2018-01-23 02:51:08eric.smithsetmessages: + msg310463
2018-01-23 02:36:18ncoghlansetnosy: + ncoghlan
messages: + msg310462
2018-01-22 11:02:11eric.smithsetmessages: + msg310417
2018-01-22 08:58:14larrysetmessages: + msg310404
2018-01-22 07:55:52eric.smithsetmessages: + msg310402
2018-01-22 07:37:56eric.smithsetmessages: + msg310399
2018-01-22 03:08:41larrysetnosy: + larry
messages: + msg310394
2018-01-22 02:41:21eric.smithsetkeywords: - patch

messages: + msg310392
2018-01-21 20:34:14rhettingersetkeywords: + patch
stage: patch review
pull_requests: + pull_request5108
2018-01-15 13:19:37eric.smithsetmessages: + msg309979
2018-01-15 13:13:56eric.smithsetmessages: + msg309976
2018-01-08 00:57:35levkivskyisetnosy: + levkivskyi
2018-01-07 17:06:05eric.smithcreate