Title: Inconsistent calls to __eq__ from built-in __contains__
Type: behavior Stage: needs patch
Components: Documentation Versions: Python 3.9, Python 3.8
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, r.david.murray, rhettinger, steven.daprano, terry.reedy, vfaronov
Priority: normal Keywords:

Created on 2016-07-24 12:13 by vfaronov, last changed 2019-05-30 20:41 by cheryl.sabella.

File name Uploaded Description Edit vfaronov, 2016-07-24 12:13 Example demonstrating the problem
Messages (5)
msg271145 - (view) Author: Vasiliy Faronov (vfaronov) Date: 2016-07-24 12:13
Consider the attached example program. I expect it to run successfully, because the Python 3 language reference says [1]:

> For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression `x in y` is equivalent to `any(x is e or x == e for e in y)`.

and [2]:

> `x==y` calls `x.__eq__(y)`

Instead, under Python 3.5.2, the program crashes with an assertion error, because `dict.__contains__` calls `Bar.__eq__` instead of `Foo.__eq__`.

The same happens if you replace the dict with a set or a frozenset. But if you replace it with a list or a tuple, the behavior is as documented.

This seems to me like a bug in either the implementation or the documentation.

The language reference clearly says [3] that equality should be symmetric "if possible", but is not required to be, and indeed that is hard to guarantee when different classes are involved.

msg271154 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2016-07-24 14:23
I'm reluctant to call this a behaviour bug, because I'm reluctant to guarantee the *precise* behaviour when the classes are different.

I haven't checked the source of dict.__contains__, but by experimentation it seems that:

    needle in {key: value}

calls key == needle, rather than needle == key as the docs would imply. This seems to apply going all the way back to at least CPython2.4.

On the other hand, Jython2.5 and IronPython2.6 seem to call needle == key as Vasily expected.

But the docs are clearly incomplete, as they suggest a linear search through all the keys of the dict (or set), which is wrong. Only if needle hashes to the same value as key will == be called.

I suggest updating the docs to make it clear that the sample code shown is only the *approximate* behaviour, which may call == in either direction.
msg271158 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-24 14:42
I did look at the code.  Indeed list and tuple compare x to e, while dict, set, and frozenset (well, I didn't check each one, just list and set) compare the e to x, where e is they key stored at hash(x).

Steve has a good point about the iteration.  And a user class can do anything it wants in contains.  I wonder if it would be even more accurate to say "conceptually equal" rather than "approximately equal", given that hash table 'in' doesn't do iteration at all.
msg271165 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-07-24 16:06
+1 for "approximately equivalent".   The "conceptually equivalent" wording raises more questions in my mind than it answers.  

What we're really trying to say here is that for typically symmetric operations such as __eq__(), an implementation is free to use either a==b or b==a.  The choice usually doesn't matter, but can be revealed by a clever user.

Pretty all of the pure python equivalents in the docs are only approximate equivalents.  1) Tracebacks are different.  2) Timing of exceptions can be different (C functions tend to check arguments before the body of the function and pure python code tends to check when the argument is used).  3) Thread-safety may be different (C datatypes have to work hard to prevent segfaulting when a container mutates in the middle of a call, but in pure python code that tends to be too expensive and the consequences don't result in a segfault). 4) The internal algorithms may be different.

With respect to 4, we really do want to preserve implementation freedom.  Otherwise, it would have been impossible to switch to the timsort algorithm or for set intersection to decide to iterate over the smaller input set.  Another reason that implementation freedom is important is that what is fast, clear, and convenient in one implementation is may by slow, obscure, and difficult in another.

Lastly, the primary purpose of the pure python approximate equivalents in the docs is to help people get a high level understanding of what is going on.  If we were to modify the pure python code to be an exact equivalent, it would 1) tend to over-specify the implementation resulting in unnecessary rigidity and 2) it would tend to be more obscure, defeating our goal of making the docs more clear.

Perhaps, there should be a general FAQ entry to this effect.  The approximate pure python equivalents should be read loosely with the goal of understanding overall functionality and not be interpreted as exact drop in replacements (the PyPy folks can attest that exact drop-in replacements tend to be much more difficult than you might guess).

The FAQ should also mention that in order to make even simple examples of pure python code possible, there needs to be generous reading of the code the precludes the uncommon, exotic, or diabolical contortions that the language is capable of doing.  For example, after "b = a", there a values that cause all of these assertions to fail "assert a==b; assert b==a; assert a==b;".  For another example, after "s = {a, a}", there are values than can cause these to fail "assert a in s;  assert len(s) == 1".
msg271640 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-07-29 17:56
Here is a more accurate 'equivalent': "any(x is e or <one of x==e or e==x> for e in <candidate members of y>)" where 'candidate members' is "members with the same hash as x" for built-in hash-based collections and 'all members' for built-in unsorted sequences".  I am sure that built-in range does a direct calculation.  Instead of the above, I suggest this.

"For general collections, such as list, tuple, collections.deque, and most iterables, the expression `x in y` is usually equivalent to `any(x is e or x == e for e in y)`."

Hash-based collections should not be mentioned in such a sentence as the equivalency is completely wrong, as Steven noted.
Date User Action Args
2019-05-30 20:41:50cheryl.sabellasetversions: + Python 3.8, Python 3.9
nosy: + docs@python

assignee: docs@python
components: + Documentation
stage: needs patch
2016-07-29 17:56:25terry.reedysetnosy: + terry.reedy
messages: + msg271640
2016-07-24 16:06:24rhettingersetmessages: + msg271165
2016-07-24 14:42:30r.david.murraysetnosy: + r.david.murray
messages: + msg271158
2016-07-24 14:36:35serhiy.storchakasetnosy: + rhettinger
2016-07-24 14:23:28steven.dapranosetnosy: + steven.daprano
messages: + msg271154
2016-07-24 12:13:15vfaronovcreate