Title: _CallList.__contains__ doesn't always respect ANY.
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ElizabethU, cjw296, iritkatriel, mariocj89, michael.foord, p-ganssle, serhiy.storchaka, xtreak
Priority: normal Keywords: patch

Created on 2019-07-11 05:12 by ElizabethU, last changed 2020-10-17 21:57 by ElizabethU.

Pull Requests
URL Status Linked Edit
PR 14700 merged python-dev, 2019-07-11 05:42
Messages (25)
msg347652 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-11 05:12
I have a test that goes something like:

def test_a_thing_calls_what_it_should(self, my_mock):
    # Set up logic here

Which fails, where my_mock.call_args_list looks like 

[(<A Django Model>, Decimal('20')), (<A Django Model>, Decimal('10'))]

This seems like wrong behavior. ANY should be happy to be compared to anything, even a Django object. Doing some digging, I found that on line 340 of cpython/Lib/unittest/ _CallList is overriding __contains__ and comparing each item in the tuples with what I'd passed in to assert_has_calls on the right, which means that instead of using ANY.__eq__, it's calling the Django model's __eq__ with ANY as an argument. Django first checks if the thing it's comparing to is another Django model, and returns False if not. So, <DjangoModel> == ANY is False, but ANY == <DjangoModel> is True. I know that this could also be considered a bug with Django, and I plan to file one with them too, but I don't see any downside to improving the mock library to be more defensive in honoring ANY over any other custom class's overridden __eq__ method.
msg347654 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-11 05:46
Thanks for the report. Can you please add an example without Django or other dependencies so that I can try reproducing it? I am trying out the below program from the report where Foo has overridden __eq__ to return False if the other object being compared is not a Foo object. In the next statements ANY == Foo() returns True since the left side's object.__eq__ is used which in this case is ANY.__eq__ as you have noted in the original report. 

When list of call objects are compared there is also a code comment about this in the tuple comparison  of args and kwargs such that ANY is placed on the left side [0] so that ANY.__eq__ is used. A pure python example and traceback if any would help here.

from unittest.mock import call, ANY, Mock

class Foo:

    def __eq__(self, other):
        if not isinstance(other, Foo):
            return False
        return True

m = Mock()
obj = Foo()
m(obj, 1)
m.assert_has_calls([call(ANY, 1)])

print(ANY == Foo()) # ANY.__eq__ is called
print(Foo() == ANY) # Foo().__eq__ is called

Is the report more about the below case where position of call objects returns different values?

print(call(ANY, 1) == call(obj, 1)) # False
print(call(obj, 1) == call(ANY, 1)) # True

msg347698 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-11 18:01
Karthikeyan, I think there is a flaw in your example. The good __eq__ should return NotImplemented instead of False for other types. This wil allow the right operand's __eq__ to play.
msg347722 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-12 05:23
> Karthikeyan, I think there is a flaw in your example. The good __eq__ should return NotImplemented instead of False for other types. This wil allow the right operand's __eq__ to play.

Serhiy, my example was to try reproducing the original report where Django model tries to check if the other's type and returns False if the type is not a Django model. There could be cases where __eq__ implementation of a class could be just returning False instead of raising a NotImplemented error due to difference in type that affects comparison of ANY.

> Django first checks if the thing it's comparing to is another Django model, and returns False if not. So, <DjangoModel> == ANY is False, but ANY == <DjangoModel> is True.
msg347729 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-12 08:43
Then I think the problem is with DjangoModel.

I suggest to close this issue as "third party".
msg347730 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-12 08:45
Though some __eq__ implementations in the stdlib should be fixed too.
msg347731 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-12 09:17

mock.ANY doesn't make any guarantees that this is a generic implementation that can be used to test equality against any object irrespective of the order to return True. It's documented only to be used for mock calls but there are many users using it for other objects too outside of mock assertions. mock.ANY also depends on the order so that mock.ANY.__eq__ is used but I am not sure about documenting the order here.

Would be helpful if Elizabeth presents a simplified example to evaluate if there are any bugs in the call comparison with ANY or to close it as third party as suggested. As noted in my PR comment there should be a test case for this change proposed to make sure someone doesn't change the order by accident in future :

> Though some __eq__ implementations in the stdlib should be fixed too.

Curious about what problems that this example surface in the stdlib definition of __eq__ . Are you saying that there are Django model like __eq__ definitions where they return False when other's type is different?
msg347733 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-12 09:23
Yes, see timedelta.__eq__ for example. datetime.__eq__ looks correct.
msg347737 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-12 10:30
Good catch, commenting out the c implementation of datetime in I can see the following difference.

➜  cpython git:(master) ✗ ./python.exe
Python 3.9.0a0 (heads/master-dirty:c8e7146de2, Jul 12 2019, 15:51:00)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime, sys, unittest.mock
>>> datetime.timedelta(seconds=1) == unittest.mock.ANY

➜  cpython git:(master) ✗ python3.7
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 16:52:21)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime, sys, unittest.mock
>>> datetime.timedelta(seconds=1) == unittest.mock.ANY

The C implementation and Python are different for timedelta__eq__ as mentioned with datetime.__eq__ being same and correct in NotImplementedError.

static PyObject *
delta_richcompare(PyObject *self, PyObject *other, int op)
    if (PyDelta_Check(other)) {
        int diff = delta_cmp(self, other);
        return diff_to_bool(diff, op);
    else {

def __eq__(self, other):
    if isinstance(other, timedelta):
        return self._cmp(other) == 0
        return False
msg347739 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-12 11:10
I can see difference in time.__eq__ and timedelta.__eq__ C and Python implementations. Serhiy, are you planning to fix them or shall I raise an issue with PR for these?
msg347753 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-12 19:55
You can take it, please.
msg347818 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-13 13:33
Would be nice to report a bug in Django.
msg347941 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-14 21:07
Hi there, I completely missed that this had caused so much interesting discussion. I've added a regression test that shows the bug I was encountering, which seems to be related to 
 spec_set bypassing _Call's overwritten __eq__ and so not respecting ANY

I agree, this is definitely also a third party bug with Django, but I think it's a problem here too, and am working on a fix.
msg347949 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-15 05:40
Thanks Elizabeth for the test. The regression test seems to be same as the case noted by Serhiy that Foo.__eq__ is not returning NotImplemented so that ANY.__eq__ can be executed. Below would be the correct implementation that passes. 

The actual comparison is done at [0]. If Foo.__eq__ returned NotImplemented due to type difference it would have called other.arguments. So there is no chance for ANY.__eq__ to be executed. I feel it's more about the third party class that needs to be fixed rather than the stdlib code here and changing the order of arguments for ANY's __eq__ precedence might introduce other subtle bugs.

self.arguments = OrderedDict([('args', (<__main__.Foo object at 0x10a54b500>,))])
other.arguments = OrderedDict([('args', (<ANY>,))])

def __eq__(self, other):
    if self is other:
        return True
    if not isinstance(other, BoundArguments):
        return NotImplemented
    return (self.signature == other.signature and
            self.arguments == other.arguments)

# Better implementation

from unittest.mock import Mock, call, ANY

class Foo(object):
     def __eq__(self, other):
          if not isinstance(other, self.__class__):
               return NotImplemented
          return True
     def __ne__(self, other): pass

mock = Mock(spec_set=Foo)
expected = [call(ANY)]



3.5 and 3.6 are in security fixes only mode. If this is considered to be a bug it can go in master, 3.8 and 3.7. The tests section is for CPython test suite and not for the unittest related bugs. So triaging it back.
msg347961 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2019-07-15 14:20
Maybe I am missing something, but while it is true that DjangoModel is doing the wrong thing by returning False instead of NotImplemented, the `ANY` sentinel is supposed to match *anything*, not just things that compare equal to it, right? I would expect this to work, for example:

  class EqualsNothing:
    def __eq__(self, other):
        return False

  m = Mock(spec_set=EqualsNothing)
  obj = EqualsNothing()

In that example, it is deliberate that EqualsNothing returns False for any type, but ANY should still match it. I think maybe the solution here is to special-case the matching with ANY so the criterion for a call matching would be `y is ANY or x == y`.
msg348068 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-17 15:50
I feel like I agree with Paul here, unsurprising behavior from ANY is it matching anything, with no expectation that third party objects have to correctly have a return path for NotImplemented on their __eq__ method. ANY doesn't currently do that.

I've added a commit with couple other failing tests showing how using assertEqual is currently order sensitive when used with _Call or _CallList objects. I've got a work in progress fixing all three tests that just needs a little clean up, but think I might be able to lessen the complexity of it by adding Paul's idea of just special casing ANY directly, which may be smaller than the fix I have.

Thank you all for providing such great feedback so far, this is really interesting stuff to me, and a great first time submitting to Python experience so far.
msg348105 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-18 07:42
Just added a fix for the tests I added in the last commit showing the order sensitivity in assertEqual when dealing with ANY and _Call or _CallList objects. _Call and _CallList currently depend on ordering to correctly process that an object being compared to ANY with __eq__ should return True. I put more notes in the commit message.

A note on the two newer tests: I want to be sensitive to the idea that I might be pushing back against desired behavior here. I think there is a case to be made that one should expect mock.assertEqual(a, b) to implement a == b, and not care about b == a, in which case maybe order sensitivity in assertEqual makes sense. That said, today I showed a few of my most Python experience heavy colleagues the tests showing that assertEqual was order sensitive when passed _Call or _CallList objects, and based on anecdotal evidence the existing behavior definitely violates the principle of least astonishment. Additionally, the test Karthikeyan kindly linked me seems to imply that an order agnostic assertEqual is the desired behavior.

A note on the fix: This fix feels like it's not as small as it could be. I am still musing over whether it'd be better if instead of the or statements and swapping the comparison order in _CallList's __contains__ method, I made a wrapper that would swap that comparison order in the case that one of the arguments is a BoundArgument, since that was the original problem. That fix would probably not affect the order sensitivity of assertEqual, so it's partially about what we want there. I'll play around with it and see what I can discover. I'll also see if I can do anything to more explicitly to check for ANY, as Paul suggested.

I'm very curious what you all think about whether the assertEqual behavior is a feature or a bug, and if the other fix idea seems better than this one. If neither of these ideas seem like a good fix, I'd love some guidance on what concerns are and what might be a good approach to try next.
msg348121 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-18 17:13
Giving this a reread with fresh eyes this morning, I realized I wasn't explicit enough in saying that I see that this fix is about the same as the one you mentioned, Karthikeyan, but said you were concerned might cause more subtle bugs. I hear that, and will try some more specific fixes, I was just trying to gather whether the tests showing 

self.assertEqual(call(ANY).call_list(), mock.mock_calls)
self.assertEqual(mock.mock_calls, call(ANY).call_list())

results in one passing and the other not (and the same for calls) changed the math on how brittle the current logic is, so just leaving it seemed like less of an option. I'll keep working on some other ideas though, whether we can check for ANY directly, or just wrap the BoundArguments that are causing problems when the mock has spec set.
msg348214 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-20 14:39
This issue is a part of more general issue. I have opened a thread on Python-Dev for discussing it:
msg348278 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2019-07-22 05:33
Serhiy, thanks for pointing that out. I generally agree with everything in that thread, and learned some new things (I had no idea count(), index() and remove() used needle on the left side!) However, I'm not trying to spearhead a rewrite of everything. I think what I'm advocating for on the two assertEqual order tests is that as Guido mentioned "people should fix their __eq__ implementation to properly return NotImplemented". _Call's __eq__ currently has no path to NotImplemented, and if it did, those tests would pass. Perhaps I should change it from the or statement to make it more clear that's what's effectively happening.

I'd like to talk more about what a good solution would be for the _Call __eq__ 
issue, since people (including myself!) have concerns, but I think it would be best to open another bug for that, and not scope creep myself. I've reverted all the commits having to do with that part so hopefully the issue with _call_matcher can get sorted separately, and then we can talk about this as it's own issue.
msg348453 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-25 21:16
See issue37685.
msg352351 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2019-09-13 15:54
New changeset d6a9d17d8b6c68073931dd8ffa213b4ac351a4ab by Paul Ganssle (Elizabeth Uselton) in branch 'master':
bpo-37555: Update _CallList.__contains__ to respect ANY (#14700)
msg378782 - (view) Author: Irit Katriel (iritkatriel) * (Python triager) Date: 2020-10-16 22:59
Does this need a backport? It's in 3.9 but not 3.8.
msg378800 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-10-17 03:56
I would prefer not backporting this change given the discussion in
msg378837 - (view) Author: Elizabeth Uselton (ElizabethU) * Date: 2020-10-17 21:57
I believe it should work in both 3.8 and 3.9, the difference is that someone (Serhiy if I'm remembering correctly?) did the work in 3.9 of standardizing which side of the = the needle was on. So, this change works on 3.9 without the _AnyComparer class I added, but I believe unless _AnyComparer has been removed, this change would work in 3.8. Xtreak, it seems like most of your concerns about backporting it are about how complicated this part of the code is, and wanting it to get more testing via the mock backport? Is there any way I can help mitigate those concerns, or keep up with how much use it's gotten from the mock backport so we can determine when (if ever) it would be a good time to backport this?
Date User Action Args
2020-10-17 21:57:05ElizabethUsetmessages: + msg378837
2020-10-17 03:56:31xtreaksetmessages: + msg378800
2020-10-16 22:59:55iritkatrielsetnosy: + iritkatriel
messages: + msg378782
2019-09-13 15:54:35p-gansslesetmessages: + msg352351
2019-07-25 21:16:11serhiy.storchakasetmessages: + msg348453
2019-07-22 05:33:08ElizabethUsetmessages: + msg348278
2019-07-20 14:39:27serhiy.storchakasetmessages: + msg348214
2019-07-18 17:13:45ElizabethUsetmessages: + msg348121
2019-07-18 07:42:49ElizabethUsetmessages: + msg348105
2019-07-18 07:26:29xtreaksetnosy: + cjw296, michael.foord, mariocj89
2019-07-17 15:50:53ElizabethUsetmessages: + msg348068
2019-07-15 14:20:09p-gansslesetnosy: + p-ganssle
messages: + msg347961
2019-07-15 05:40:17xtreaksetversions: - Python 3.5, Python 3.6
nosy: + serhiy.storchaka, xtreak

messages: + msg347949

components: + Library (Lib), - Tests
2019-07-14 21:07:28ElizabethUsetversions: + Python 3.5, Python 3.6
nosy: - serhiy.storchaka, xtreak

messages: + msg347941

components: + Tests, - Library (Lib)
2019-07-13 13:33:03serhiy.storchakasetmessages: + msg347818
2019-07-12 19:55:31serhiy.storchakasetmessages: + msg347753
2019-07-12 11:10:06xtreaksetmessages: + msg347739
2019-07-12 10:30:42xtreaksetmessages: + msg347737
2019-07-12 09:23:19serhiy.storchakasetmessages: + msg347733
2019-07-12 09:17:46xtreaksetmessages: + msg347731
2019-07-12 08:45:43serhiy.storchakasetmessages: + msg347730
2019-07-12 08:43:47serhiy.storchakasetmessages: + msg347729
2019-07-12 05:23:57xtreaksetmessages: + msg347722
2019-07-11 18:01:08serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg347698
2019-07-11 05:46:59xtreaksetversions: - Python 3.5, Python 3.6
nosy: + xtreak

messages: + msg347654

components: + Library (Lib), - Tests
2019-07-11 05:42:04python-devsetkeywords: + patch
stage: patch review
pull_requests: + pull_request14499
2019-07-11 05:12:04ElizabethUcreate