classification
Title: Add missing multiset predicates to collections.Counter
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: rhettinger, veky
Priority: normal
Keywords: patch

Created on 2020-05-24 15:11 by rhettinger, last changed 2020-05-31 21:57 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked
PR 20339 merged rhettinger, 2020-05-24 15:11
PR 20548 merged rhettinger, 2020-05-31 01:53
Messages (7)
msg369808 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-24 15:11
These missing predicates have been requested a number of times:

   isequal()
   issubset()
   issuperset()
   isdisjoint()
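
For readers unfamiliar with the multiset meaning of these names, here is a minimal sketch of the intended semantics. It uses standalone helper functions that take two Counters; that shape, and the exact definitions (in particular treating a missing element as a zero count, and "disjoint" as "no element with a positive count in both"), are assumptions for illustration and not the code from the patch.

```python
from collections import Counter


def isequal(a, b):
    """True if every element has the same count in both (missing means 0)."""
    return all(a[e] == b[e] for c in (a, b) for e in c)


def issubset(a, b):
    """True if no element's count in a exceeds its count in b."""
    return all(a[e] <= b[e] for c in (a, b) for e in c)


def issuperset(a, b):
    """True if every element's count in a is at least its count in b."""
    return issubset(b, a)


def isdisjoint(a, b):
    """True if no element has a positive count in both a and b."""
    return not any(a[e] > 0 and b[e] > 0 for e in a)


small = Counter("ab")      # a: 1, b: 1
large = Counter("aabbc")   # a: 2, b: 2, c: 1
other = Counter("xyz")     # shares no elements with small
```

With these definitions, issubset(small, large) and isdisjoint(small, other) both hold, while isdisjoint(small, large) does not.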
msg369938 - Author: Vedran Čačić (veky) * Date: 2020-05-26 03:47
isequal is really strange considering we're talking about Python here. Do any other stdlib types have that method instead of just using == (which works fine even now)? I'd even spell the second and third as <= and >=, same as set does.

But if we're finally going to accept that Counters are just bags (CS term for multisets), then surely .add and .remove (and maybe .discard, setting the count to 0) would be more natural additions. I can't count (pun intended) all the times I had to write that '] += 1' just to count some element.
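
To make the suggestion concrete, here is a rough sketch of what such methods could look like on a subclass. The class name Bag and all three methods are hypothetical; nothing like this exists in the stdlib, and the behavior of remove on an absent element (raising KeyError, as set.remove does) is an assumption.

```python
from collections import Counter


class Bag(Counter):
    """Hypothetical Counter subclass with set-like mutators (not a stdlib API)."""

    def add(self, elem):
        # Equivalent to the usual `bag[elem] += 1` idiom.
        self[elem] += 1

    def remove(self, elem):
        # Assumed behavior: mirror set.remove and fail on an absent element.
        if self[elem] <= 0:
            raise KeyError(elem)
        self[elem] -= 1

    def discard(self, elem):
        # As described in the message: set the count to 0.
        self[elem] = 0


words = Bag()
for word in "to be or not to be".split():
    words.add(word)

words.remove("to")    # "to" drops from 2 to 1
words.discard("be")   # "be" drops from 2 to 0
```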
msg370228 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-28 15:35
New changeset 60398512c86c5535edd817c99ccb50453b3b0471 by Raymond Hettinger in branch 'master':
bpo-40755: Add missing multiset operations to Counter() (GH-20339)
https://github.com/python/cpython/commit/60398512c86c5535edd817c99ccb50453b3b0471
msg370230 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-28 15:41
I would also have preferred to use the operators <, >, <=, >=, and ==.  The docs in the patch explain why we can't go down this path.

Also, while counters have support for multiset operations, they continue to support other use cases as well (negative counts and fractional counts). That support can't be removed without breaking existing code that relies on it.

From the outset, a Counter was just a dictionary that returns 0 for missing keys.  Users are free to use that concept however they want.
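
A short illustration of those non-multiset uses, as a sketch with made-up keys: a missing key reads as 0 without being stored, subtract() can push a count below zero, and a fractional count is kept as-is.

```python
from collections import Counter

c = Counter(apples=3)

missing = c["pears"]    # reads as 0; "pears" is not added as a key
c.subtract(apples=5)    # counts may go negative: apples is now -2
c["flour"] += 0.5       # fractional counts are stored unchanged
```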
msg370409 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-31 01:51
After more thought, I've found a way to use the rich comparisons as requested.  Doing so consistently required that the __eq__ method treat missing elements as having a zero count.  See attached PR.
msg370410 - Author: Vedran Čačić (veky) * Date: 2020-05-31 02:24
I'm very glad for that. :-)

For the other part of my message, I never intended to remove the support for non-natural counts. I just wanted to add some more methods to the natural part of Counter's API. It already has some methods which assume natural counts: .elements(), for example.
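
As a small illustration of that point: elements() repeats each element as many times as its count and skips any element whose count is less than one, so zero and negative counts simply disappear from its output.

```python
from collections import Counter

c = Counter(a=2, b=0, c=-1)

# Only "a" has a count of at least one, so only it is emitted.
expanded = sorted(c.elements())
```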
msg370512 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-31 21:57
New changeset b7d79b4f36787874128c439d38397fe95c48429b by Raymond Hettinger in branch 'master':
bpo-40755: Add rich comparisons to Counter (GH-20548)
https://github.com/python/cpython/commit/b7d79b4f36787874128c439d38397fe95c48429b
History
Date User Action Args
2020-05-31 21:57:48 rhettinger set messages: + msg370512
2020-05-31 02:24:25 veky set messages: + msg370410
2020-05-31 01:53:34 rhettinger set pull_requests: + pull_request19792
2020-05-31 01:51:32 rhettinger set messages: + msg370409
2020-05-28 15:41:30 rhettinger set status: open -> closed; resolution: fixed; messages: + msg370230; stage: patch review -> resolved
2020-05-28 15:35:53 rhettinger set messages: + msg370228
2020-05-26 03:47:36 veky set nosy: + veky; messages: + msg369938
2020-05-24 15:11:35 rhettinger set keywords: + patch; stage: patch review; pull_requests: + pull_request19623
2020-05-24 15:11:00 rhettinger create