classification
Title: Conflation of Counter with Multiset
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: rhettinger, wpk-
Priority: normal Keywords:

Created on 2020-05-25 09:28 by wpk-, last changed 2020-05-25 17:23 by rhettinger. This issue is now closed.

Messages (2)
msg369867 - Author: Paul (wpk-) Date: 2020-05-25 09:28
The collections docs state: "Several mathematical operations are provided for combining Counter objects to produce multisets (counters that have counts greater than zero)."

I am surprised at how deliberately counters have been conflated with multisets. Why break all functionality for negative counts in favour of multisets? Why not create a Multiset object for multisets?

One example use of negative counts is in factorisation (the author of https://bugs.python.org/msg368298 will be surprised that counters don't count):
18   = 2**1 * 3**2  --> x18 = Counter({2: 1, 3: 2})
 4   = 2**2         --> x4 = Counter({2: 2})

To compute 18/4 in this representation (which I believe is precisely a count), one would expect

18/4 = 2**-1 * 3**2 --> x4_5 = x18 - x4 = Counter({2: -1, 3: 2})

But instead,

x18 - x4 = Counter({3: 2}) = 9 ???

This is just an example. The use case for negative counts is plain and obvious. The question is: why does collections break counter behaviour in favour of conflation with multisets? Why not have two objects: Counter for counters and Multiset for multisets?
msg369897 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-25 17:23
For the most part, Counter() works fine with negative counts.  The update() and subtract() methods were specifically designed to work with negative values.  Nothing prevents use cases with negative counts.
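A minimal sketch of the difference, reusing the factorisation example from the first message. The variable names diff_op and quotient are illustrative only. The binary "-" operator discards results that are zero or negative, while subtract() updates in place and keeps them:

```python
from collections import Counter

x18 = Counter({2: 1, 3: 2})   # 18 = 2**1 * 3**2
x4 = Counter({2: 2})          # 4  = 2**2

# The binary "-" operator keeps only positive counts (multiset behaviour).
diff_op = x18 - x4

# subtract() modifies the counter in place and keeps negative counts.
quotient = Counter(x18)       # copy, so x18 itself is left untouched
quotient.subtract(x4)

print(diff_op)                # only the exponent of 3 survives
print(quotient[2], quotient[3])
```

Here quotient holds the exponents -1 for 2 and 2 for 3, which is the representation of 18/4 that the original report expected.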

In addition, there are some methods like elements() that only make sense with positive counts.  So if your use case has negative counts, then these methods wouldn't be applicable.
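As a small illustration (the counter below is a made-up example), elements() repeats each key as many times as its count and skips any key whose count is less than one, so a negative entry simply does not appear:

```python
from collections import Counter

# Exponents for 18/4 = 2**-1 * 3**2, stored as counts.
c = Counter({2: -1, 3: 2})

# elements() ignores keys with counts below one.
factors = sorted(c.elements())
print(factors)
```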

Should the Counter() have been two different classes?  Maybe yes, maybe no.  But that ship sailed a long time ago.  For now, it is what it is and wouldn't be easy to change without breaking a lot of code.

From the outset, the central concept of Counter() is that it is a dictionary that returns zero when a value is missing.  Pretty much everything else is a set of convenience methods supporting all the different ways people aspire to use it (multisets, bags, counters, sparse arrays, etc).  People needing multiset methods use the multiset methods.  People wanting negative counts use the other methods.

Am marking this as closed.  There isn't much that can be changed here.  Also, theoretical objections aside, what we have now seems to be working well enough for most people most of the time.
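A short sketch of that central concept, using an arbitrary example string. Looking up a missing key returns zero without raising KeyError and without inserting the key, and ordinary item assignment can take a count below zero:

```python
from collections import Counter

c = Counter("abca")           # counts: a=2, b=1, c=1

missing = c["z"]              # missing key reads as zero, no KeyError
still_absent = "z" not in c   # the lookup did not insert the key

c["z"] -= 1                   # plain dict-style update; count is now negative
print(missing, still_absent, c["z"])
```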
History
Date                 User        Action  Args
2020-05-25 17:23:39  rhettinger  set     status: open -> closed; resolution: not a bug; messages: + msg369897; stage: resolved
2020-05-25 12:21:00  corona10    set     nosy: + rhettinger
2020-05-25 09:28:30  wpk-        create