Issue 22515: Implement partial order on Counter

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/66705

classification

Title:	Implement partial order on Counter
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.5

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:	rhettinger	Nosy List:	Aaron.Meurer, Saimadhav.Heblikar, cool-RR, eric.smith, ethan.furman, josh.r, mark.dickinson, pitrou, r.david.murray, rhettinger, scoder, serhiy.storchaka, steven.daprano
Priority:	normal	Keywords:	patch

Created on 2014-09-29 19:33 by cool-RR, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
patch.patch	cool-RR, 2014-10-03 09:08		review
patch2.patch	cool-RR, 2014-10-03 20:47		review
issue22515.stoneleaf.patch.01	ethan.furman, 2014-10-03 23:53		review
issue22515_partial_order_counter_v2.diff	Saimadhav.Heblikar, 2014-10-04 06:57		review
issue22515_partial_order_counter_v3.diff	serhiy.storchaka, 2014-10-04 19:39		review

Messages (66)
msg227819 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 19:33
I suggest implementing `Counter.__lt__` which will be a partial order, similarly to `set.__lt__`. That is, one counter will be considered smaller-or-equal to another if for any item in the first counter, the second counter has an equal or bigger amount of that item.
msg227824 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2014-09-29 20:15
On Mon, Sep 29, 2014 at 07:33:21PM +0000, Ram Rachum wrote: > I suggest implementing `Counter.__lt__` which will be a partial order, > similarly to `set.__lt__`. Since Counter is described as a multiset, this sounds reasonable to me.
msg227827 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-09-29 20:24
What should be result of following operations? Counter({'a': 0}) < Counter({}) Counter({}) < Counter({'a': 0}) Counter({'a': -1}) < Counter({}) Counter({}) < Counter({'a': -1})
msg227828 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 20:26
I suggest they be ignored like in `elements`.
msg227829 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 20:30
(I mean, the non-positive values should be ignored.)
msg227831 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-09-29 20:35
There is other definition of the <= operator for sets: "A <= B" is equivalent to "len(A - B) == 0". Extending to Counter this can mean "len(A.subtract(B).elements()) == 0".
msg227842 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-09-29 21:15
What would be the result of Counter({'a':1, 'b':2}) < Counter({'a':2, 'b':1}) ?
msg227844 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 21:20
False, like with sets.
msg227845 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2014-09-29 21:21
Thanks, Ethan. I had a feeling that this wasn't well defined but I couldn't come up with an example :)
msg227846 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 21:24
David, there's nothing here that isn't well defined. It's simply a partial order, not a total order. We have the same for sets.
msg227847 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-09-29 22:06
set's don't have values, and you are wanting to implement the partial ordering based on the values. (side-note: how does partial-ordering work for sets?) > That is, one counter will be considered smaller-or-equal to another if for any > item in the first counter, the second counter has an equal or bigger amount of > that item. According to your definition, my example should have returned True, which is clearly nonsensical. Even if you changed the definition to: For every item in the first counter, that item's value is less than the corresponding item in the second counter. You have situations like: Counter({'a':1, 'b':1}) < Counter({'a':2}) I just don't think there is one interpretation that is going to be correct most of the time.
msg227848 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 22:30
Ethan, I don't understand what the problem is. I also don't understand your side note question "how does partial-ordering work for sets?" I'm not sure what you're asking. >> That is, one counter will be considered smaller-or-equal to another if for any >> item in the first counter, the second counter has an equal or bigger amount of >> that item. > According to your definition, my example should have returned True, which is clearly nonsensical. In your example the first counter had `b` twice, while the second counter had it only once. So the result is `False`. This comes pretty directly from my definition, I'm not sure where the confusion is. Regarding your new example: Since `Counter` works like a `defaultdict`, the second counter has a zero quantity of `b`. So the answer is again `False`.
msg227849 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 22:32
To put it another way: In Python sets, `a <= b` iff `b` has everything that `a` has, and possibly more. I'm proposing the exact same definition for counters. True, the implementation might be different because sets are not dicts and don't have `.values()`, but that is just the implementation, which is not the important thing here.
msg227850 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-09-29 22:36
That is certainly a better definition. If you carefully reread your original post you will see that you wrote something different ('any' instead of 'every'). Also, your OP uses '__lt__' but talks about 'less-than-or-equal -- somewhat confusing. ;)
msg227851 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 22:44
You're right, sorry. I meant the mathematical "for any" which means "for every": https://en.wikipedia.org/wiki/List_of_mathematical_symbols (See "for any;")
msg227852 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-09-29 22:54
The request sounds ok to me. As in the corner case of negative values, it seems that Counter currently doesn't elide zero values, e.g.: >>> Counter({'a': 0}) == Counter({}) False Therefore, we should have: >>> Counter({'a': -1}) < Counter({'a': 0}) True but: >>> Counter({'a': -1}) < Counter({}) False (I must admit, the current semantics of Counter look a bit ad hoc and not terribly consistent... halfway between a regular mapping and a multiset)
msg227857 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 23:13
Shall I write a patch?
msg227858 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-09-29 23:17
I think the final call is Raymond's (whether to accept the patch), but go ahead. Worst case scenario is you'll have your own thoroughly tested PartiallyOrderedCounter you can use in your own code. :)
msg227860 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-09-29 23:21
If that's the case I'd prefer Raymond to first say whether this feature is generally welcome before spending my time on making a patch.
msg228071 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2014-10-01 14:18
Is there some particular problem you're trying to solve, which this would make easier? Without a use case, I'm -1.
msg228072 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 14:23
I needed it for an internal calculation in a combinatorics package I'm writing. Do you want me to take the time to explain the calculation? It's quite complex.
msg228077 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2014-10-01 14:50
No need to explain it. It sounds like it's not generally useful, so I'm still -1.
msg228082 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-01 16:04
Even if you don't find it useful, Eric, it doesn't take up the method space. You can very easily ignore it, there's no cognitive burden.
msg228083 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 16:27
If it is so specialized as to only be needed in complex combinatorial calculations, does it belong in the general-purpose part of the language? After all, we have the math and cmath modules for more specialized arithmetic operations.
msg228084 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 16:29
Curiousity question: What happens if you try to sort a list of partially ordered Counters?
msg228085 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-01 16:30
> If it is so specialized as to only be needed in complex combinatorial calculations How do you know it is "only needed in complex combinatorial calculations"? > What happens if you try to sort a list of partially ordered Counters? Try it with partially ordered sets.
msg228086 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 16:42
I don't see why it's so hard to imagine how this will be used. Say I have a counter signifying how many of each product I have in my warehouse, and I have another counter saying how many of each product a client wants. I may use `counter2 <= counter1` to check whether there are enough products in stock. Sounds straightforward to me.
msg228087 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 16:57
--> s1 = set([1]) --> s2 = set([1, 2]) --> s3 = set([1, 2, 3]) --> s4 = set([2]) --> s5 = set([2, 3]) --> s6 = set([3]) --> l = [s1, s2, s3, s4, s5, s6] --> sorted(l) [{1}, {2}, {1, 2}, {3}, {2, 3}, {1, 2, 3}] --> s1 < s4 False --> s4 < s2 True --> s1 < s2 True --> s4 < s6 False --> l = [s1, s4] --> sorted(l) [{1}, {2}] --> sorted(l) [{1}, {2}] Looks horribly messy to me. In the last example we can see that neither s1 nor s4 are smaller, yet s1 is consistently put first. On the other hand, I suppose it's okay for Counter to have this behavior since it's already in sets.
msg228088 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2014-10-01 16:59
Ethan said: > If it is so specialized as to only be needed in complex combinatorial > calculations, does it belong in the general-purpose part of the > language? It's a multi-set, a general purpose and fairly fundamental data type. https://en.wikipedia.org/wiki/Set_%28abstract_data_type%29#Multiset And later: > Curiousity question: What happens if you try to sort a list of > partially ordered Counters? The same thing that happens when you sort a list of any partially ordered objects, such as sets: py> sorted([{1, 2, 3, 4}, {2, 4}, {1, 3}, {2, 3, 4}, {1, 2, 3}]) [{2, 4}, {1, 3}, {2, 3, 4}, {1, 2, 3}, {1, 2, 3, 4}] You get some order, but since sorting assumes a total order, not just partial order, the result isn't really meaningful, and will very likely depend on the initial order of the items. If that worries you, then don't sort items that implement only a partial order.
msg228089 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 17:10
I'll go with +0.5. :) If this goes in, I think a missing key in either Counter should default to 0 for purposes of the ordering.
msg228090 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 17:13
If/when there's general agreement that this functionality should be merged in (assuming the implementation is acceptable), let me know and I'll be happy to write the code and tests.
msg228094 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-01 17:18
> I'll go with +0.5. :) I was going to make a joke about Counters only accepting integral values, but the constructor is actually quite laxist: >>> Counter({'a': 2.5}) Counter({'a': 2.5}) >>> Counter({'a': 2.5 + 1j}) Counter({'a': (2.5+1j)}) >>> Counter({'a': 'b'}) Counter({'a': 'b'}) (!)
msg228096 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 17:21
That does seem odd -- how can you have 'b' of something?
msg228097 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 17:24
The real problem is that `Counter` was put into the standard library with such lenient conditions on how it should be used-- basically looking at it as a `dict` subclass rather than a counter, putting emphasis on implementation rather than purpose, which was a mistake that we'll now have to deal with forever.
msg228098 - (view)	Author: Stefan Behnel (scoder) *	Date: 2014-10-01 17:28
> That does seem odd -- how can you have 'b' of something? Don't ask this kind of question as long as any math guys are listening. But seriously, I consider this proposal reasonable. If the problem is that the counter values can be non-integers, then why not just raise an exception if someone tries to test the ordering with illegal (i.e. unordered or uncomparable) values? Data quality issues are best handled on user side.
msg228099 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 17:31
Stefan: If you'd want to raise an exception, you'll need to define in which cases. Should it raise an exception for decimal numbers? Complex numbers? etc. Personally I'd enjoy it raising exceptions in as many situations as possible (so we could keep Counter usage focused to actual counter usage and not just wild dicts) but I doubt it'll sit well with people who might use Counters with weird values.
msg228100 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 17:35
I have to disagree. The intent is clearly expressed in the docs [1]. However, if I have a need to deal with partial amounts (say, 2.5 apples because I gave half of one to my horse ;), Counter will still work with that: --> treats = Counter({'carrots':12, 'apples':3, 'sugar_cubes':100}) --> treats Counter({'sugar_cubes': 100, 'carrots': 12, 'apples': 3}) --> treats['apples'] -= 0.5 --> treats Counter({'sugar_cubes': 100, 'carrots': 12, 'apples': 2.5}) At least, it will until we fix that bug. ;)
msg228101 - (view)	Author: Stefan Behnel (scoder) *	Date: 2014-10-01 17:36
I'd raise an exception whenever the values of the Counter turn out to be uncomparable. If users manage to fill Counters with non-number values that are comparable amongst each other, or even if each key has a value of a different type, why not just support that? All that is really required for this is that you can compare a) keys for equality and b) values of identical keys for ordering, right?
msg228102 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-01 17:38
Ah, now I understand you Stefan. That sounds like a good scheme; let any comparison errors between the values simply propagate up.
msg228105 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-01 17:51
[1] https://docs.python.org/3/library/collections.html#collections.Counter
msg228236 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-02 16:15
I, myself, wrote: ----------------- > At least, it will until we fix that bug. ;) Happily, reading the whole documentation revealed that non-integer values are allowed, so it's not a bug. :)
msg228312 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-03 09:08
Attached patch of implementation with tests. I haven't actually run this (because I don't know how to run Python's test suite) so there are likely to be bugs. if there's general agreement about this direction I'll figure out how to run the test so I could ensure it works.
msg228374 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-03 20:47
Fixed patch attached. Still needs someone to run it and see whether there are any bugs.
msg228391 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-03 22:05
Testing the patch...
msg228398 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-03 22:24
In going through the tests I have found an edge-case that I am unsure of how to handle: Counter(a=1) < Counter(a=1, b=1) On the one hand, they both have the same value for 'a'; on the other hand, the second counter has more elements... So should the result be False or True?
msg228401 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-03 22:25
> Counter(a=1) < Counter(a=1, b=1) Do an analogy with sets and the answer will be obvious.
msg228402 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-03 22:29
I am not a mathematician -- feel free to include the answer with your hints. ;) If my analogy is correct, this situation is similar to {1} < {1, 2} Which is True. Can someone verify that I am correct?
msg228403 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-03 22:30
Yes, correct.
msg228411 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-03 23:53
New patch attached.
msg228429 - (view)	Author: Saimadhav Heblikar (Saimadhav.Heblikar) *	Date: 2014-10-04 06:57
I mentioned my wording for the comment about lesser than partial ordering using the code review tool. In the attached patch(based on issue22515.stoneleaf.patch.01), I have added test for the case when the other object is not a mapping. I have also tried to refactor the __le__ and __ge__ methods. Please comment on the changes. Regards
msg228433 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-04 07:38
I don't really understand the refactoring. For example why is `len(self) <= len(other)` needed? For which case? Note that `len` would also catch keys with `0` value which shouldn't have an effect.
msg228436 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-10-04 07:57
There are some properties of set comparison: (a < b) == (a <= b and a != b) (a <= b) == (a < b or a == b) (a <= b and b <= a) == (a == b) (a < b and b < a) == False (a <= b) == (a - b == set()) if (a <= b and b <= c) then (a <= c) (a <= b and a <= c) == (a <= (b & c)) (a <= c and b <= c) == ((a \| b) <= c) Is this true for Counter?
msg228440 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-04 08:38
> There are some properties of set comparison: > > (a < b) == (a <= b and a != b) I don't think so, because counter equality is based on dict equality so it doesn't ignore zero values. > (a <= b) == (a < b or a == b) No, ditto. > (a <= b and b <= a) == (a == b) No, ditto. > (a < b and b < a) == False Yes. > (a <= b) == (a - b == set()) Yes. > if (a <= b and b <= c) then (a <= c) Yes. > (a <= b and a <= c) == (a <= (b & c)) Yes. > (a <= c and b <= c) == ((a \| b) <= c) Yes.
msg228441 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2014-10-04 09:58
On Sat, Oct 04, 2014 at 07:57:16AM +0000, Serhiy Storchaka wrote: > There are some properties of set comparison: > > (a < b) == (a <= b and a != b) > (a <= b) == (a < b or a == b) > (a <= b and b <= a) == (a == b) > (a < b and b < a) == False > (a <= b) == (a - b == set()) > if (a <= b and b <= c) then (a <= c) > (a <= b and a <= c) == (a <= (b & c)) > (a <= c and b <= c) == ((a \| b) <= c) > > Is this true for Counter? I believe these are all true for multisets. But Counter is not just a multiset (a.k.a. bag) since the "counts" are not limited to non-negative integers (the natural numbers). py> C = Counter() py> C['a'] = -3 py> C['b'] = 'spam' py> C['c'] = 0.5 py> C Counter({'c': 0.5, 'b': 'spam', 'a': -3}) I don't know of any standard definition for subset and superset for multisets like C above. I think that we should raise an exception in cases like that: TypeError for cases where any "count" is non-numeric, and ValueError for cases where any count is not a non-negative integer. (Later, if somebody comes up with a sensible meaning for sub/superset of such a Counter, we can relax that restriction.) Another issue: are missing elements treated as if they have count 0? E.g. what are these? Counter({'a': 0, 'b': 1}) <= Counter({'a': 1, 'b': 1}) Counter({'a': 0, 'b': 1}) <= Counter({'b': 1}) According to these pages: http://singularcontiguity.wordpress.com/2008/05/07/multiset-equality-and-submultisets/ http://singularcontiguity.wordpress.com/2008/04/28/multiset-membership/ I would argue that they should both return True (based on the idea that the "support" of a multiset is the set of those elements with non-zero count). But I don't know how authoritative that blog is. (The author even mentions that he only has a passing familiarity with some aspects of multisets.) Anyone familiar with Haskell? What does this bag class do? http://hackage.haskell.org/package/uulib-0.9.8/docs/UU-DData-IntBag.html This may also be helpful. Sets and bags in Z language: http://www.cs.ox.ac.uk/people/michael.wooldridge/teaching/soft-eng/lect08.pdf http://www.cs.ox.ac.uk/people/michael.wooldridge/teaching/soft-eng/lect14.pdf
msg228442 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-10-04 10:08
At least this condition is false for current implementation: (a < b and b < a) == False Example: a = Counter({'a': -1}), b = Counter({'b': -1}).
msg228443 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2014-10-04 10:13
On Sat, Oct 04, 2014 at 08:38:17AM +0000, Ram Rachum wrote: > > Ram Rachum added the comment: > > > There are some properties of set comparison: > > > > (a < b) == (a <= b and a != b) > > I don't think so, because counter equality is based on dict equality so it doesn't ignore zero values. That's a very good point! That suggests that we shouldn't treat Counters as multisets (a.k.a. bags) and define subset/superset operators. They're almost multisets, but not quite the same, since: Counter({'a': 0}) != Counter({}) but the formal definition of multiset suggests those should be equal: https://en.wikipedia.org/wiki/Multiset#Formal_definition Perhaps we should rethink this. I'm starting to wonder whether we need a PEP to define exactly how subset and superset should be defined for Counters. This is precisely the trouble with rushing to add code before we know what the code is supposed to do... :-)
msg228444 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-04 10:43
Just because something is discussed doesn't mean it needs a PEP. Also, see http://bugs.python.org/issue22515#msg227852
msg228445 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-04 10:43
I see that in my implementation for __lt__ and __gt__ I forgot to check whether the other counter has elements that this counter doesn't, which could flip `found_strict_difference`.
msg228461 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2014-10-04 15:24
On Sat, Oct 04, 2014 at 10:43:02AM +0000, Antoine Pitrou wrote: > Just because something is discussed doesn't mean it needs a PEP. I know that in general discussions do not need a PEP, but we seem to be rushing awfully fast into writing code before we even know what the code is supposed to do. Perhaps "need a PEP" was too strong, but we surely need to do something to decide on constistent and well-defined behaviour. As you pointed out earlier, Counters aren't precisely multisets, they're more like a hybrid mapping/multiset. It's not clear what behaviour subset and superset should have for such a hybrid. By rushing into code before deciding on what the code is supposed to do, we end up with inconsistent behaviour. Here are a pair of Counters which compare "is subset", but not "is subset or equal" using Ram's patch: py> Counter(a=1, b=0) < Counter(a=2) True py> Counter(a=1, b=0) <= Counter(a=2) False On the other hand, here's a pair which compares "is subset or equal", but not ("is subset" or "equal"): py> Counter(a=0) <= Counter(b=0) True py> (Counter(a=0) < Counter(b=0)) or (Counter(a=0) == Counter(b=0)) False Adding elements with count=0 changes whether a set is a subset or not: py> Counter(a=0) < Counter(b=0) False py> Counter(a=0) < Counter(b=0, c=0) True py> Counter(a=0, b=0) < Counter(b=0, c=0) False I'm not convinced that mixed Counter and regular dict comparisons should be supported, but Ram's patch supports any Mapping type. Is that a good idea? There's been no discussion on that question. Can somebody explain to me what this means? How is this meaningfully a subset operation? py> Counter(a="eggs") < Counter(a="spam") True I supported adding these operations to Counter, but the last thing I want is to enshrine a set of ad-hoc and inconsistent semantics that don't match standard multiset behaviour and are defined by the implementation, instead of having the implementation defined by the desired semantics.
msg228462 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-04 15:39
Le 04/10/2014 17:24, Steven D'Aprano a écrit : > > I know that in general discussions do not need a PEP, but we seem to be > rushing awfully fast into writing code before we even know what the code > is supposed to do. Writing a patch is useful practice because it's a way to discover latent problems (especially when writing unit tests). > py> Counter(a=1, b=0) < Counter(a=2) > True > py> Counter(a=1, b=0) <= Counter(a=2) > False That looks wrong to me. > py> Counter(a=0) <= Counter(b=0) > True > py> (Counter(a=0) < Counter(b=0)) or (Counter(a=0) == Counter(b=0)) > False Ditto. > py> Counter(a=0) < Counter(b=0) > False > py> Counter(a=0) < Counter(b=0, c=0) > True Yuck. > I'm not convinced that mixed Counter and regular dict comparisons should > be supported, but Ram's patch supports any Mapping type. Is that a good > idea? There's been no discussion on that question. That sounds wrong to me as well. > Can somebody explain to me what this means? How is this meaningfully a > subset operation? > > py> Counter(a="eggs") < Counter(a="spam") > True I agree this looks silly. However, the Counter doc says: """The multiset methods are designed only for use cases with positive values. [...] There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.""" Which really sounds like an excuse to not perform any type checking and silently allow nonsensical multiset behaviour with non-integer keys. We can follow that lead :-)
msg228463 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-10-04 15:54
Hi everybody, I agree we have a mess on our hands, with lots of cases where comparison is weird. To be honest I've given up on `collections.Counter` at this point. The fact that it's a "hybrid mapping/multiset" is the cause of all the problems, so I just made my own class that is defined as a multiset with only positive integer values, and I'm going to use that and define methods on it without having to worry about these weird cases. Unless someone else wants to champion this issue and resolve the edge cases, it can be closed as far as I'm concerned. I really wish that `collections.Counter` wouldn't have been added to the standard library as a "hybrid mapping/multiset"; I think that it shows emphasis on the implementation rather than the usage, while the usage is more important. But there's nothing I can do about that. (If there was a place where we can make notes for Python 4, I'd put a note there about it.)
msg228477 - (view)	Author: Ethan Furman (ethan.furman) *	Date: 2014-10-04 18:43
We are not "rushing into code", we are getting some code, and tests, written to help define what we are talking about. So far the feedback has given some more tests to add, and some things to change (such as only comparing with other Counters). The big question I have is how do we handle <= and >= ? Normal Counter == treats a key with a zero value as being unequeal to a Counter without that same key, but I don't think that makes sense for partial ordering.
msg228487 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-10-04 19:39
This implementation obeys more set properties.
msg229065 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2014-10-11 07:37
> I suggest implementing `Counter.__lt__` which will be a > partial order, similarly to `set.__lt__`. This would be reasonable if the counter were a strict stand-alone multiset or bag; however, the core design is much looser than that (supporting negative counts for example). It would be more accurate to think about the Counter as a dict that returns 0 for missing keys and has few extra methods that can be handy in the content of counting things. Accordingly, a partial ordering operator wouldn't be well-defined for some of the intended use cases. The addition of subset/superset operations was considered and rejected during the original design phase (the omission was intentional). It is better to leave this to the user to decide what they want to do and how they intend to do so. After all, it is as easily to work with as a regular dict. You can access it directly or subclass it to fit your own needs. The starting point for this class (when first proposed and endorsed by Guido) was: class Counter(dict): def __missing__(self, key): return 0 That core idea is dirt simple and it is not hard to build your own variants if the one in collections doesn't meet your needs.
msg231365 - (view)	Author: Ram Rachum (cool-RR) *	Date: 2014-11-19 10:27
Hi everyone, Just wanted to follow up here that I've created my own dedicated bag class. Documentation: https://combi.readthedocs.org/en/stable/bags.html Install using `pip install combi` It's packed with features. There are a ton of arithmetic operations defined and the comparisons methods we talked about in this issue. There are also frozen and ordered variants.
msg253251 - (view)	Author: Aaron Meurer (Aaron.Meurer)	Date: 2015-10-20 20:19
I can't believe this issue was closed. Why can't Counter.__lt__(self, other) just be all(self[i] < other[i] for i in self)? Just because Counter supports weird stuff shouldn't bill it out. To follow that logic, we should also remove Counter.subtract >>> Counter(a='b') - Counter(a='c') Traceback (most recent call last): File "<ipython-input-5-31b2df7f8ff1>", line 1, in <module> Counter(a='b') - Counter(a='c') File "/Users/aaronmeurer/anaconda/lib/python3.5/collections/__init__.py", line 709, in __sub__ newcount = count - other[elem] TypeError: unsupported operand type(s) for -: 'str' and 'str' It's super annoying that Counter supports all the basic set operations except subset. A reminder that this does work: >>> Counter(a=2) \| Counter(a=1) Counter({'a': 2}) And also fails on weird values: >>> Counter(a='b') \| Counter(a='c') Traceback (most recent call last): File "<ipython-input-8-f9768ad92117>", line 1, in <module> Counter(a='b') \| Counter(a='c') File "/Users/aaronmeurer/anaconda/lib/python3.5/collections/__init__.py", line 730, in __or__ if newcount > 0: TypeError: unorderable types: str() > int()

History
Date	User	Action	Args
2022-04-11 14:58:08	admin	set	github: 66705
2016-09-28 09:18:28	serhiy.storchaka	link	issue28296 superseder
2015-10-20 20:19:07	Aaron.Meurer	set	nosy: + Aaron.Meurer messages: + msg253251
2014-11-19 10:27:17	cool-RR	set	messages: + msg231365
2014-10-11 07:37:32	rhettinger	set	status: open -> closed resolution: rejected messages: + msg229065
2014-10-11 06:50:31	rhettinger	set	assignee: rhettinger
2014-10-04 19:39:37	serhiy.storchaka	set	files: + issue22515_partial_order_counter_v3.diff messages: + msg228487
2014-10-04 18:43:58	ethan.furman	set	messages: + msg228477
2014-10-04 15:54:18	cool-RR	set	messages: + msg228463
2014-10-04 15:39:04	pitrou	set	messages: + msg228462
2014-10-04 15:24:23	steven.daprano	set	messages: + msg228461
2014-10-04 10:43:39	cool-RR	set	messages: + msg228445
2014-10-04 10:43:02	pitrou	set	messages: + msg228444
2014-10-04 10:13:25	steven.daprano	set	messages: + msg228443
2014-10-04 10:08:33	serhiy.storchaka	set	messages: + msg228442
2014-10-04 09:58:52	steven.daprano	set	messages: + msg228441
2014-10-04 09:00:35	mark.dickinson	set	nosy: + mark.dickinson
2014-10-04 08:38:17	cool-RR	set	messages: + msg228440
2014-10-04 07:57:16	serhiy.storchaka	set	messages: + msg228436
2014-10-04 07:38:04	cool-RR	set	messages: + msg228433
2014-10-04 06:57:55	Saimadhav.Heblikar	set	files: + issue22515_partial_order_counter_v2.diff nosy: + Saimadhav.Heblikar messages: + msg228429
2014-10-03 23:53:26	ethan.furman	set	files: + issue22515.stoneleaf.patch.01 messages: + msg228411
2014-10-03 22:30:32	cool-RR	set	messages: + msg228403
2014-10-03 22:29:52	ethan.furman	set	stage: test needed -> patch review
2014-10-03 22:29:39	ethan.furman	set	messages: + msg228402
2014-10-03 22:25:19	pitrou	set	messages: + msg228401
2014-10-03 22:24:14	ethan.furman	set	messages: + msg228398
2014-10-03 22:05:45	ethan.furman	set	messages: + msg228391
2014-10-03 20:47:56	cool-RR	set	files: + patch2.patch messages: + msg228374
2014-10-03 09:08:19	cool-RR	set	files: + patch.patch keywords: + patch messages: + msg228312
2014-10-02 16:15:45	ethan.furman	set	messages: + msg228236
2014-10-01 17:51:33	ethan.furman	set	messages: + msg228105
2014-10-01 17:38:50	cool-RR	set	messages: + msg228102
2014-10-01 17:36:35	scoder	set	messages: + msg228101
2014-10-01 17:35:49	ethan.furman	set	messages: + msg228100
2014-10-01 17:31:52	cool-RR	set	messages: + msg228099
2014-10-01 17:28:42	scoder	set	nosy: + scoder messages: + msg228098
2014-10-01 17:24:13	cool-RR	set	messages: + msg228097
2014-10-01 17:21:58	ethan.furman	set	messages: + msg228096
2014-10-01 17:18:48	pitrou	set	messages: + msg228094
2014-10-01 17:13:36	cool-RR	set	messages: + msg228090
2014-10-01 17:10:08	ethan.furman	set	messages: + msg228089 stage: test needed
2014-10-01 16:59:27	steven.daprano	set	messages: + msg228088
2014-10-01 16:57:58	ethan.furman	set	messages: + msg228087
2014-10-01 16:42:06	cool-RR	set	messages: + msg228086
2014-10-01 16:30:14	pitrou	set	messages: + msg228085
2014-10-01 16:29:04	ethan.furman	set	messages: + msg228084
2014-10-01 16:27:33	ethan.furman	set	messages: + msg228083
2014-10-01 16:04:59	pitrou	set	messages: + msg228082
2014-10-01 14:50:21	eric.smith	set	messages: + msg228077
2014-10-01 14:23:47	cool-RR	set	messages: + msg228072
2014-10-01 14:18:10	eric.smith	set	nosy: + eric.smith messages: + msg228071
2014-09-29 23:21:30	cool-RR	set	messages: + msg227860
2014-09-29 23:17:52	ethan.furman	set	nosy: + rhettinger type: enhancement messages: + msg227858
2014-09-29 23:13:24	cool-RR	set	messages: + msg227857
2014-09-29 22:54:15	pitrou	set	nosy: + pitrou messages: + msg227852
2014-09-29 22:44:47	cool-RR	set	messages: + msg227851
2014-09-29 22:36:39	ethan.furman	set	messages: + msg227850
2014-09-29 22:32:58	cool-RR	set	messages: + msg227849
2014-09-29 22:30:54	cool-RR	set	messages: + msg227848
2014-09-29 22:16:19	josh.r	set	nosy: + josh.r
2014-09-29 22:06:02	ethan.furman	set	messages: + msg227847
2014-09-29 21:24:53	cool-RR	set	messages: + msg227846
2014-09-29 21:21:14	r.david.murray	set	nosy: + r.david.murray messages: + msg227845
2014-09-29 21:20:59	cool-RR	set	messages: + msg227844
2014-09-29 21:15:34	ethan.furman	set	nosy: + ethan.furman messages: + msg227842
2014-09-29 20:35:26	serhiy.storchaka	set	messages: + msg227831
2014-09-29 20:30:54	cool-RR	set	messages: + msg227829
2014-09-29 20:26:45	cool-RR	set	messages: + msg227828
2014-09-29 20:24:23	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg227827
2014-09-29 20:15:30	steven.daprano	set	nosy: + steven.daprano messages: + msg227824
2014-09-29 19:33:21	cool-RR	create