classification
Title: Non-integer values in collections.Counter
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: HuangFuSL, rhettinger
Priority: normal
Keywords:

Created on 2021-01-29 14:50 by HuangFuSL, last changed 2021-01-29 19:13 by rhettinger. This issue is now closed.

Messages (2)
msg385913 - Author: HuangFuSL (HuangFuSL) Date: 2021-01-29 14:50
When I create a `collections.Counter` from a mapping object such as a dictionary, Python does not seem to check the validity of the values in the mapping.

I've verified that the following Python script runs without error on Python 3.9.0 on Windows.

```python
>>> from collections import Counter
>>> a = Counter({'0': '0'})
>>> a.elements()
<itertools.chain object at 0x00000252DDB5A4F0>
```

`a.elements()` returns an iterator. Iterating through it normally yields the elements counted by the `Counter`, but with a `str` value inside, iterating raises a `TypeError`.

```python
>>> for i in a.elements():
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer
```
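
The delayed failure is consistent with `elements()` being lazy: in CPython it is built from `itertools.repeat(element, count)` chained over the items, so an invalid count is only touched once iteration begins. A minimal sketch that reproduces the same behavior with plain `itertools` (the helper name `lazy_elements` is hypothetical and not part of the library):

```python
from collections import Counter
from itertools import chain, repeat, starmap

def lazy_elements(mapping):
    # Hypothetical helper mirroring the lazy structure of Counter.elements():
    # repeat(key, count) is only called when the chain is consumed.
    return chain.from_iterable(starmap(repeat, mapping.items()))

good = Counter({'a': 2, 'b': 1})
print(list(lazy_elements(good)))   # same elements as list(good.elements())

bad = Counter({'0': '0'})
it = lazy_elements(bad)            # no error yet: nothing has been evaluated

error = None
try:
    list(it)                       # repeat('0', '0') runs here and rejects the str count
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

Creating the iterator succeeds in both cases; only consuming it exposes the non-integer count.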

Meanwhile, if the counter contains values that cannot be compared with each other, such as `False` and `''`, the `most_common()` method fails.

```python
>>> b = Counter({'0': False, '1': ''})
>>> b.most_common()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python39\lib\collections\__init__.py", line 610, in most_common
    return sorted(self.items(), key=_itemgetter(1), reverse=True)
TypeError: '<' not supported between instances of 'bool' and 'str'
```
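
As the traceback shows, `most_common()` sorts the items by value, so what matters is whether the values can be ordered against each other, not whether they are integers. A short sketch contrasting the two cases (the variable names are illustrative only):

```python
from collections import Counter

# Non-integer but mutually comparable values sort without complaint.
weights = Counter({'a': 1.5, 'b': 0.25, 'c': 3})
ranked = weights.most_common()
print(ranked)  # [('c', 3), ('a', 1.5), ('b', 0.25)]

# Values of unrelated types cannot be ordered, so the sort raises.
mixed = Counter({'0': False, '1': ''})
try:
    mixed.most_common()
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```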

The `sys.version` variable of my Python interpreter is as follows:

```
>>> import sys
>>> sys.version
'3.9.0 (tags/v3.9.0:9cf6752, Oct  5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)]'
>>>
```

I'm not sure whether this behavior is intentional, but I think it may lead to confusion.
msg385927 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-29 19:13
Thanks for the report, but there isn't anything that can be done about this. The core design concept for Counter was to be a regular dictionary augmented by a few capabilities that made it suitable for counting. It isn't closed-off in any way. On the plus side, that makes it versatile and makes it fit well with the other parts of the Python ecosystem that work with dictionaries. On the minus side, there is nothing to keep out data that doesn't make sense in the context of counting.
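
Both sides of that trade-off can be seen in a few lines. This is only an illustrative sketch: `json` is used here as one example of dictionary-aware code and is not mentioned in the report itself.

```python
import json
from collections import Counter

c = Counter('abracadabra')
print(isinstance(c, dict))   # True: Counter is a dict subclass
print(json.dumps(c))         # serializes like any other dict

# The flip side: item assignment is ordinary dict assignment,
# so nothing rejects values that make no sense as counts.
c['a'] = -2.5
c['z'] = None
print(c['a'], c['z'])
```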

FWIW, there is a multiset package on PyPI that offers a closed-off implementation that has an internal dictionary that can only be accessed by methods that prevent negative counts or non-integer values.
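
To make the "closed-off" idea concrete, here is a rough sketch of such a design. It does not reproduce the multiset package's API; the class `StrictCounts` and its methods are hypothetical names invented for this illustration.

```python
class StrictCounts:
    """Hypothetical closed-off counter: every stored count is validated."""

    def __init__(self, mapping=None):
        self._counts = {}  # internal dict, reachable only through the methods below
        for key, count in (mapping or {}).items():
            self.set(key, count)

    def set(self, key, count):
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(count, bool) or not isinstance(count, int):
            raise TypeError(
                f"count for {key!r} must be an int, got {type(count).__name__}"
            )
        if count < 0:
            raise ValueError(f"count for {key!r} must be non-negative")
        self._counts[key] = count

    def add(self, key, n=1):
        # Route increments through set() so the result is validated too.
        self.set(key, self._counts.get(key, 0) + n)

    def most_common(self):
        # Safe to sort: all values are known to be non-negative ints.
        return sorted(self._counts.items(), key=lambda kv: kv[1], reverse=True)


counts = StrictCounts({'a': 2})
counts.add('b')
counts.add('a')
print(counts.most_common())  # [('a', 3), ('b', 1)]

try:
    StrictCounts({'0': '0'})
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

The cost of this design is the one described above: the object is no longer a dictionary, so dictionary-aware code cannot use it directly.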
History
| Date | User | Action | Args |
|---|---|---|---|
| 2021-01-29 19:13:19 | rhettinger | set | status: open -> closed; resolution: not a bug; messages: + msg385927; stage: resolved |
| 2021-01-29 16:17:20 | mark.dickinson | set | nosy: + rhettinger |
| 2021-01-29 14:55:23 | HuangFuSL | set | components: + Library (Lib), - Extension Modules |
| 2021-01-29 14:50:25 | HuangFuSL | create | |