This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: collections.Counter methods return Counter objects instead of subclass objects
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: mark.dickinson, r.david.murray, rhettinger, rmalouf
Priority: normal Keywords:

Created on 2015-11-02 16:35 by rmalouf, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)
msg253930 - (view) Author: Rob Malouf (rmalouf) * Date: 2015-11-02 16:35
Several collections.Counter methods return Counter objects, which is leads to wrong or at least confusing behavior when Counter is subclassed.  For example, nltk.FreqDist is a subclass of Counter:

>>> x = nltk.FreqDist(['a','a','b','b','b'])
>>> y = nltk.FreqDist(['b','b','b','b','b'])
>>> z = x + y
>>> z.__class__
<class 'collections.Counter'>

This applies to __add__(), __sub__(), __or__(), __and__(), __pos__(), and __neg__().  

In contrast, the copy() method does (what I think is) the right thing:

>>> x.copy().__class__
<class 'nltk.probability.FreqDist'>
msg253936 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-02 17:47
This indeed looks like a bug to me, but I won't be surprised if Raymond says it was done on purpose.  On the other hand originally the binary operator mothods didn't correctly return NotImplemented for non-Counters, so perhaps it is just an oversight. 

I'm not sure what the backward compatibility consequences would be of changing it.
msg253939 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2015-11-02 18:39
I'm fairly sure that this behaviour is deliberate, and (more-or-less) consistently applied throughout the builtin and standard library types. If you create a subclass C of list, then concatenation of two instances of C still gives a result of type list. Similarly, if you subclass Decimal then arithmetic operations on instances of your subclass still return a Decimal instance. If you want operations to return instances of the subclass, it's up to you to override those operations in your subclass.

Having the Counter operations automatically return an instance of the subclass would mean placing restrictions on the signature of the constructor of any such subclass, which may be undesirable. (A subclass may want to require additional arguments in its constructor.)
msg253942 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-02 19:10
OK, good point.  Perhaps we should turn this into a feature request to replace the calls to Counter with a call to an overridable method to make subclassing in this way possible without massive cut and paste.
msg253944 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2015-11-02 19:13
Relevant post from Tim Peters back in 2005, on the subject of subclassing datetime and timedelta:

"""
Yes, and all builtin Python types work that way.  For example,
int.__add__ or float.__add__ applied to a subclass of int or float
will return an int or float; similarly for a subclass of str.  This
was Guido's decision, based on that an implementation of any method in
a base class has no idea what requirements may exist for invoking a
subclass's constructor.  For example, a subclass may restrict the
values of constructor arguments, or require more arguments than a base
class constructor; it may permute the order of positional arguments in
the base class constructor; it may even be "a feature" that a subclass
constructor gives a different meaning to an argument it shares with
the base class constructor.  Since there isn't a way to guess, Python
does a safe thing instead.
"""

Source: https://mail.python.org/pipermail/python-list/2005-January/311610.html
msg253953 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-11-03 00:04
The decision was deliberate and matches what we've done elsewhere in Python.   The nearest kin to Counters are sets/frozensets which also behave the same way:

    >>> class S(set):
            pass
    >>> type(S('abc') | S('cdef'))
    <class 'set'>

Mark Dickinson articulated the reasons clearly.   Python has been consistent about this decision from the outset (see array.array or fractions.Fraction for example).

Marking this as "not a bug" and closing.
History
Date User Action Args
2022-04-11 14:58:23adminsetgithub: 69721
2015-11-03 00:04:31rhettingersetstatus: open -> closed
resolution: not a bug
messages: + msg253953
2015-11-02 23:48:25rhettingersetassignee: rhettinger
2015-11-02 19:13:51mark.dickinsonsetmessages: + msg253944
2015-11-02 19:10:21r.david.murraysetmessages: + msg253942
2015-11-02 18:39:56mark.dickinsonsetnosy: + mark.dickinson
messages: + msg253939
2015-11-02 17:47:36r.david.murraysetnosy: + rhettinger, r.david.murray

messages: + msg253936
title: collections.Counter methods return Counter objects -> collections.Counter methods return Counter objects instead of subclass objects
2015-11-02 16:35:19rmaloufcreate