classification
Title: Repr of collection's subclasses
Type: enhancement Stage: resolved
Components: Extension Modules, Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: 31497 Superseder:
Assigned To: rhettinger Nosy List: haypo, r.david.murray, rhettinger, serhiy.storchaka, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-07-17 08:54 by serhiy.storchaka, last changed 2017-09-21 12:44 by haypo. This issue is now closed.

Pull Requests
URL Status Linked
PR 3631 merged serhiy.storchaka, 2017-09-17 17:06
Messages (12)
msg270621 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-17 08:54
The repr of subclasses of some collection classes contains the name of the subclass:

>>> class S(set): pass
... 
>>> S([1, 2, 3])
S({1, 2, 3})
>>> import collections
>>> class OD(collections.OrderedDict): pass
... 
>>> OD({1: 2})
OD([(1, 2)])
>>> class C(collections.Counter): pass
... 
>>> C('senselessness')
C({'s': 6, 'e': 4, 'n': 2, 'l': 1})

But the repr of subclasses of other collection classes contains the name of the base class:

>>> class BA(bytearray): pass
... 
>>> BA([1, 2, 3])
bytearray(b'\x01\x02\x03')
>>> class D(collections.deque): pass
... 
>>> D([1, 2, 3])
deque([1, 2, 3])
>>> class DD(collections.defaultdict): pass
... 
>>> DD(int, {1: 2})
defaultdict(<class 'int'>, {1: 2})

Shouldn't the name of the subclass always be used?
msg270623 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-17 09:30
How about other built-in classes? If repr does matter, maybe str, int, dict should also respect this rule?
msg270625 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-17 10:00
This can break third-party code. For example, code that explicitly builds a repr containing the subclass name:

class MyStr(str):
    def __repr__(self):
        return 'MyStr(%s)' % str.__repr__(self)

I think the chance of breaking third-party code for bytearray or deque is smaller, since their repr is not a literal.
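To illustrate the breakage concern, here is a minimal sketch. `hypothetical_str_repr` and `MyStrAfterChange` are made-up names that simulate what might happen if `str.__repr__` started including the subclass name; they do not reflect actual or proposed interpreter behaviour.

```python
class MyStr(str):
    # Third-party pattern from the message above: wrap the base repr by hand.
    def __repr__(self):
        return 'MyStr(%s)' % str.__repr__(self)


def hypothetical_str_repr(obj):
    # Assumption for illustration only: a base repr that already
    # includes the runtime class name.
    return '%s(%s)' % (type(obj).__name__, str.__repr__(obj))


class MyStrAfterChange(str):
    # Same third-party pattern, but built on the hypothetical base repr.
    def __repr__(self):
        return 'MyStrAfterChange(%s)' % hypothetical_str_repr(self)


current = repr(MyStr('abc'))
doubled = repr(MyStrAfterChange('abc'))
print(current)   # the class name appears once
print(doubled)   # the class name appears twice
```

With the hypothetical base repr, the hand-written wrapper would add the class name a second time, which is the kind of breakage being described.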
msg270627 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-17 10:12
Yes. So if we are not going to change other built-in types, maybe we'd better not change bytearray either. My opinion is that we shouldn't change built-in classes, even bytearray. If users would like a more reasonable repr, they can provide a custom __repr__ as in your example. But making such changes to classes in collections sounds like a good idea to me.
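As a rough sketch of the "custom __repr__" workaround for a collections class, a deque subclass could build its repr from the runtime class name. The classes `D` and `D2` are only illustrative, and this simplified version ignores `maxlen` and self-referential deques.

```python
import collections


class D(collections.deque):
    # Build the repr from the runtime class name rather than relying
    # on the repr inherited from the base class.
    def __repr__(self):
        return '%s(%r)' % (type(self).__name__, list(self))


class D2(D):
    # Further subclasses pick up their own name automatically.
    pass


print(repr(D([1, 2, 3])))
print(repr(D2([1, 2, 3])))
```

Because the name is looked up through `type(self)`, the override only needs to be written once in the hierarchy.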
msg270646 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-17 15:30
It certainly seems that collections should be consistent about this.  The question of builtin types is a different issue, and I agree that it is probably way more trouble than it is worth to change them, especially since, for example, repr(str) is often used just to get the quote marks in contexts where you *don't* want the subclass name.
msg270650 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-17 16:20
It looks to me like the repr of a collection contains a dynamic name if it is implemented in Python and a hardcoded base name if it is implemented in C (OrderedDict was originally implemented in Python). Maybe this is just because tp_name contains the fully qualified name, and extracting a bare class name needs a few lines of code.

There is a similar issue with the io module classes: issue21861.

Since this problem is already solved for OrderedDict, I think it is easy to use this solution in other classes, maybe by factoring out the following code into a helper function.

    const char *classname;

    classname = strrchr(Py_TYPE(self)->tp_name, '.');
    if (classname == NULL)
        classname = Py_TYPE(self)->tp_name;
    else
        classname++;
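For readers less familiar with C, the same logic can be sketched in Python. The function name `bare_type_name` is hypothetical, and the sample strings are illustrative inputs standing in for a `tp_name` value.

```python
def bare_type_name(tp_name):
    # Mirror of the C snippet: keep everything after the last '.',
    # or the whole string when there is no dot at all.
    return tp_name.rpartition('.')[2]


dotted = bare_type_name('collections.deque')
plain = bare_type_name('bytearray')
print(dotted)
print(plain)
```

`str.rpartition` returns the whole string as its last element when the separator is absent, which covers the `classname == NULL` branch of the C code.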
msg302387 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-17 22:06
+1 for changing the cases Serhiy found.   Also, I agree with Serhiy that there should not be a change for the built-in types that have a literal notation (it has been this way forever, hasn't caused any real issues, and changing it now will likely break code).
msg302684 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-21 11:24
New changeset b3a77964ea89a488fc0e920e3db6d8477279f19b by Serhiy Storchaka in branch 'master':
bpo-27541: Reprs of subclasses of some classes now contain actual type name. (#3631)
https://github.com/python/cpython/commit/b3a77964ea89a488fc0e920e3db6d8477279f19b
msg302687 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-21 12:11
Thanks Raymond.
msg302689 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-09-21 12:16
Why use type.__name__ rather than type.__qualname__?
msg302690 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-21 12:42
Because reprs of Python implementations of collection use a bare __name__.

__qualname__ is used only in combination with __module__. Using __qualname__ alone can be confusing: foo.bar looks like a name bar in the module foo. In reprs and error messages, either the fully qualified name is used ("{cls.__module__}.{cls.__qualname__}") or a bare __name__. If a displayed name contains a dot, it is always a fully qualified name.
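A small sketch of the three spellings for a nested class may help here. `Outer` and `Inner` are made-up names, and the module part depends on where the code runs.

```python
class Outer:
    class Inner:
        pass


cls = Outer.Inner
bare = cls.__name__          # just the class name
qual = cls.__qualname__      # dotted path inside the module
full = '{}.{}'.format(cls.__module__, cls.__qualname__)  # module-qualified
print(bare)
print(qual)
print(full)
```

Here `qual` already contains a dot, so on its own it could be misread as "name Inner in module Outer", which is the ambiguity described above.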
msg302691 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-09-21 12:44
> Because reprs of Python implementations of collection use a bare __name__.

Ah, maybe this module should be updated to use the qualified name together with the module name in repr()?

> __qualname__ is used only in combination with __module__.

I was thinking of module.qualname, right.
History
Date User Action Args
2017-09-21 12:44:50 haypo set messages: + msg302691
2017-09-21 12:42:12 serhiy.storchaka set messages: + msg302690
2017-09-21 12:16:50 haypo set nosy: + haypo; messages: + msg302689
2017-09-21 12:11:00 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg302687; stage: patch review -> resolved
2017-09-21 11:24:16 serhiy.storchaka set messages: + msg302684
2017-09-17 22:06:10 rhettinger set messages: - msg302377
2017-09-17 22:06:03 rhettinger set messages: + msg302387
2017-09-17 17:40:17 rhettinger set messages: + msg302377
2017-09-17 17:08:27 serhiy.storchaka set dependencies: + Add _PyType_Name(); versions: + Python 3.7, - Python 3.6
2017-09-17 17:06:31 serhiy.storchaka set keywords: + patch; stage: patch review; pull_requests: + pull_request3620
2017-08-15 14:54:52 r.david.murray link issue31194 superseder
2016-07-17 16:20:14 serhiy.storchaka set messages: + msg270650
2016-07-17 15:30:29 r.david.murray set nosy: + r.david.murray; messages: + msg270646
2016-07-17 12:54:41 rhettinger set assignee: rhettinger
2016-07-17 10:12:47 xiang.zhang set messages: + msg270627
2016-07-17 10:00:55 serhiy.storchaka set messages: + msg270625
2016-07-17 09:30:26 xiang.zhang set messages: + msg270623
2016-07-17 08:54:36 serhiy.storchaka create