classification
Title: pprint for dataclass instances
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: LewisGaul, eric.smith, eric.snow, gregory.p.smith, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-01-31 00:14 by LewisGaul, last changed 2021-04-14 00:03 by eric.smith. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24389 merged LewisGaul, 2021-01-31 00:19
Messages (15)
msg386002 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2021-01-31 00:14
Currently the pprint module does not have support for dataclasses. I have implemented support for this and will link the PR once I have the issue number!

See also issue37376 for SimpleNamespace support.
msg386012 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-31 09:53
Since a dataclass can do anything a regular class can do, it is probably too general to have a special case for pprint.
msg386014 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2021-01-31 12:52
> a dataclass can do anything a regular class can do

Agreed, but isn't that also true of any subclasses of currently supported types? In particular 'UserDict', 'UserList' and 'UserString', which all have explicit support in pprint and are intended for "easier subclassing" according to the docs.

I'm also not sure why it would be a reason for not giving it pprint handling (in the case where there's no user-defined __repr__). Is there any harm in doing so? 

I'd consider dataclasses one of the primary choices for storing data in modern Python (e.g. for converting to/from JSON/YAML), and they may well be used for storing nested data, which can be very hard to read without some mechanism for pretty-printing.

Indeed, the dataclasses.asdict() function recurses into dataclass fields. This gives the option of pprint(dataclasses.asdict(my_dataclass)), but at the cost of losing the class names and any custom reprs.
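
As an illustration of that trade-off, here is a minimal sketch. The Point and Segment classes are hypothetical and not taken from the PR; it only shows that going through dataclasses.asdict() yields plain nested dicts, so the class names no longer appear in the pretty-printed text.

```python
from dataclasses import asdict, dataclass
from pprint import pformat


# Hypothetical example classes, used only for illustration.
@dataclass
class Point:
    x: int
    y: int


@dataclass
class Segment:
    start: Point
    end: Point


seg = Segment(Point(0, 0), Point(3, 4))

# asdict() recurses into nested dataclass fields and returns plain dicts.
as_dict = asdict(seg)

# The pretty-printed dict carries the data but not the class names.
text = pformat(as_dict)
print(text)

# The ordinary repr still names the classes.
print(repr(seg))
```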
msg386017 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-01-31 14:09
For all other classes we check that there is no user-defined __repr__. But this is difficult to check for a dataclass or namedtuple, because there is no base class for dataclasses or namedtuples that provides a standard __repr__ implementation. Every __repr__ for dataclasses and namedtuples is generated.

See issue7434.
msg386018 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-01-31 14:17
Adding an attribute on the __repr__ (and other methods?) signifying that they were generated would let us distinguish them.

Say, checking getattr(__repr__, '__is_generated__', False), or similar.
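
A rough sketch of how such a marker could work is below. The attribute name __is_generated__ comes from the message above; the with_generated_repr decorator and has_generated_repr helper are hypothetical stand-ins, not part of dataclasses or pprint, and the change that was eventually merged may detect generated methods differently.

```python
def with_generated_repr(cls):
    """Hypothetical class decorator: attach a generated __repr__ and tag it."""

    def __repr__(self):
        args = ", ".join(f"{name}={value!r}" for name, value in vars(self).items())
        return f"{type(self).__name__}({args})"

    # Mark the function so other code can tell it was generated.
    __repr__.__is_generated__ = True
    cls.__repr__ = __repr__
    return cls


def has_generated_repr(obj):
    """Return True if the object's class uses a tagged, generated __repr__."""
    return getattr(type(obj).__repr__, "__is_generated__", False)


@with_generated_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class LabelledPoint(Point):
    # A user-defined __repr__ replaces the generated one, so the tag is gone.
    def __repr__(self):
        return "<LabelledPoint>"
```

With this in place, a pretty-printer could apply special handling only when has_generated_repr() is true and fall back to repr() otherwise.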
msg386020 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2021-01-31 14:34
@Serhiy - Yes, I noted that problem in the PR. Thanks for pointing me to that issue, I agree it would be good to make pprint properly extensible (my current solution is to maintain a fork of the pprint module with dataclass support added).

Eric's suggestion would work. I wasn't sure if it would be considered an 'ok' thing to do, but if so it could be an easy enough way to support dataclasses (and potentially namedtuples)?
msg386023 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-01-31 15:07
Good idea Eric, it should work.

But it can make the code of pprint potentially less flexible. Currently it uses a mapping which maps each __repr__ to the corresponding pprint implementation. The only exception is for dicts, for historical reasons. This design could potentially allow pprint to be made more general and to support arbitrary types by registering handlers. Since there is no standard implementation of __repr__ for namedtuples and dataclasses, we cannot use them as keys, and would need to hardcode checks for namedtuple and dataclass (and any other generated classes).

It is a minor objection. Perhaps practicality should beat purity in this case.
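
The dispatch idea described in this message can be sketched roughly as follows. HANDLERS, format_list and pretty are hypothetical names for a simplified model, not pprint's actual internals. The sketch shows why keying on __repr__ handles subclasses that keep the inherited __repr__, and why generated __repr__ functions cannot serve as keys: each dataclass gets its own function object.

```python
from dataclasses import dataclass


def format_list(obj):
    """Toy handler: one item per line."""
    return "[\n" + "".join(f"    {item!r},\n" for item in obj) + "]"


# Hypothetical registry keyed by the __repr__ a type uses.
HANDLERS = {list.__repr__: format_list}


def pretty(obj):
    # A subclass that overrides __repr__ no longer matches any key,
    # so it falls back to its own repr().
    handler = HANDLERS.get(type(obj).__repr__)
    return handler(obj) if handler else repr(obj)


class Plain(list):
    pass  # inherits list.__repr__, so the list handler applies


class Custom(list):
    def __repr__(self):
        return "<Custom>"


@dataclass
class A:
    x: int


@dataclass
class B:
    x: int


print(pretty(Plain([1, 2])))
print(pretty(Custom([1])))
# Each dataclass has its own generated __repr__, so there is no shared key.
print(A.__repr__ is B.__repr__)
```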
msg386033 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-31 18:15
At some point, we need a modern redesign alternative to pprint.  It could have its own __pprint__ method to communicate how it wants to be pretty printed.

Until then, I think the existing pprint module should only grow custom support for classes that have a mostly consistent structure and usage pattern.  SimpleNamespace, for example, made sense for a custom pprint handler because it is so dict like and is almost never customized.

IMO, dataclasses are a bridge too far.  Having pprint() guess what a dataclass intends is not far from trying to guess what an arbitrary class intends.  This is skating on thin ice.
msg386034 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-01-31 18:27
I agree that we need a better pprint. I think it would be easier to create something new rather than try and morph the existing pprint, but maybe I lack enough imagination. I'd prefer to use functools.singledispatch instead of a __pprint__ method, but it doesn't really make a lot of difference. PEP 443 (singledispatch) does use pprint as a motivating example.

I tend to agree with Raymond that we don't want to guess what a dataclass class intends as its usage. After all, we deliberately made it easy to add a custom __repr__ (one not generated by dataclasses.dataclass).
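
For readers unfamiliar with the singledispatch approach mentioned above, a minimal sketch could look like this. The pretty function and its indent parameter are hypothetical; this is not a proposed design for a pprint replacement, only an illustration of registering per-type formatters with functools.singledispatch.

```python
from functools import singledispatch


@singledispatch
def pretty(obj, indent=0):
    """Fallback for unregistered types: indent the plain repr()."""
    return " " * indent + repr(obj)


@pretty.register(list)
def _(obj, indent=0):
    # Lists are rendered one item per line, recursing into nested items.
    pad = " " * indent
    lines = [pad + "["]
    lines += [pretty(item, indent + 2) + "," for item in obj]
    lines.append(pad + "]")
    return "\n".join(lines)


print(pretty([1, [2]]))
```

Third-party code could register its own types the same way, without the formatter needing to know about them in advance.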
msg388794 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-03-16 00:41
I'm leaning toward accepting this on the condition that it only be invoked for dataclasses where __repr__ was the version generated by @dataclass. And also that it use the same fields that the generated __repr__ would use (basically skipping repr=False fields). Under those conditions, I don't see the harm.

The reason I'm leaning toward acceptance is that we've talked about a better pprint for ages, and yet there's no activity that I can tell toward developing a replacement in the stdlib. pprint was a motivating example for PEP 443 (singledispatch), and that was accepted 8 years ago. I don't think we should have to wait forever to get better pprint for dataclasses.

But I'm still not 100% decided, and I can be reasoned with!
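
The two conditions can be illustrated with a short sketch. Account, Custom and repr_fields are hypothetical names; the snippet only shows which fields a generated __repr__ includes and that a user-defined __repr__ is left alone by @dataclass. It does not reproduce the check used in the merged change.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Account:
    user: str
    token: str = field(repr=False)  # excluded from the generated __repr__
    tags: list = field(default_factory=list)


@dataclass
class Custom:
    x: int

    # @dataclass does not overwrite a __repr__ defined in the class body.
    def __repr__(self):
        return "<custom>"


def repr_fields(obj):
    """Hypothetical helper: the (name, value) pairs a generated repr shows."""
    return [(f.name, getattr(obj, f.name)) for f in fields(obj) if f.repr]


acct = Account("ann", "s3cret", ["a"])
print(repr(acct))
print(repr_fields(acct))
print(repr(Custom(1)))
```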
msg388795 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-16 01:26
FWIW, we've not had a feature request for this ever, nor has there been a request for pprint to support attrs, nor any other hand-rolled class that implements methods similar to those generated by dataclasses.  AFAICT, this tracker issue wasn't motivated by a known use case; rather, it was "my PR was accepted for SimpleNamespace and thought dataclasses could be the next."

In the absence of known use cases and user requests, I think we should let it be.  Put me down for a -0.
msg388828 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2021-03-16 10:09
> FWIW, we've not had a feature request for this ever, nor has there been a request for pprint to support attrs, nor any other hand-rolled class that implements methods similar to those generated by dataclasses.

I wouldn't expect core Python to support a 3rd party lib like attrs, but I fail to see what's so different between dataclasses, SimpleNamespace and namedtuple - all of these may be used for storing/modelling [nested] data, which then may be printed.

> AFAICT, this tracker issue wasn't motivated by a known use case; rather, it was "my PR was accepted for SimpleNamespace and thought dataclasses could be the next."

This issue is entirely motivated by a real-world example - I'm currently maintaining a private fork of the pprint module with support for dataclasses added. I'm assuming the reason this hasn't come up before is that dataclasses are relatively new (and plenty of users will still be on older versions of Python).

I was not the author of the issue that added support for SimpleNamespace, I just saw it and used it as an example of precedent.

> At some point, we need a modern redesign alternative to pprint.

I'm on board with this, but as Eric said there aren't currently any signs of this being worked on. In the absence of a redesign, dataclass support seems like a natural extension to me.
msg389198 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-03-20 23:12
+0.5 I lean towards just accepting this under the conditions Eric describes given that dataclass is a stdlib concept and nobody is likely to claim that such output from pprint is a bad thing.

The larger "some form of protocol for pprint to work on all sorts of other things" issue (regardless of how) remains a long term wish list item that'll probably wind up in PEP land if someone wants to take it on.
msg391020 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-13 23:59
New changeset 11159d2c9d6616497ef4cc62953a5c3cc8454afb by Lewis Gaul in branch 'master':
bpo-43080: pprint for dataclass instances (GH-24389)
https://github.com/python/cpython/commit/11159d2c9d6616497ef4cc62953a5c3cc8454afb
msg391021 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-04-14 00:03
Thanks for all of the work, LewisGaul.
History
Date User Action Args
2021-04-14 00:03:44 eric.smith set status: open -> closed; resolution: fixed; messages: + msg391021; stage: patch review -> resolved
2021-04-13 23:59:32 eric.smith set messages: + msg391020
2021-03-20 23:12:59 gregory.p.smith set nosy: + gregory.p.smith; messages: + msg389198
2021-03-16 10:09:47 LewisGaul set messages: + msg388828
2021-03-16 01:26:34 rhettinger set messages: + msg388795
2021-03-16 00:41:28 eric.smith set messages: + msg388794
2021-01-31 18:27:10 eric.smith set messages: + msg386034
2021-01-31 18:15:07 rhettinger set messages: + msg386033
2021-01-31 15:07:20 serhiy.storchaka set messages: + msg386023
2021-01-31 14:34:58 LewisGaul set messages: + msg386020
2021-01-31 14:17:02 eric.smith set messages: + msg386018
2021-01-31 14:09:31 serhiy.storchaka set messages: + msg386017
2021-01-31 12:52:46 LewisGaul set messages: + msg386014
2021-01-31 09:53:47 rhettinger set assignee: eric.smith; messages: + msg386012; nosy: + eric.smith
2021-01-31 00:19:40 LewisGaul set keywords: + patch; stage: patch review; pull_requests: + pull_request23204
2021-01-31 00:14:38 LewisGaul create