classification
Title: Add compact=True flag to json.dump/dumps
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: bob.ippolito Nosy List: Alex Gordon, andrewnester, benhoyt, berker.peksag, bob.ippolito, brett.cannon, haypo, r.david.murray, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-02-13 02:04 by Alex Gordon, last changed 2017-03-14 17:06 by benhoyt. This issue is now closed.

Pull Requests
PR 72, linked by andrewnester, 2017-02-13 15:51
Messages (21)
msg287663 - (view) Author: Alex Gordon (Alex Gordon) Date: 2017-02-13 02:04
Broadly speaking, there are three main output styles for json.dump/dumps:

1. Default: json.dumps(obj)
2. Compact: json.dumps(obj, separators=(',', ':'))
3. Pretty-printing: json.dumps(obj, sort_keys=True, indent=4)

The 'compact' style is the densest, suitable if the JSON is to be sent over the network, archived on disk, or otherwise consumed by a machine. The pretty-printed style is for human consumption: configuration files, debugging, etc.
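
As a minimal illustration of the three styles listed above, assuming a small sample object (the name `obj` and its contents are made up for this sketch):

```python
import json

# Sample data, chosen only for illustration.
obj = {"foo": 1, "bar": [1, 2, 3]}

default = json.dumps(obj)                            # 1. Default
compact = json.dumps(obj, separators=(',', ':'))     # 2. Compact
pretty = json.dumps(obj, sort_keys=True, indent=4)   # 3. Pretty-printing

print(default)  # {"foo": 1, "bar": [1, 2, 3]}
print(compact)  # {"foo":1,"bar":[1,2,3]}
print(pretty)   # multi-line, keys sorted, 4-space indent
```

All three strings decode back to the same object; only the whitespace (and, for the pretty style, the key order) differs.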

Even though the compact style is often desirable, the API for producing it is unforgiving. It's easy to accidentally write code like the following, which silently produces invalid nonsense:

    json.dumps(obj, separators=(':', ','))
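
A short sketch of that failure mode, assuming a small sample object: the swapped separators are accepted without complaint, and the result cannot be parsed back.

```python
import json

# Sample data, chosen only for illustration.
obj = {"k": [1, 2, 3]}

# Separators swapped by mistake: (item_separator, key_separator) reversed.
swapped = json.dumps(obj, separators=(':', ','))
print(swapped)  # {"k",[1:2:3]}

try:
    json.loads(swapped)
except json.JSONDecodeError as exc:
    # The encoder raised nothing, but the decoder rejects the output.
    print("not valid JSON:", exc.msg)
```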

I propose the addition of a new flag `compact=True`, that simply sets `separators=(',', ':')`. e.g.

    >>> obj = {"foo": 1, "bar": 2}
    >>> json.dumps(obj, compact=True)
    '{"foo":1,"bar":2}'

The defaults for `separators=` are good, so eventually `compact=` would relegate `separators=` to obscurity. Setting both `compact=True` and `separators=` at the same time should be an error.
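
The proposed behaviour could be approximated today with a thin wrapper. This is only a sketch: `dumps_compact` is a hypothetical helper, not part of the json module, and the choice of `TypeError` for the conflicting-arguments case is an assumption.

```python
import json

def dumps_compact(obj, *, compact=False, separators=None, **kwargs):
    """Hypothetical wrapper around json.dumps; not a stdlib API."""
    if compact:
        if separators is not None:
            # The proposal says combining both should be an error.
            raise TypeError("compact=True cannot be combined with separators=")
        separators = (',', ':')
    return json.dumps(obj, separators=separators, **kwargs)

print(dumps_compact({"foo": 1, "bar": 2}, compact=True))  # {"foo":1,"bar":2}
```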
msg287685 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-02-13 13:04
An alternative would be to define a COMPACT constant in the json module equal to (',', ':').
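
For comparison, a sketch of how such a constant would read at the call site. The name `COMPACT` is defined locally here; the json module itself does not provide it.

```python
import json

# Hypothetical constant; stands in for the proposed json.COMPACT.
COMPACT = (',', ':')

encoded = json.dumps({"foo": 1, "bar": 2}, separators=COMPACT)
print(encoded)  # {"foo":1,"bar":2}
```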
msg287707 - (view) Author: Andrew Nester (andrewnester) * Date: 2017-02-13 15:51
I've just added a PR implementing the alternative suggested by R. David Murray, since it is the simpler and more straightforward option.
msg287716 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-13 18:08
I'm -1. This adds maintenance and learning burden and doesn't make user code any clearer: the reader has to look up what json.COMPACT means.
msg287739 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-02-14 07:45
I concur with Serhiy and think this API change would be a net increase in complexity.

FWIW, network transmission size issues can already be handled more effectively with HTTP compression ("Accept-Encoding: gzip, deflate").
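
A rough way to compare the two approaches, using the stdlib gzip module as a stand-in for HTTP compression. The payload below is invented for this sketch, and the actual savings depend heavily on the data.

```python
import gzip
import json

# Hypothetical payload, deliberately repetitive like many API responses.
records = [{"id": i, "name": "item", "active": True} for i in range(1000)]

default = json.dumps(records).encode("utf-8")
compact = json.dumps(records, separators=(',', ':')).encode("utf-8")

# Print raw and gzipped sizes for both encodings.
for label, payload in (("default", default), ("compact", compact)):
    print(label, len(payload), "bytes raw,",
          len(gzip.compress(payload)), "bytes gzipped")
```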

Bob, do you have any thoughts?
msg287740 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2017-02-14 08:02
I agree, in isolation it's a fine proposal, but the interface here is already a bit too complex and the benefit is pretty minimal. When the size really does matter, you can take care to set it correctly once and be done with it.
msg287746 - (view) Author: Alex Gordon (Alex Gordon) Date: 2017-02-14 09:20
The point is that, as a principle of good API design, the json module should not generate malformed JSON unless the user very explicitly asks for their JSON to be corrupted.

Python stands alone in having a JSON serializer that can produce strings such as {"k",[1:2:3]} if the user holds it wrong.

Outside of Python, the 'compact' encoding is just the normal way that JSON is encoded. It's the default almost everywhere else. I'm not suggesting Python should default to it also, but I am suggesting that it should be safe and easy to remove the extraneous whitespace.

Historically there were two values you might want to pass to separators. Aside from (',', ':'), the other was (',', ': ') when indent was not None, to suppress the trailing space at the end of each line. This is no longer necessary after 3.4.
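
A quick check of that claim, assuming Python 3.4 or later and a made-up sample object: with `indent` set and `separators` left alone, no output line ends in whitespace.

```python
import json

# Sample data, chosen only for illustration.
pretty = json.dumps({"a": [1, 2], "b": {"c": 3}}, indent=4)

# Collect any lines that carry trailing whitespace.
trailing = [line for line in pretty.splitlines() if line != line.rstrip()]
print(trailing)  # [] on Python 3.4 and later
```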

Since 3.4, separators= effectively acts as a very complicated boolean flag, because the only value that makes sense to pass to it is (',', ':'). With a compact flag, users could ignore separators entirely and so the API would be made simpler and safer.
msg287773 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2017-02-14 13:26
I'm +1 on the idea. Currently, a user needs to find this information in a wall of text: "To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace." It's easy to miss the lack of trailing space in (',', ':').

Perhaps the constant name could be changed to something more descriptive, but I don't have a better suggestion at the moment.

(Adding Brett since he also reviewed the PR.)
msg287774 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-14 13:32
"Add compact=True flag to json.dump/dumps"

Oh, fun, I implemented exactly this option in my perf project. Extract:

        import json

        def dump(data, fp, compact):
            # Compact mode drops the whitespace; otherwise pretty-print.
            kw = {}
            if compact:
                kw['separators'] = (',', ':')
            else:
                kw['indent'] = 4
            json.dump(data, fp, sort_keys=True, **kw)

So I like the idea of moving the compact=True feature directly into the json module ;-)

(I don't propose to change the default to indent=4, that's a personal choice ;-))
msg287786 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-02-14 17:53
I'm a little confused as the PR proposes a COMPACT constant for the module but the issue proposes a new compact argument. Which are we actively considering?
msg287827 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-02-15 09:32
Everyone can chime in with their thoughts, but Bob gets to make the decision on this one.
msg288026 - (view) Author: Ben Hoyt (benhoyt) * Date: 2017-02-17 17:31
I agree with the confusion (PR proposes separators=COMPACT, issue compact=True).

I like the concept but not either of the APIs proposed. I *much* more often want to get pretty output -- if I had a dime for every time I've written "json.dumps(obj, sort_keys=True, indent=4)" I'd be rich enough to buy an entire cup of Starbucks coffee. But then you'd need a pretty=True option as well, which would be mutually exclusive with compact=True, so not great.

But what about a style= (or "format="?) parameter, which defaults to 'default' (or just None) meaning the same as now. If you pass format='pretty' you get "sort_keys=True, indent=4" and if you pass format='compact' you get "separators=(',', ':')".
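
A sketch of what such a parameter might look like as a wrapper. `dumps_styled` and `_STYLES` are hypothetical names, and letting explicit keyword arguments override the preset is just one possible way to resolve conflicts.

```python
import json

# Hypothetical presets matching the styles described above.
_STYLES = {
    None: {},
    'default': {},
    'pretty': {'sort_keys': True, 'indent': 4},
    'compact': {'separators': (',', ':')},
}

def dumps_styled(obj, style=None, **kwargs):
    """Hypothetical wrapper around json.dumps; not a stdlib API."""
    try:
        preset = _STYLES[style]
    except KeyError:
        raise ValueError(f"unknown style: {style!r}") from None
    # Assumption: explicit keyword arguments win over the preset.
    return json.dumps(obj, **{**preset, **kwargs})

print(dumps_styled({"foo": 1, "bar": 2}, style='compact'))  # {"foo":1,"bar":2}
```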
msg288030 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-17 18:00
Since core developers have different opinions about this issue, it should be discussed on the Python-Dev mailing list until one side convinces the other or the BDFL makes his decision.

Personally I dislike any complication of json API. Likely it is already the most complicated API in the stdlib.
msg288161 - (view) Author: Andrew Nester (andrewnester) * Date: 2017-02-19 19:43
Adding a new argument such as format= or compact= will make the API more complicated. In addition, it's neither easy nor obvious how to handle situations where both the separators= and format= arguments are set.
msg288162 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2017-02-19 19:58
I would recommend a moratorium on new options until we have a plan to make the usage of the JSON APIs simpler overall. It's accumulated too many options over time. The real trouble is figuring out how to do this in a backwards compatible way that does not impact performance too much. Off-hand, I can't think of any obvious way aside from using new function names with a cleaner options list.
msg289543 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-03-13 17:43
So new options sounds like a no-go, but what about the COMPACT attribute on the json module as per https://github.com/python/cpython/pull/72?
msg289547 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-13 18:50
I have expressed my opinion, but I am ready to change it if this proposal wins the support of other core developers after discussion on the mailing list.
msg289564 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-03-14 05:11
[Serhiy]
> Personally I dislike any complication of json API. 
> Likely it is already the most complicated API in the stdlib.

[Bob Ippolito]
> the interface here is already a bit too complex 
> and the benefit is pretty minimal

I concur with those sentiments and am going to mark this as closed.  We already have a way to do it.  That way may not be ideal but putting in an additional way is likely to result in a net increase in complexity.

If some counter-consensus arises on one of the mailing lists, this tracker issue can be re-opened and the discussion can continue.
msg289581 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-14 13:06
I don't see how adding a constant increases the complexity of the API.
msg289603 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-03-14 16:24
I agree with David that I don't see how adding a constant to the module is really a complication of an API.
msg289608 - (view) Author: Ben Hoyt (benhoyt) * Date: 2017-03-14 17:06
Agreed. Seems much the same as other argument constants, like pickle.HIGHEST_PROTOCOL for the "protocol" argument. These are not changing the API, just adding a helper constant to avoid the magic values.

-Ben
History
Date                 User              Action  Args
2017-03-14 17:06:54  benhoyt           set     messages: + msg289608
2017-03-14 16:24:23  brett.cannon      set     messages: + msg289603
2017-03-14 13:06:04  r.david.murray    set     messages: + msg289581
2017-03-14 05:11:28  rhettinger        set     status: open -> closed; resolution: rejected; messages: + msg289564; stage: patch review -> resolved
2017-03-13 18:50:03  serhiy.storchaka  set     messages: + msg289547
2017-03-13 17:43:45  brett.cannon      set     messages: + msg289543
2017-02-19 19:58:40  bob.ippolito      set     messages: + msg288162
2017-02-19 19:43:12  andrewnester      set     messages: + msg288161
2017-02-17 18:00:38  serhiy.storchaka  set     messages: + msg288030
2017-02-17 17:31:10  benhoyt           set     nosy: + benhoyt; messages: + msg288026
2017-02-15 09:32:39  rhettinger        set     messages: + msg287827
2017-02-14 17:53:17  brett.cannon      set     messages: + msg287786
2017-02-14 13:32:44  haypo             set     nosy: + haypo; messages: + msg287774
2017-02-14 13:26:12  berker.peksag     set     nosy: + berker.peksag, brett.cannon; messages: + msg287773; stage: patch review
2017-02-14 09:20:39  Alex Gordon       set     messages: + msg287746
2017-02-14 08:02:47  bob.ippolito      set     messages: + msg287740
2017-02-14 07:45:22  rhettinger        set     assignee: bob.ippolito; messages: + msg287739; nosy: + bob.ippolito, rhettinger
2017-02-13 18:08:08  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg287716
2017-02-13 15:51:42  andrewnester      set     nosy: + andrewnester; messages: + msg287707; pull_requests: + pull_request53
2017-02-13 13:04:02  r.david.murray    set     nosy: + r.david.murray; messages: + msg287685
2017-02-13 02:04:34  Alex Gordon       create