classification
Title: Specifying indent in the json.tool command
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: bob.ippolito, dhimmel, ezio.melotti, flavianhautbois, inada.naoki, r.david.murray, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-02-23 20:05 by dhimmel, last changed 2019-12-07 14:14 by inada.naoki. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 345 merged dhimmel, 2017-02-27 15:37
PR 426 Serhiy Int, 2017-03-03 13:23
PR 2720 closed dhimmel, 2017-07-15 19:00
PR 9765 closed wim.glenn, 2018-10-09 05:33
PR 17482 merged dhimmel, 2019-12-06 01:12
Messages (17)
msg288479 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-02-23 20:05
The utility of `python -m json.tool` would increase if users could specify the indent level.

Example use case: newlines in a JSON document are important for readability and the ability to open in a text editor. However, if the file is large, you can save space by decreasing the indent level.

I added an --indent argument to json.tool in https://github.com/python/cpython/pull/201. However, design discussion is required since indent can take an int, string, or None. In addition, a useful indent string is a tab, which is difficult to pass via a command line argument.

Currently, I added the following function to convert the indent option to the indent parameter of json.dump:

```
def parse_indent(indent):
    """Parse the argparse indent argument."""
    if indent == 'None':
        return None
    if indent == r'\t':
        return '\t'
    try:
        return int(indent)
    except ValueError:
        return indent
```

@inada.naoki mentioned the special casing is undesirable. I agree, but can't think of an alternative. Advice appreciated.
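
For illustration, here is one way the helper above could be attached to an argparse option through the `type=` callback. This is only a sketch: the parser setup and the default of 4 are assumptions made for the example, not the code from the pull request.

```python
import argparse
import json


def parse_indent(indent):
    """Parse the argparse indent argument (same logic as above)."""
    if indent == 'None':
        return None
    if indent == r'\t':
        return '\t'
    try:
        return int(indent)
    except ValueError:
        return indent


# Hypothetical parser; the real json.tool parser has other options too.
parser = argparse.ArgumentParser()
parser.add_argument('--indent', type=parse_indent, default=4)

# Show how each spelling on the command line maps to a json.dumps indent.
for argv in (['--indent', '2'], ['--indent', r'\t'], ['--indent', 'None']):
    options = parser.parse_args(argv)
    print(repr(options.indent), json.dumps([1, 2], indent=options.indent))
```

The special cases for 'None' and the literal two-character string `\t` are exactly what makes this approach awkward: both are also valid indent strings in their own right.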
msg288646 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-02-27 15:53
For discussion on how to implement this, see 

+ https://github.com/python/cpython/pull/201#discussion_r102146742
+ https://github.com/python/cpython/pull/201#discussion_r102840190
+ https://github.com/python/cpython/pull/201#discussion_r102891428

Implementation now moved to https://github.com/python/cpython/issues/345
msg288905 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-03 17:43
The discussion is scattered between different tracker issues and pull requests. Let's continue it in one place.

I don't think we should add too many options for controlling every little detail. json.tool is just a small utility intended mainly for debugging. If you need to control more details, it is not hard to write a simple Python script.

For example, for "compact" output:

$ python3 -c "import sys, json; json.dump(json.load(sys.stdin), sys.stdout, separators=(',', ':'))"
msg288918 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-03-03 19:20
To recap the discussion from https://git.io/vyCY8: three potential, mutually exclusive command line options have been suggested. They are as follows.

```python
import json
obj = [1, 2]

print('--indent=4')
print(json.dumps(obj, indent=4))

print('--no-indent')
print(json.dumps(obj, indent=None))

print('--compact')
print(json.dumps(obj, separators=(',', ':')))
```

which produces the following output:


```
--indent=4
[
    1,
    2
]
--no-indent
[1, 2]
--compact
[1,2]
```

Currently, https://github.com/python/cpython/pull/345 has implemented --indent and --no-indent. One suggestion was to replace --no-indent with --compact, but that would prevent json.tool from outputting YAML < 1.2 compatible JSON. Therefore, the main question is whether to add --compact or not?

There is a clear use case for --compact. However, it requires a bit more "logic" as there is no `compact` argument in json.dump. Therefore @serhiy.storchaka suggests not adding --compact to keep json.tool lightweight.

I disagree that json.tool is "mainly for debugging". I encounter lots of applications where JSON needs to be reformatted, so I see json.tool as a cross-platform command line utility.

However, I am more concerned that the JSON API may change, especially with respect to the compact encoding (see http://bugs.python.org/issue29540). Therefore, it seems prudent to let the API evolve and later revisit whether a --compact argument makes sense.

The danger of adding --compact now would be if json.dump adopts an argument for --compact that is not compact. Then aligning the json.tool and json.dump terminology could require backwards incompatible changes.
msg289712 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-16 13:15
I'm not particularly interested in this feature. Adding two or three options looks excessive. Python is a programming language and it is easy to write a simple script for your needs. That is much easier than implementing a general command line interface that supports all options of the programming API.
msg289713 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-16 13:35
Easier, but if we do it in the tool, then it is done for everyone and they don't *each* have to spend that "less time" writing their own script.  And --indent and --compact are both useful for debugging/hand testing, since it allows you to generate the input your code is expecting, and input your code might not be expecting.

On the other hand, I'm not contributing much these days, so I'm not in a good position to be suggesting additions to the support burden :(.
msg289714 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-03-16 14:00
@serhiy.storchaka I totally understand the desire to keep json.tool simple. However, given the description of json.tool in the documentation (below), I think an indentation option is within scope:

> The json.tool module provides a simple command line interface to validate and pretty-print JSON objects.

Indentation/newlines are a fundamental aspect of "pretty-printing". Right now I rarely use json.tool, since indent=4 is too extreme from a visual and file size perspective. Instead I prefer `indent=2` (or even `indent=1`) and I now have to:

1. create a python script to set my desired input
2. make sure every environment has access to this python script (the real annoyance)
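
The kind of script described in the two steps above can be quite small. The following is a minimal sketch; the function name `reformat` and the choice of `indent=2` are illustrative only and are not taken from any pull request.

```python
import json


def reformat(text, indent=2):
    """Return `text` (a JSON document) re-serialized with the given indent."""
    return json.dumps(json.loads(text), indent=indent)


# Example: re-indent a small document with two spaces per level.
print(reformat('{"a": [1, 2], "b": null}'))
```

Writing the script is the easy part; distributing it to every environment is the cost that a built-in option would remove.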

Currently, json.tool has a --sort-keys argument, which I think is great. --sort-keys is also an essential feature from my perspective (bpo-21650). So in short, I think json.tool is close to being a really useful utility but falls a bit short without an indentation option.

Given that json.tool exists, I think it makes sense to take steps to make sure it's actually relevant as a json reformatter. Given this motivation, I'm not opposed to adding --compact, if we're confident it will be forward compatible with the json.dump API.
msg289715 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-16 14:07
It's not just the support burden. It is also the burden of learning and remembering a new API.

The support burden itself is not tiny either. It includes careful design, writing the implementation and tests (the more options you have, the more combinations you need to test), and increased running time of the tests (CLI tests are slow!). You also need to return to these options after adding every new feature, to test compatibility with them.
msg289725 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2017-03-16 18:21
Probably the best thing we could do here is to mirror the options available in similar tools, such as jq: https://stedolan.github.io/jq/manual/#Invokingjq

The relevant options here would be:

    --indent
    --tab
    --compact-output
    --sort-keys

The default indent in jq is 2, which I tend to prefer these days, but maybe 4 is still appropriate given PEP 8:

    $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq
    [
      {},
      {
        "a": "b"
      },
      2,
      3,
      4
    ]


This is how jq interprets --compact-output:

    $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq --compact-output
    [{},{"a":"b"},2,3,4]


I do not think that it's worth having the command-line tool cater to people that want to indent in other ways (e.g. using a string that isn't all spaces or a single tab).
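
For comparison, the jq outputs shown above can be reproduced with the existing `json.dumps` parameters. This is a sketch of the correspondence only; it says nothing about how jq itself is implemented.

```python
import json

obj = json.loads('[{}, {"a": "b"}, 2, 3, 4]')

# Counterpart of jq's default output: two spaces per nesting level.
indented = json.dumps(obj, indent=2)
print(indented)

# Counterpart of `jq --compact-output`: no whitespace around separators.
compact = json.dumps(obj, separators=(',', ':'))
print(compact)

# Counterpart of `jq --tab`: one tab character per nesting level.
tabbed = json.dumps(obj, indent='\t')
print(tabbed)
```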
msg291630 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-04-13 19:13
@bob.ippolito thanks for pointing to jq as a reference implementation. I updated the pull request (https://git.io/vS9o8) to implement all of the relevant options. Currently, the PR supports the following mutually exclusive arguments:

--indent
--no-indent
--tab
--compact

These additions took 16 new lines of code in tool.py and 41 new lines of tests. However, I am happy to refactor the tests to be less repetitive if we choose to go forward with these changes.

@serhiy.storchaka I took a maximalist approach with respect to adding indentation options to GH #345. Although I know not all of the options may get merged, I thought we might as well work backwards.

However, the more I think about it, the more I believe every option above is a unique and valuable addition. Even with the changes, json.tool remains a lightweight wrapper around json.load + json.dump.
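
To make the shape of these four mutually exclusive options concrete, here is a rough sketch using `argparse.add_mutually_exclusive_group`. It is not the diff from the pull request: the helper name `dump_kwargs` and the default of 4 are assumptions made for the example.

```python
import argparse
import json

parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
# --indent, --tab and --no-indent all write to the same `indent` destination.
group.add_argument('--indent', default=4, type=int)
group.add_argument('--tab', action='store_const', dest='indent', const='\t')
group.add_argument('--no-indent', action='store_const', dest='indent', const=None)
# --compact needs extra handling because json.dump has no `compact` parameter.
group.add_argument('--compact', action='store_true')


def dump_kwargs(argv):
    """Translate command line arguments into json.dump keyword arguments."""
    options = parser.parse_args(argv)
    kwargs = {'indent': options.indent}
    if options.compact:
        kwargs['indent'] = None
        kwargs['separators'] = (',', ':')
    return kwargs


for argv in ([], ['--indent', '2'], ['--tab'], ['--no-indent'], ['--compact']):
    print(argv, json.dumps([1, 2], **dump_kwargs(argv)))
```

Only `--compact` requires logic beyond forwarding a value, which is the "bit more logic" mentioned earlier in the thread.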
msg312548 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2018-02-22 10:25
I'm OK with the options in the current pull request.
And I think this bike-shedding discussion is not important enough to spend our time and energy on.

Does anyone have a strong opinion?
If not, I'll merge the PR.
msg312570 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-22 17:15
I don't think this PR should be merged. It adds too many options. I think that if one needs more control than the current json.tool command line interface gives, one should use the Python interface. Don't forget that Python is a programming language.
msg348543 - (view) Author: Flavian Hautbois (flavianhautbois) * Date: 2019-07-27 10:49
So what do we do about this? 

Two possibilities:
1. We merge PR 9765 and close PRs 345 and 201, as 9765 seems more straightforward and was already approved. PR 9765 would need to be resubmitted before merging, since its base repo no longer exists; I could do that.
2. We consider that this is out of scope for Python, and since jq is widely used, it does not make a lot of sense to include it. We should then close all PRs to keep our pull requests clean

I suggest going with 2
msg348740 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2019-07-30 13:48
Since opening this issue, I've encountered several additional instances where indentation control would have been nice. I don't agree that jq is a sufficient substitute:

1. jq is generally not pre-installed on systems. For projects where users are guaranteed to have Python installed (but may be on any operating system), it is most straightforward to be able to just use a python command and not have to explain how to install jq on 3 different OSes.

2. jq does a lot more than prettifying JSON. The simple use case of reformatting JSON (to/from a file or stdin/stdout) can be challenging.

3. json.tool describes itself as a "simple command line interface to ... pretty-print JSON objects". Indentation is an essential aspect of pretty printing. Why even have this CLI if we're unwilling to add essential basic functionality that is already well supported by the underlying json.dump API?

Python excels at APIs that match user needs. It seems like we're a small change away from making json.tool flexible enough to satisfy real-world needs.

So I'm in favor of merging PR 9765 or PR 345. I'm happy to do the work on either to get them mergeable based on whatever specification we can agree on here.
msg348892 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-02 07:53
[Serhiy]
> I don't think this PR should be merged. It adds too much options.

[Daniel]
> Since opening this issue, I've encountered several additional 
> instances where indentation control would have been nice.
> I don't agree that jq is a sufficient substitute.

I'm inclined to agree with Serhiy.  While there are some handy command-line calls, they are secondary offshoots to the standard library and tend to be lightweight rather than full featured.  They serve various purposes from quick demos to support of development and testing.  In general, they aren't intended to be applications unto themselves: we don't document all command line tools in one place, we don't version number them, we don't maintain forums or web pages for them, we typically don't lock in their behaviors with unit tests, we don't guarantee that they will be available across versions, and we periodically change their APIs (for example when switching from getopt to argparse).
msg357775 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-12-04 06:15
New changeset 03257949bc02a4afdf2ea1eb07a73f8128129579 by Inada Naoki (Daniel Himmelstein) in branch 'master':
bpo-29636: Add --(no-)indent arguments to json.tool (GH-345)
https://github.com/python/cpython/commit/03257949bc02a4afdf2ea1eb07a73f8128129579
msg357976 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-12-07 14:14
New changeset 15fb7fa88187f5841088721a43609bffe64a8dc7 by Inada Naoki (Daniel Himmelstein) in branch 'master':
bpo-29636: json.tool: Add document for indentation options. (GH-17482)
https://github.com/python/cpython/commit/15fb7fa88187f5841088721a43609bffe64a8dc7
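
As a quick check of the merged behavior, the `--indent` and `--no-indent` arguments named in the changeset above can be exercised from Python itself. This is a sketch that assumes the running interpreter is Python 3.9 or later; `run_json_tool` is a hypothetical helper written for this example.

```python
import subprocess
import sys


def run_json_tool(text, *flags):
    """Pipe `text` through `python -m json.tool` with the given flags."""
    result = subprocess.run(
        [sys.executable, '-m', 'json.tool', *flags],
        input=text, capture_output=True, text=True, check=True,
    )
    return result.stdout


# Two spaces per nesting level instead of the default four.
print(run_json_tool('[1, 2]', '--indent', '2'))

# No newlines or indentation at all.
print(run_json_tool('[1, 2]', '--no-indent'))
```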
History
Date User Action Args
2019-12-07 14:14:45 inada.naoki set messages: + msg357976
2019-12-06 01:12:41 dhimmel set pull_requests: + pull_request16961
2019-12-04 06:17:12 inada.naoki set status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: + Python 3.9, - Python 3.8
2019-12-04 06:15:27 inada.naoki set messages: + msg357775
2019-08-02 07:53:06 rhettinger set messages: + msg348892
2019-07-30 13:48:29 dhimmel set messages: + msg348740
2019-07-27 10:49:03 flavianhautbois set nosy: + flavianhautbois; messages: + msg348543
2018-10-09 05:33:56 wim.glenn set keywords: + patch; stage: patch review; pull_requests: + pull_request9152
2018-02-22 17:15:44 serhiy.storchaka set messages: + msg312570
2018-02-22 10:25:30 inada.naoki set messages: + msg312548
2018-02-22 10:20:54 inada.naoki set components: + Library (Lib), - IO; versions: + Python 3.8, - Python 3.7
2017-07-15 19:00:48 dhimmel set pull_requests: + pull_request2778
2017-06-15 05:25:49 serhiy.storchaka link issue30669 superseder
2017-04-13 19:13:20 dhimmel set messages: + msg291630
2017-03-16 18:21:20 bob.ippolito set messages: + msg289725
2017-03-16 14:07:04 serhiy.storchaka set messages: + msg289715
2017-03-16 14:00:32 dhimmel set messages: + msg289714
2017-03-16 13:35:01 r.david.murray set nosy: + r.david.murray; messages: + msg289713
2017-03-16 13:15:22 serhiy.storchaka set nosy: + rhettinger, bob.ippolito, ezio.melotti; messages: + msg289712
2017-03-03 19:20:13 dhimmel set messages: + msg288918
2017-03-03 17:43:47 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg288905
2017-03-03 13:23:01 Serhiy Int set pull_requests: + pull_request355
2017-02-27 15:53:06 dhimmel set messages: + msg288646
2017-02-27 15:50:17 dhimmel set pull_requests: - pull_request230
2017-02-27 15:37:42 dhimmel set pull_requests: + pull_request298
2017-02-23 20:05:56 dhimmel create