classification
Title: Specifying indent in the json.tool command
Type: enhancement Stage:
Components: IO Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: bob.ippolito, dhimmel, ezio.melotti, inada.naoki, r.david.murray, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-02-23 20:05 by dhimmel, last changed 2017-04-13 19:13 by dhimmel.

Pull Requests
URL Status Linked Edit
PR 345 open dhimmel, 2017-02-27 15:37
PR 426 Serhiy Int, 2017-03-03 13:23
Messages (10)
msg288479 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-02-23 20:05
The utility of `python -m json.tool` would increase if users could specify the indent level.

Example use case: newlines in a JSON document are important for readability and the ability to open in a text editor. However, if the file is large, you can save space by decreasing the indent level.

I added an --indent argument to json.tool in https://github.com/python/cpython/pull/201. However, design discussion is required since indent can take an int, string, or None. In addition, a common indent string is a tab, which is difficult to pass via a command line argument.

Currently, I added the following function to convert the indent option to the indent parameter of json.dump:

```
def parse_indent(indent):
    """Parse the argparse indent argument."""
    if indent == 'None':
        return None
    if indent == r'\t':
        return '\t'
    try:
        return int(indent)
    except ValueError:
        return indent
```

@inada.naoki mentioned the special casing is undesirable. I agree, but can't think of an alternative. Advice appreciated.
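
For illustration, here is a minimal sketch of how a converter like the one above could be hooked into argparse through the `type=` parameter. The parser below is hypothetical (the `prog` name and the default of 4 are assumptions, not the actual json.tool code); it only shows that the special cases would be resolved while parsing.

```python
import argparse


def parse_indent(indent):
    """Same conversion as above, repeated so the sketch is self-contained."""
    if indent == 'None':
        return None
    if indent == r'\t':
        return '\t'
    try:
        return int(indent)
    except ValueError:
        return indent


# Hypothetical parser: the real json.tool parser defines other arguments.
parser = argparse.ArgumentParser(prog='json-tool-sketch')
parser.add_argument('--indent', type=parse_indent, default=4)

# argparse calls parse_indent on the raw string taken from the command line.
as_int = parser.parse_args(['--indent', '2']).indent
as_tab = parser.parse_args(['--indent', r'\t']).indent
as_none = parser.parse_args(['--indent', 'None']).indent
```

With this wiring, `--indent '\t'` typed in a shell arrives as a literal backslash followed by `t`, and the converter turns it into a real tab character.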
msg288646 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-02-27 15:53
For discussion on how to implement this, see 

+ https://github.com/python/cpython/pull/201#discussion_r102146742
+ https://github.com/python/cpython/pull/201#discussion_r102840190
+ https://github.com/python/cpython/pull/201#discussion_r102891428

Implementation now moved to https://github.com/python/cpython/issues/345
msg288905 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-03 17:43
The discussion is scattered between different tracker issues and pull requests. Let's continue it in one place.

I don't think we should add too many options for controlling every little detail. json.tool is just a small utility intended mainly for debugging. If you need to control more details, it is not hard to write a simple Python script.

For example, for "compact" output:

$ python3 -c "import sys, json; json.dump(json.load(sys.stdin), sys.stdout, separators=(',', ':'))"
msg288918 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-03-03 19:20
To recap the discussion from https://git.io/vyCY8: there are three potential mutually exclusive command line options that have been suggested. They are as follows.

```python
import json
obj = [1, 2]

print('--indent=4')
print(json.dumps(obj, indent=4))

print('--no-indent')
print(json.dumps(obj, indent=None))

print('--compact')
print(json.dumps(obj, separators=(',', ':')))
```

which produces the following output:


```
--indent=4
[
    1,
    2
]
--no-indent
[1, 2]
--compact
[1,2]
```

Currently, https://github.com/python/cpython/pull/345 has implemented --indent and --no-indent. One suggestion was to replace --no-indent with --compact, but that would prevent json.tool from outputting YAML < 1.2 compatible JSON. Therefore, the main question is whether to add --compact.

There is a clear use case for --compact. However, it requires a bit more "logic" as there is no `compact` argument in json.dump. Therefore @serhiy.storchaka suggests not adding --compact to keep json.tool lightweight.

I disagree that json.tool is "mainly for debugging". I encounter lots of applications where JSON needs to be reformatted, so I see json.tool as a cross-platform command line utility.

However, I am more concerned that the JSON API may change, especially with respect to the compact encoding (see http://bugs.python.org/issue29540). Therefore, it seems prudent to let the API evolve and later revisit whether a --compact argument makes sense.

The danger of adding --compact now would be if json.dump later adopts an argument for compact output that is not named compact. Then aligning the json.tool and json.dump terminology could require backwards incompatible changes.
msg289712 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-16 13:15
I'm not particularly interested in this feature. Adding two or three options looks excessive. Python is a programming language and it is easy to write a simple script for your needs. Much easier than implementing a general command line interface that supports all options of the programming API.
msg289713 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-16 13:35
Easier, but if we do it in the tool, then it is done for everyone and they don't *each* have to spend that "less time" writing their own script. And --indent and --compact are both useful for debugging/hand testing, since they allow you to generate the input your code is expecting, and input your code might not be expecting.

On the other hand, I'm not contributing much these days, so I'm not in a good position to be suggesting additions to the support burden :(.
msg289714 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-03-16 14:00
@serhiy.storchaka I totally understand the desire to keep json.tool simple. However, given the description of json.tool in the documentation (below), I think an indentation option is within scope:

> The json.tool module provides a simple command line interface to validate and pretty-print JSON objects.

Indentation/newlines are a fundamental aspect of "pretty-printing". Right now I rarely use json.tool, since indent=4 is too extreme from a visual and file size perspective. Instead I prefer `indent=2` (or even `indent=1`) and I now have to:

1. create a python script to set my desired input
2. make sure every environment has access to this python script (the real annoyance)
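
As a rough sketch of the kind of script step 1 refers to (the function name `reindent` and the in-memory demo are assumptions for illustration, not code from the pull request), a reformatter with a smaller indent only needs json.load and json.dump:

```python
import io
import json


def reindent(infile, outfile, indent=2):
    """Load JSON from infile and write it to outfile with the given indent."""
    json.dump(json.load(infile), outfile, indent=indent)
    outfile.write('\n')


# Demo on an in-memory document instead of real files.
buf = io.StringIO()
reindent(io.StringIO('{"a": [1, 2]}'), buf)
result = buf.getvalue()
```

On the command line, the same function would be called as `reindent(sys.stdin, sys.stdout)`. The script itself is short; the cost is distributing it to every environment.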

Currently, json.tool has a --sort-keys argument, which I think is great. --sort-keys is also an essential feature from my perspective (bpo-21650). So in short, I think json.tool is close to being a really useful utility but falls a bit short without an indentation option.

Given that json.tool exists, I think it makes sense to take steps to make sure it's actually relevant as a json reformatter. Given this motivation, I'm not opposed to adding --compact, if we're confident it will be forward compatible with the json.dump API.
msg289715 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-16 14:07
It's not just the support burden. It is also the burden of learning and remembering a new API.

The support burden itself is not tiny either. It includes careful design, writing the implementation and tests (the more options you have, the more combinations you need to test), increased running time of the tests (CLI tests are slow!), and the need to return to them after every new feature is added to test compatibility with it.
msg289725 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2017-03-16 18:21
Probably the best thing we could do here is to mirror the options available in similar tools, such as jq: https://stedolan.github.io/jq/manual/#Invokingjq

The relevant options here would be:

    --indent
    --tab
    --compact-output
    --sort-keys

The default indent in jq is 2, which I tend to prefer these days, but maybe 4 is still appropriate given PEP 8:

    $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq
    [
      {},
      {
        "a": "b"
      },
      2,
      3,
      4
    ]


This is how jq interprets --compact-output:

    $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq --compact-output
    [{},{"a":"b"},2,3,4]


I do not think that it's worth having the command-line tool cater to people that want to indent in other ways (e.g. using a string that isn't all spaces or a single tab).
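
To make the mapping concrete, here is a hedged sketch of how jq-style choices could translate into json.dumps keyword arguments. The helper `dump_kwargs` and its parameter names are hypothetical; only `indent`, `separators`, and `sort_keys` are real json.dumps parameters.

```python
import json


def dump_kwargs(indent=4, tab=False, compact=False, sort_keys=False):
    """Translate jq-style choices into keyword arguments for json.dumps.

    The parameter names are hypothetical and only mirror the jq flags.
    """
    kwargs = {'sort_keys': sort_keys}
    if compact:
        # No newlines and no whitespace after separators.
        kwargs['indent'] = None
        kwargs['separators'] = (',', ':')
    elif tab:
        # A single tab per nesting level.
        kwargs['indent'] = '\t'
    else:
        # A number of spaces per nesting level.
        kwargs['indent'] = indent
    return kwargs


doc = [{}, {"a": "b"}, 2, 3, 4]
compact_out = json.dumps(doc, **dump_kwargs(compact=True))
two_space_out = json.dumps(doc, **dump_kwargs(indent=2))
tab_out = json.dumps([1], **dump_kwargs(tab=True))
```

For the sample document above, `compact_out` and `two_space_out` should match the two jq outputs shown earlier in this message.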
msg291630 - (view) Author: Daniel Himmelstein (dhimmel) * Date: 2017-04-13 19:13
@bob.ippolito thanks for pointing to jq as a reference implementation. I updated the pull request (https://git.io/vS9o8) to implement all of the relevant options. Currently, the PR supports the following mutually exclusive arguments:

--indent
--no-indent
--tab
--compact

These additions took 16 new lines of code in tool.py and 41 new lines of tests. However, I am happy to refactor the tests to be less repetitive if we choose to go forward with these changes.
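
For readers who have not opened the PR, the following is a minimal sketch of how four mutually exclusive arguments of this kind can be declared with argparse. It is an assumption-laden illustration (the `prog` name, defaults, and `dest` sharing are choices made here), not a copy of the code in the pull request.

```python
import argparse

# Hypothetical parser; only the whitespace-related options are sketched.
parser = argparse.ArgumentParser(prog='json-tool-sketch')
group = parser.add_mutually_exclusive_group()
# --indent, --no-indent and --tab all write to the same "indent" attribute.
group.add_argument('--indent', type=int, default=4)
group.add_argument('--no-indent', dest='indent', action='store_const',
                   const=None)
group.add_argument('--tab', dest='indent', action='store_const', const='\t')
# --compact is a separate flag because it changes separators, not indent.
group.add_argument('--compact', action='store_true')

default_args = parser.parse_args([])
tab_args = parser.parse_args(['--tab'])
no_indent_args = parser.parse_args(['--no-indent'])
compact_args = parser.parse_args(['--compact'])

# Combining two options from the group makes argparse exit with an error.
try:
    parser.parse_args(['--indent', '2', '--compact'])
    conflict_rejected = False
except SystemExit:
    conflict_rejected = True
```

Because the group is mutually exclusive, argparse itself rejects combinations such as `--indent 2 --compact`, so the tool does not need its own conflict checks.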

@serhiy.storchaka I took a maximalist approach with respect to adding indentation options to GH #345. Although I know not all of the options may get merged, I thought we might as well work backwards.

However, the more I think about it, I do think every option above is a unique and valuable addition. I think that even with the changes, json.tool remains a lightweight wrapper around json.load + json.dump.
History
Date User Action Args
2017-04-13 19:13:20 dhimmel set messages: + msg291630
2017-03-16 18:21:20 bob.ippolito set messages: + msg289725
2017-03-16 14:07:04 serhiy.storchaka set messages: + msg289715
2017-03-16 14:00:32 dhimmel set messages: + msg289714
2017-03-16 13:35:01 r.david.murray set nosy: + r.david.murray; messages: + msg289713
2017-03-16 13:15:22 serhiy.storchaka set nosy: + rhettinger, bob.ippolito, ezio.melotti; messages: + msg289712
2017-03-03 19:20:13 dhimmel set messages: + msg288918
2017-03-03 17:43:47 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg288905
2017-03-03 13:23:01 Serhiy Int set pull_requests: + pull_request355
2017-02-27 15:53:06 dhimmel set messages: + msg288646
2017-02-27 15:50:17 dhimmel set pull_requests: - pull_request230
2017-02-27 15:37:42 dhimmel set pull_requests: + pull_request298
2017-02-23 20:05:56 dhimmel create