Specifying indent in the json.tool command #73822

Closed
dhimmel mannequin opened this issue Feb 23, 2017 · 17 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@dhimmel
Mannequin

dhimmel mannequin commented Feb 23, 2017

BPO 29636
Nosy @rhettinger, @etrepum, @ezio-melotti, @bitdancer, @methane, @serhiy-storchaka, @dhimmel, @flavianh
PRs
  • bpo-29636: Add --indent / --no-indent arguments to json.tool #345
  • Allow setting indent width or character in json.tool #426
  • bpo-30971: Improve code readability of json.tool #2720
  • bpo-29636: improve CLI of json.tool #9765
  • bpo-29636: json.tool: Add document for indentation options. #17482
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2019-12-04.06:17:12.983>
    created_at = <Date 2017-02-23.20:05:56.924>
    labels = ['type-feature', 'library', '3.9']
    title = 'Specifying indent in the json.tool command'
    updated_at = <Date 2019-12-07.14:14:45.945>
    user = 'https://github.com/dhimmel'

    bugs.python.org fields:

    activity = <Date 2019-12-07.14:14:45.945>
    actor = 'methane'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-12-04.06:17:12.983>
    closer = 'methane'
    components = ['Library (Lib)']
    creation = <Date 2017-02-23.20:05:56.924>
    creator = 'dhimmel'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29636
    keywords = ['patch']
    message_count = 17.0
    messages = ['288479', '288646', '288905', '288918', '289712', '289713', '289714', '289715', '289725', '291630', '312548', '312570', '348543', '348740', '348892', '357775', '357976']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'bob.ippolito', 'ezio.melotti', 'r.david.murray', 'methane', 'serhiy.storchaka', 'dhimmel', 'flavianhautbois']
    pr_nums = ['345', '426', '2720', '9765', '17482']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue29636'
    versions = ['Python 3.9']

    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Feb 23, 2017

    The utility of python -m json.tool would increase if users could specify the indent level.

    Example use case: newlines in a JSON document are important for readability and the ability to open in a text editor. However, if the file is large, you can save space by decreasing the indent level.
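To make the space argument concrete, here is a minimal sketch that compares encoded sizes for several layouts. The sample document and the `encoded_sizes` helper are hypothetical illustrations, not part of the proposal; only `json.dumps` and its `indent` and `separators` parameters come from the standard library.

```python
import json

# Hypothetical sample document; any nested JSON-serializable object works.
doc = {
    "records": [
        {"id": i, "tags": ["a", "b"], "meta": {"ok": True}} for i in range(100)
    ]
}


def encoded_sizes(obj):
    """Return the UTF-8 byte size of obj serialized with several layouts."""
    layouts = {
        "indent=4": {"indent": 4},
        "indent=2": {"indent": 2},
        "indent=1": {"indent": 1},
        "no indent": {"indent": None},
        "compact": {"indent": None, "separators": (",", ":")},
    }
    return {
        name: len(json.dumps(obj, **kwargs).encode("utf-8"))
        for name, kwargs in layouts.items()
    }


sizes = encoded_sizes(doc)
for name, size in sizes.items():
    print(f"{name:>9}: {size} bytes")
```

Smaller indents keep the one-item-per-line layout that text editors handle well while trimming the leading whitespace that dominates deeply nested files.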

    I added an --indent argument to json.tool in #201. However, design discussion is required since indent can take an int, string, or None. In addition, one useful indent string is a tab, which is difficult to pass via a command line argument.

    Currently, I added the following function to convert the indent option to the indent parameter of json.dump:

    def parse_indent(indent):
        """Parse the argparse indent argument."""
        if indent == 'None':
            return None
        if indent == r'\t':
            return '\t'
        try:
            return int(indent)
        except ValueError:
            return indent
    

    @inada.naoki mentioned the special casing is undesirable. I agree, but can't think of an alternative. Advice appreciated.
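For context, the function above could be plugged into argparse roughly as follows. This is a sketch under assumptions: the `build_parser` and `reformat` helpers and the `json-indent-demo` program name are hypothetical, and this is not necessarily how the pull request wires things up.

```python
import argparse
import json


def parse_indent(indent):
    """Parse the argparse indent argument (same logic as the function above)."""
    if indent == 'None':
        return None
    if indent == r'\t':
        return '\t'
    try:
        return int(indent)
    except ValueError:
        return indent


def build_parser():
    # Hypothetical parser; the real json.tool parser has more arguments.
    parser = argparse.ArgumentParser(prog="json-indent-demo")
    parser.add_argument(
        "--indent", type=parse_indent, default=4,
        help="an integer, a literal string, 'None', or \\t for a tab",
    )
    return parser


def reformat(text, argv):
    """Re-serialize JSON text using the indent parsed from argv."""
    args = build_parser().parse_args(argv)
    return json.dumps(json.loads(text), indent=args.indent)


print(reformat('[1, 2]', ["--indent", "2"]))
print(reformat('[1, 2]', ["--indent", "None"]))
```

The special cases are visible here: the strings `None` and `\t` cannot be used as literal indent strings, which is the drawback raised in review.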

    @dhimmel dhimmel mannequin added topic-IO 3.7 (EOL) end of life type-feature A feature request or enhancement labels Feb 23, 2017
    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Feb 27, 2017

    For discussion on how to implement this, see

    + #201 (comment)
    + #201 (comment)
    + #201 (comment)

    Implementation now moved to #345

    @serhiy-storchaka
    Member

    The discussion is scattered between different tracker issues and pull requests. Let's continue it in one place.

    I don't think we should add too many options for controlling every little detail. json.tool is just a small utility intended mainly for debugging. If you need to control more details, it is not hard to write a simple Python script.

    For example, for "compact" output:

    $ python3 -c "import sys, json; json.dump(json.load(sys.stdin), sys.stdout, separators=(',', ':'))"

    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Mar 3, 2017

    To recap the discussion from https://git.io/vyCY8: there are three potential mutually exclusive command line options that have been suggested. They are as follows.

    import json
    obj = [1, 2]
    
    print('--indent=4')
    print(json.dumps(obj, indent=4))
    
    print('--no-indent')
    print(json.dumps(obj, indent=None))
    
    print('--compact')
    print(json.dumps(obj, separators=(',', ':')))

    which produces the following output:

    --indent=4
    [
        1,
        2
    ]
    --no-indent
    [1, 2]
    --compact
    [1,2]
    

    Currently, #345 has implemented --indent and --no-indent. One suggestion was to replace --no-indent with --compact, but that would prevent json.tool from outputting YAML < 1.2 compatible JSON. Therefore, the main question is whether to add --compact or not?

    There is a clear use case for --compact. However, it requires a bit more "logic" as there is no compact argument in json.dump. Therefore @serhiy.storchaka suggests not adding --compact to keep json.tool lightweight.

    I disagree that json.tool is "mainly for debugging". I encounter lots of applications where JSON needs to be reformatted, so I see json.tool as a cross-platform command line utility.

    However, I am more concerned that the JSON API may change, especially with respect to the compact encoding (see http://bugs.python.org/issue29540). Therefore, it seems prudent to let the API evolve and later revisit whether a --compact argument makes sense.

    The danger of adding --compact now would be if json.dump later adopts an argument for compact output that is not named compact. Then aligning the json.tool and json.dump terminology could require backwards incompatible changes.

    @serhiy-storchaka
    Member

    I'm not particularly interested in this feature. Adding two or three options looks excessive. Python is a programming language and it is easy to write a simple script for your needs. Much easier than implementing a general command line interface that supports all options of the programming API.

    @bitdancer
    Member

    Easier, but if we do it in the tool, then it is done for everyone and they don't *each* have to spend that "less time" writing their own script. And --indent and --compact are both useful for debugging/hand testing, since it allows you to generate the input your code is expecting, and input your code might not be expecting.

    On the other hand, I'm not contributing much these days, so I'm not in a good position to be suggesting additions to the support burden :(.

    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Mar 16, 2017

    @serhiy.storchaka I totally understand the desire to keep json.tool simple. However, given the description of json.tool in the documentation (below), I think an indentation option is within scope:

    The json.tool module provides a simple command line interface to validate and pretty-print JSON objects.

    Indentation/newlines are a fundamental aspect of "pretty-printing". Right now I rarely use json.tool, since indent=4 is too extreme from a visual and file size perspective. Instead I prefer indent=2 (or even indent=1) and I now have to:

    1. create a python script to set my desired input
    2. make sure every environment has access to this python script (the real annoyance)

    Currently, json.tool has a --sort-keys argument, which I think is great. --sort-keys is also an essential feature from my perspective (bpo-21650). So in short, I think json.tool is close to being a really useful utility but falls a bit short without an indentation option.

    Given that json.tool exists, I think it makes sense to take steps to make sure it's actually relevant as a json reformatter. Given this motivation, I'm not opposed to adding --compact, if we're confident it will be forward compatible with the json.dump API.

    @serhiy-storchaka
    Member

    It's not just the support burden. It is also the burden of learning and remembering a new API.

    The support burden itself is not tiny either. It includes careful design, writing the implementation and tests (the more options you have, the more combinations you need to test), increased running time of tests (CLI tests are slow!), and the need to return to those options after adding every new feature to test compatibility with them.

    @etrepum
    Mannequin

    etrepum mannequin commented Mar 16, 2017

    Probably the best thing we could do here is to mirror the options available in similar tools, such as jq: https://stedolan.github.io/jq/manual/#Invokingjq

    The relevant options here would be:

    --indent
    --tab
    --compact-output
    --sort-keys
    

    The default indent in jq is 2, which I tend to prefer these days, but maybe 4 is still appropriate given PEP-8:

        $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq
        [
          {},
          {
            "a": "b"
          },
          2,
          3,
          4
        ]

    This is how jq interprets --compact-output:

        $ echo '[{}, {"a": "b"}, 2, 3, 4]' | jq --compact-output
        [{},{"a":"b"},2,3,4]

    I do not think that it's worth having the command-line tool cater to people that want to indent in other ways (e.g. using a string that isn't all spaces or a single tab).
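For comparison, the jq options listed above map onto existing `json.dumps` parameters. The sketch below is an illustration of that mapping, not a proposed implementation; the variable names are arbitrary.

```python
import json

obj = json.loads('[{}, {"a": "b"}, 2, 3, 4]')

# jq --indent 2 (jq's default): two-space indentation
pretty = json.dumps(obj, indent=2)

# jq --tab: one tab per nesting level
tabbed = json.dumps(obj, indent="\t")

# jq --compact-output: no whitespace after separators
compact = json.dumps(obj, separators=(",", ":"))

# jq --sort-keys: keys of each object emitted in sorted order
sorted_keys = json.dumps({"b": 1, "a": 2}, sort_keys=True)

print(pretty)
print(compact)
print(sorted_keys)
```

Because each option corresponds to a single keyword argument, mirroring jq would not require new encoder logic in the json module itself.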

    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Apr 13, 2017

    @bob.ippolito thanks for pointing to jq as a reference implementation. I updated the pull request (https://git.io/vS9o8) to implement all of the relevant options. Currently, the PR supports the following mutually exclusive arguments:

    --indent
    --no-indent
    --tab
    --compact

    These additions took 16 new lines of code in tool.py and 41 new lines of tests. However, I am happy to refactor the tests to be less repetitive if we choose to go forward with these changes.

    @serhiy.storchaka I took a maximalist approach with respect to adding indentation options to GH #345. Although I know not all of the options may get merged, I thought we might as well work backwards.

    However, the more I think about it, I do think every option above is a unique and valuable addition. I think that even with the changes, json.tool remains a lightweight wrapper around json.load + json.dump.
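The four mutually exclusive options described in this comment could be wired up roughly as follows. This is a sketch under assumptions, not the code from the pull request: the `build_parser` and `reformat` helpers and the `json-tool-sketch` program name are hypothetical.

```python
import argparse
import json


def build_parser():
    parser = argparse.ArgumentParser(prog="json-tool-sketch")  # hypothetical name
    group = parser.add_mutually_exclusive_group()
    # --indent, --tab and --no-indent all write to the same destination,
    # so the rest of the code only has to look at args.indent.
    group.add_argument("--indent", type=int, default=4,
                       help="indent with this many spaces")
    group.add_argument("--tab", action="store_const", dest="indent",
                       const="\t", help="indent with tabs")
    group.add_argument("--no-indent", action="store_const", dest="indent",
                       const=None, help="single line, default separators")
    group.add_argument("--compact", action="store_true",
                       help="single line, no whitespace after separators")
    return parser


def reformat(text, argv):
    """Re-serialize JSON text according to the whitespace options in argv."""
    args = build_parser().parse_args(argv)
    dump_kwargs = {"indent": args.indent}
    if args.compact:
        # json.dumps has no "compact" flag; it is expressed via separators.
        dump_kwargs = {"indent": None, "separators": (",", ":")}
    return json.dumps(json.loads(text), **dump_kwargs)


for argv in ([], ["--indent", "2"], ["--no-indent"], ["--compact"]):
    print(argv, repr(reformat("[1, 2]", argv)))
```

Using a mutually exclusive group avoids the string special-casing of the earlier `parse_indent` approach: each layout gets its own flag, and argparse rejects conflicting combinations.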

    @methane methane added stdlib Python modules in the Lib dir 3.8 only security fixes and removed topic-IO 3.7 (EOL) end of life labels Feb 22, 2018
    @methane
    Member

    methane commented Feb 22, 2018

    I'm OK with the options in the current pull request.
    And I don't think this bike-shedding discussion is important enough to spend our time and energy on.

    Does anyone have a strong opinion?
    If not, I'll merge the PR.

    @serhiy-storchaka
    Member

    I don't think this PR should be merged. It adds too many options. I think that if one needs more control than the current json.tool command line interface gives, they should use the Python interface. Don't forget that Python is a programming language.

    @flavianh
    Mannequin

    flavianh mannequin commented Jul 27, 2019

    So what do we do about this?

    Two possibilities:

    1. We merge PR 9765 and close PRs 345 and 201, as 9765 seems more straightforward and was already approved. 9765 would need to be resubmitted before merging since its base repo no longer exists; I could do that.
    2. We consider that this is out of scope for Python, and since jq is widely used, it does not make a lot of sense to include it. We should then close all PRs to keep our pull requests clean.

    I suggest going with 2

    @dhimmel
    Mannequin Author

    dhimmel mannequin commented Jul 30, 2019

    Since opening this issue, I've encountered several additional instances where indentation control would have been nice. I don't agree that jq is a sufficient substitute:

    1. jq is generally not pre-installed on systems. For projects where users are guaranteed to have Python installed (but may be on any operating system), it is most straightforward to be able to just use a python command and not have to explain how to install jq on 3 different OSes.

    2. jq does a lot more than prettifying JSON. The simple use case of reformatting JSON (to/from a file or stdin/stdout) can be challenging.

    3. json.tool describes itself as a "simple command line interface to ... pretty-print JSON objects". Indentation is an essential aspect of pretty printing. Why even have this CLI if we're unwilling to add essential basic functionality that is already well supported by the underlying json.dump API?

    Python excels at APIs that match user needs. It seems like we're a small change away from making json.tool flexible enough to satisfy real-world needs.

    So I'm in favor of merging PR 9765 or PR 345. I'm happy to do the work on either to get them mergeable based on whatever specification we can agree on here.

    @rhettinger
    Contributor

    [Serhiy]

    I don't think this PR should be merged. It adds too many options.

    [Daniel]

    Since opening this issue, I've encountered several additional
    instances where indentation control would have been nice.
    I don't agree that jq is a sufficient substitute.

    I'm inclined to agree with Serhiy. While there are some handy command-line calls, they are secondary offshoots to the standard library and tend to be lightweight rather than full featured. They serve various purposes from quick demos to support of development and testing. In general, they aren't intended to be applications unto themselves: we don't document all command line tools in one place, we don't version number them, we don't maintain forums or web pages for them, we typically don't lock in their behaviors with unit tests, we don't guarantee that they will be available across versions, and we periodically change their APIs (for example when switching from getopt to argparse).

    @methane
    Member

    methane commented Dec 4, 2019

    New changeset 0325794 by Inada Naoki (Daniel Himmelstein) in branch 'master':
    bpo-29636: Add --(no-)indent arguments to json.tool (GH-345)
    0325794

    @methane methane added 3.9 only security fixes and removed 3.8 only security fixes labels Dec 4, 2019
    @methane methane closed this as completed Dec 4, 2019
    @methane
    Member

    methane commented Dec 7, 2019

    New changeset 15fb7fa by Inada Naoki (Daniel Himmelstein) in branch 'master':
    bpo-29636: json.tool: Add document for indentation options. (GH-17482)
    15fb7fa
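With these changesets, which the issue labels place in Python 3.9, the new arguments can be exercised from a script as sketched below. The `run_json_tool` helper is a hypothetical wrapper for illustration; only the `--indent` and `--no-indent` flags named in the changeset title are used.

```python
import subprocess
import sys


def run_json_tool(text, *options):
    """Pipe text through `python -m json.tool` with the given options."""
    result = subprocess.run(
        [sys.executable, "-m", "json.tool", *options],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on invalid JSON or bad options
    )
    return result.stdout


# The indentation options require Python 3.9 or later.
if sys.version_info >= (3, 9):
    print(run_json_tool('{"a": [1, 2]}', "--indent", "2"))
    print(run_json_tool('{"a": [1, 2]}', "--no-indent"))
```

On the command line the same thing is `python -m json.tool --indent 2 < input.json`, where `input.json` is a placeholder file name.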

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022