Multiline ast.dump() #82176

Closed
serhiy-storchaka opened this issue Aug 31, 2019 · 8 comments
Labels
3.9 (only security fixes) · stdlib (Python modules in the Lib dir) · type-feature (a feature request or enhancement)

Comments

@serhiy-storchaka
Member

BPO 37995
Nosy @brettcannon, @rhettinger, @terryjreedy, @benjaminp, @serhiy-storchaka, @1st1, @asottile, @tirkarthi
PRs
  • bpo-37995: Add an option to ast.dump() to produce a multiline output. #15631
Files
  • ast_bloop.py: display() is a generic AST tree display pretty printer
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2019-09-09.20:42:49.620>
    created_at = <Date 2019-08-31.09:29:28.418>
    labels = ['type-feature', 'library', '3.9']
    title = 'Multiline ast.dump()'
    updated_at = <Date 2019-09-09.20:42:49.620>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2019-09-09.20:42:49.620>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-09-09.20:42:49.620>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2019-08-31.09:29:28.418>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['48580']
    hgrepos = []
    issue_num = 37995
    keywords = ['patch']
    message_count = 8.0
    messages = ['350913', '350914', '350934', '350962', '350963', '351271', '351296', '351523']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'rhettinger', 'terry.reedy', 'benjamin.peterson', 'serhiy.storchaka', 'yselivanov', 'Anthony Sottile', 'xtreak']
    pr_nums = ['15631']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue37995'
    versions = ['Python 3.9']

    @serhiy-storchaka
    Member Author

    ast.dump() is mainly useful for debugging purposes. Unfortunately, the output is too long and complex even for simple examples. It contains too many nested calls and lists.

    >>> import ast
    >>> node = ast.parse('spam(eggs, "and cheese")')
    >>> print(ast.dump(node))
    Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load()), args=[Name(id='eggs', ctx=Load()), Constant(value='and cheese', kind=None)], keywords=[]))], type_ignores=[])

    It gets worse if more information is included:

    >>> print(ast.dump(node, include_attributes=True))
    Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load(), lineno=1, col_offset=0, end_lineno=1, end_col_offset=4), args=[Name(id='eggs', ctx=Load(), lineno=1, col_offset=5, end_lineno=1, end_col_offset=9), Constant(value='and cheese', kind=None, lineno=1, col_offset=11, end_lineno=1, end_col_offset=23)], keywords=[], lineno=1, col_offset=0, end_lineno=1, end_col_offset=24), lineno=1, col_offset=0, end_lineno=1, end_col_offset=24)], type_ignores=[])

    And for larger examples it is almost unusable.

    I propose making ast.dump() produce multiline indented output by adding an optional "indent" parameter. If it is a non-negative number or a string, the output is formatted with the specified indentation. If it is None (the default), the output is a single line.

    >>> print(ast.dump(node, indent=3))
    Module(
       body=[
          Expr(
             value=Call(
                func=Name(
                   id='spam',
                   ctx=Load()),
                args=[
                   Name(
                      id='eggs',
                      ctx=Load()),
                   Constant(
                      value='and cheese',
                      kind=None)],
                keywords=[]))],
       type_ignores=[])

    Looks better, no?

    I am not sure about the closing parentheses. Should they be attached to the last item (as above) or placed on a separate line (as below)? Or should some heuristic be used to make the output more readable and compact?

    >>> print(ast.dump(node, indent=3))
    Module(
       body=[
          Expr(
             value=Call(
                func=Name(
                   id='spam',
                   ctx=Load()
                ),
                args=[
                   Name(
                      id='eggs',
                      ctx=Load()
                   ),
                   Constant(
                      value='and cheese',
                      kind=None
                   )
                ],
                keywords=[]
             )
          )
       ],
       type_ignores=[]
    )
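The behaviour of the proposed "indent" parameter can be tried out with a short script. This is a minimal sketch, assuming Python 3.9 or later (the version this issue targets). The exact fields printed differ between Python versions, so the script compares outputs with each other instead of hard-coding them.

```python
import ast

node = ast.parse('spam(eggs, "and cheese")')

# indent=None (the default) keeps the single-line form.
one_line = ast.dump(node)
assert one_line == ast.dump(node, indent=None)
assert "\n" not in one_line

# An integer indent means that many spaces per nesting level.
spaces = ast.dump(node, indent=3)

# A string indent is used verbatim for each nesting level.
tabs = ast.dump(node, indent="\t")

# The two results differ only in the indentation unit.
assert tabs.replace("\t", "   ") == spaces

print(spaces)
```

A string indent is handy when the dump is pasted into a file that uses tabs, while an integer is the shorter spelling for the common case.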

    @serhiy-storchaka added the 3.9, stdlib, and type-feature labels Aug 31, 2019
    @serhiy-storchaka
    Member Author

    See also bpo-36287.

    @asottile
    Mannequin

    asottile mannequin commented Sep 1, 2019

    neat, this looks like a similar api to astpretty: https://github.com/asottile/astpretty

    @rhettinger
    Contributor

    FWIW, I wrote a generic AST pretty printer for a personal compiler project (see attached file). Perhaps it can be adapted to the Python AST.

    ## Example input ###############################

    Program(procs=[Procedure(name='FACTORIAL', params=['N'], is_test=False, body=Block(blocknum=0, stmts=[Assign(name=Output(is_bool=False), value=Number(x=1)), Assign(name=Cell(i=0), value=Number(x=1)), Loop(times=Id(name='N'), fixed=False, body=Block(blocknum=1, stmts=[Assign(name=Output(is_bool=False), value=BinOp(value1=Output(is_bool=False), op='x', value2=Cell(i=0))), Assign(name=Cell(i=0), value=BinOp(value1=Cell(i=0), op='+', value2=Number(x=1)))]))]))], calls=[])

    ## Example output ###############################

    Program(
        procs = [
            Procedure(
                name = 'FACTORIAL',
                params = [
                    'N'
                ],
                is_test = False,
                body = Block(
                    blocknum = 0,
                    stmts = [
                        Assign(
                            name = Output(is_bool=False),
                            value = Number(x=1)
                        ),
                        Assign(
                            name = Cell(i=0),
                            value = Number(x=1)
                        ),
                        Loop(
                            times = Id(name='N'),
                            fixed = False,
                            body = Block(
                                blocknum = 1,
                                stmts = [
                                    Assign(
                                        name = Output(is_bool=False),
                                        value = BinOp(
                                            value1 = Output(is_bool=False),
                                            op = 'x',
                                            value2 = Cell(i=0)
                                        )
                                    ),
                                    Assign(
                                        name = Cell(i=0),
                                        value = BinOp(
                                            value1 = Cell(i=0),
                                            op = '+',
                                            value2 = Number(x=1)
                                        )
                                    )
                                ]
                            )
                        )
                    ]
                )
            )
        ],
        calls = []
    )

    @rhettinger
    Contributor

    It would be great if the tool weren't tightly bound to our particular AST and could be used for any hand-rolled AST.
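To make the idea of a printer for a hand-rolled AST concrete, here is a rough sketch. It is not the attached ast_bloop.py, whose code does not appear in this thread. It assumes the nodes are dataclasses, and the node classes `Number`, `Cell`, `Assign` and the function `show_tree` are hypothetical names chosen for illustration. A node whose fields are all primitives stays on one line, in the spirit of the example output above.

```python
from dataclasses import dataclass, fields, is_dataclass

# Hypothetical node classes for a tiny hand-rolled AST.
@dataclass
class Number:
    x: int

@dataclass
class Cell:
    i: int

@dataclass
class Assign:
    name: object
    value: object

def is_leaf(value):
    """A leaf is anything that is neither a list nor a dataclass node."""
    return not isinstance(value, list) and not is_dataclass(value)

def show_tree(node, indent=4, level=0):
    """Return an indented, multiline rendering of a dataclass-based tree."""
    pad = " " * (indent * (level + 1))   # indentation for children
    close = " " * (indent * level)       # indentation for the closer
    if isinstance(node, list):
        if not node:
            return "[]"
        items = [pad + show_tree(item, indent, level + 1) for item in node]
        return "[\n" + ",\n".join(items) + "\n" + close + "]"
    if is_dataclass(node):
        name = type(node).__name__
        values = [(f.name, getattr(node, f.name)) for f in fields(node)]
        if all(is_leaf(v) for _, v in values):
            # Simple node: keep it on a single line.
            return name + "(" + ", ".join(f"{n}={v!r}" for n, v in values) + ")"
        lines = [f"{pad}{n} = {show_tree(v, indent, level + 1)}" for n, v in values]
        return name + "(\n" + ",\n".join(lines) + "\n" + close + ")"
    return repr(node)

print(show_tree(Assign(name=Cell(i=0), value=Number(x=1))))
```

The sketch relies on dataclasses.fields() to discover children, so it only works for trees built that way; a tree with a different node protocol would need its own way to enumerate fields.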

    @terryjreedy
    Member

    Much better. I prefer the first version with closers on the last item.

    @serhiy-storchaka
    Member Author

    It would be great if the tool weren't tightly bound to our particular AST and could be used for any hand-rolled AST.

    I do not think it is possible. Every tree has its specifics: what types of nodes are supported (AST, list and primitive immutable values), what children a node can have (node._fields and node._attributes), how to get a child (getattr(node, name), but it can be absent), what the node name is (node.__class__.__name__), how to represent nodes (constructor, list display and literals for primitives), what options are supported (annotate_fields, include_attributes and indent), and which nodes are simple and which are complex. All these details are different in every AST implementation. Recursively walking the tree is trivial in general; handling the specifics makes every formatting function unique.

    What was interesting in Raymond's implementation is that some simple nodes are written on one line. Its code can't be used with ast.AST because of the different tree structure and the different meaning of simple nodes (and also because nodes with a single child are very rare), but PR 15631 has been updated to produce more compact output by using other heuristics:

    >>> print(ast.dump(node, indent=3))
    Module(
       body=[
          Expr(
             value=Call(
                func=Name(id='spam', ctx=Load()),
                args=[
                   Name(id='eggs', ctx=Load()),
                   Constant(value='and cheese', kind=None)],
                keywords=[]))],
       type_ignores=[])
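The ast-specific details listed above (node._fields, getattr() with possibly absent fields, the class name, lists and primitives) can be sketched in a small recursive formatter. This is an illustration only, not the code from PR 15631: the function name `sketch_dump` is hypothetical, it ignores annotate_fields and include_attributes, and its compactness rule (a node goes on one line when all of its children are primitives or field-less nodes and it has at most three fields) is an assumption chosen to resemble the output shown above.

```python
import ast

def sketch_dump(node, indent="   ", level=0):
    """Return (text, is_simple) for an ast.AST node, a list, or a primitive."""
    level += 1
    pad = "\n" + indent * level
    sep = "," + pad
    if isinstance(node, ast.AST):
        parts = []
        all_simple = True
        for name in node._fields:
            try:
                value = getattr(node, name)
            except AttributeError:
                continue  # a declared field may be absent on a node
            text, simple = sketch_dump(value, indent, level)
            all_simple = all_simple and simple
            parts.append(f"{name}={text}")
        cls_name = type(node).__name__
        if all_simple and len(parts) <= 3:
            # Compact form; only a field-less node counts as simple
            # from the point of view of its parent.
            return f"{cls_name}({', '.join(parts)})", not parts
        return f"{cls_name}({pad}{sep.join(parts)})", False
    if isinstance(node, list):
        if not node:
            return "[]", True
        items = [sketch_dump(item, indent, level)[0] for item in node]
        return f"[{pad}{sep.join(items)}]", False
    return repr(node), True  # primitives are always simple

text, _ = sketch_dump(ast.parse("spam(eggs)"))
print(text)
```

Returning a flag alongside the text lets each parent decide its own layout without walking the subtree a second time.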

    @serhiy-storchaka
    Member Author

    New changeset 850573b by Serhiy Storchaka in branch 'master':
    bpo-37995: Add an option to ast.dump() to produce a multiline output. (GH-15631)
    850573b

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022