classification
Title: Multiline ast.dump()
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Anthony Sottile, benjamin.peterson, brett.cannon, rhettinger, serhiy.storchaka, terry.reedy, xtreak, yselivanov
Priority: normal
Keywords: patch

Created on 2019-08-31 09:29 by serhiy.storchaka, last changed 2019-09-09 20:42 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
ast_bloop.py rhettinger, 2019-09-02 03:56 display() is a generic AST tree display pretty printer
Pull Requests
URL Status Linked
PR 15631 merged serhiy.storchaka, 2019-08-31 10:37
Messages (8)
msg350913 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-08-31 09:29
ast.dump() is mainly useful for debugging purposes. Unfortunately the output is too long and complex even for simple examples. It contains too many nested calls and lists.

>>> import ast
>>> node = ast.parse('spam(eggs, "and cheese")')
>>> print(ast.dump(node))
Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load()), args=[Name(id='eggs', ctx=Load()), Constant(value='and cheese', kind=None)], keywords=[]))], type_ignores=[])

It is worse if more information is included:

>>> print(ast.dump(node, include_attributes=True))
Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load(), lineno=1, col_offset=0, end_lineno=1, end_col_offset=4), args=[Name(id='eggs', ctx=Load(), lineno=1, col_offset=5, end_lineno=1, end_col_offset=9), Constant(value='and cheese', kind=None, lineno=1, col_offset=11, end_lineno=1, end_col_offset=23)], keywords=[], lineno=1, col_offset=0, end_lineno=1, end_col_offset=24), lineno=1, col_offset=0, end_lineno=1, end_col_offset=24)], type_ignores=[])

And for larger examples it is almost unusable.

I propose making ast.dump() produce multiline indented output by adding an optional "indent" parameter. If it is a non-negative number or a string, the output is formatted with the specified indentation. If it is None (the default), the output is a single line.

>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(
               id='spam',
               ctx=Load()),
            args=[
               Name(
                  id='eggs',
                  ctx=Load()),
               Constant(
                  value='and cheese',
                  kind=None)],
            keywords=[]))],
   type_ignores=[])

Looks better, no?
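
The accepted forms of "indent" can be compared side by side. This is a minimal sketch that assumes an interpreter which already includes the change (Python 3.9 or later, per the Versions field); the variable names are illustrative only, and the exact fields printed can differ between Python versions.

```python
import ast

node = ast.parse('spam(eggs, "and cheese")')

# Default (indent=None): the whole tree is rendered on a single line.
flat = ast.dump(node)

# An integer indent means that many spaces per nesting level.
by_int = ast.dump(node, indent=3)

# A string indent is repeated once per nesting level.
by_str = ast.dump(node, indent="   ")
tabbed = ast.dump(node, indent="\t")

print(by_int)
```

Passing 3 and passing a string of three spaces should give identical output, since a numeric indent is just shorthand for that many spaces.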

I am not sure about the closing parentheses. Should they be attached to the last item (as above) or placed on a separate line (as below)? Or should some heuristic be used to make the output more readable and compact?

>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(
               id='spam',
               ctx=Load()
            ),
            args=[
               Name(
                  id='eggs',
                  ctx=Load()
               ),
               Constant(
                  value='and cheese',
                  kind=None
               )
            ],
            keywords=[]
         )
      )
   ],
   type_ignores=[]
)
msg350914 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-08-31 09:30
See also issue36287.
msg350934 - Author: Anthony Sottile (Anthony Sottile) * Date: 2019-09-01 01:51
neat, this looks like a similar api to astpretty: https://github.com/asottile/astpretty
msg350962 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-09-01 20:24
FWIW, I wrote a generic AST pretty printer for a personal compiler project (see attached file).  Perhaps it can be adapted to the Python AST.

## Example input ###############################

Program(procs=[Procedure(name='FACTORIAL', params=['N'], is_test=False, body=Block(blocknum=0, stmts=[Assign(name=Output(is_bool=False), value=Number(x=1)), Assign(name=Cell(i=0), value=Number(x=1)), Loop(times=Id(name='N'), fixed=False, body=Block(blocknum=1, stmts=[Assign(name=Output(is_bool=False), value=BinOp(value1=Output(is_bool=False), op='x', value2=Cell(i=0))), Assign(name=Cell(i=0), value=BinOp(value1=Cell(i=0), op='+', value2=Number(x=1)))]))]))], calls=[])

## Example output ###############################

Program(
   procs = [
      Procedure(
         name = 'FACTORIAL',
         params = [
            'N'
         ],
         is_test = False,
         body = Block(
            blocknum = 0,
            stmts = [
               Assign(
                  name = Output(is_bool=False),
                  value = Number(x=1)
               ),
               Assign(
                  name = Cell(i=0),
                  value = Number(x=1)
               ),
               Loop(
                  times = Id(name='N'),
                  fixed = False,
                  body = Block(
                     blocknum = 1,
                     stmts = [
                        Assign(
                           name = Output(is_bool=False),
                           value = BinOp(
                              value1 = Output(is_bool=False),
                              op = 'x',
                              value2 = Cell(i=0)
                           )
                        ),
                        Assign(
                           name = Cell(i=0),
                           value = BinOp(
                              value1 = Cell(i=0),
                              op = '+',
                              value2 = Number(x=1)
                           )
                        )
                     ]
                  )
               )
            ]
         )
      )
   ],
   calls = []
)
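
For readers without the attachment, the layout above (leaf nodes on one line, "name = value" pairs, closers on their own line) can be approximated in a few lines. This is a hypothetical sketch, not the attached ast_bloop.py: it assumes the hand-rolled nodes are dataclasses, and the Number, Cell and Assign classes and this display() function are stand-ins invented for illustration.

```python
from dataclasses import dataclass, fields, is_dataclass


# Stand-in node classes (assumption: the hand-rolled AST uses dataclasses).
@dataclass
class Number:
    x: int


@dataclass
class Cell:
    i: int


@dataclass
class Assign:
    name: object
    value: object


def display(node, indent="   ", level=0):
    """Format a dataclass tree with closers on their own line."""
    pad = indent * (level + 1)   # indentation of children
    close = indent * level       # indentation of the closing bracket
    if is_dataclass(node) and not isinstance(node, type):
        items = [(f.name, getattr(node, f.name)) for f in fields(node)]
        # Leaf nodes (only primitive field values) stay on one line.
        if all(not is_dataclass(v) and not isinstance(v, list) for _, v in items):
            inner = ", ".join(f"{k}={v!r}" for k, v in items)
            return f"{type(node).__name__}({inner})"
        lines = [f"{pad}{k} = {display(v, indent, level + 1)}" for k, v in items]
        return f"{type(node).__name__}(\n" + ",\n".join(lines) + f"\n{close})"
    if isinstance(node, list):
        if not node:
            return "[]"
        lines = [pad + display(v, indent, level + 1) for v in node]
        return "[\n" + ",\n".join(lines) + f"\n{close}]"
    # Anything else is treated as a primitive and shown as a literal.
    return repr(node)


print(display(Assign(name=Cell(i=0), value=Number(x=1))))
```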
msg350963 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-09-01 20:26
It would be great if the tool wasn't tightly bound to our particular AST and could be used for any hand-rolled AST.
msg351271 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-09-06 22:07
Much better.  I prefer the first version with closers on the last item.
msg351296 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-09-07 08:57
> It would be great if the tool wasn't tightly bound to our particular AST and could be used for any hand-rolled AST.

I do not think it is possible. Every tree has its specifics: what types of nodes are supported (AST, list and primitive immutable values), what children a node can have (node._fields and node._attributes), how to get a child (getattr(node, name), although it can be absent), what the node name is (node.__class__.__name__), how to represent nodes (constructor call, list display and literals for primitives), what options are supported (annotate_fields, include_attributes and indent), and which nodes are simple and which are complex. All these details differ in every AST implementation. Recursively walking the tree is trivial in general; handling the specifics is what makes every formatting function unique.
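
To make those specifics concrete, here is a simplified walker that hard-codes the Python AST conventions listed above. It is a hypothetical sketch rather than the code from PR 15631: sketch_dump is an invented name, and annotate_fields, include_attributes and the handling of node._attributes are deliberately left out.

```python
import ast


def sketch_dump(node, indent="   ", level=0):
    """Simplified multiline formatter tied to ast.AST conventions."""
    pad = indent * (level + 1)
    sep = ",\n" + pad
    if isinstance(node, ast.AST):
        # The node name comes from the class; children come from _fields.
        parts = []
        for name in node._fields:
            try:
                value = getattr(node, name)
            except AttributeError:
                continue  # a declared field may be absent on this instance
            parts.append(f"{name}={sketch_dump(value, indent, level + 1)}")
        cls = type(node).__name__
        if not parts:
            return f"{cls}()"
        return f"{cls}(\n{pad}{sep.join(parts)})"
    if isinstance(node, list):
        if not node:
            return "[]"
        items = [sketch_dump(item, indent, level + 1) for item in node]
        return f"[\n{pad}{sep.join(items)}]"
    # Primitive immutable values are shown as literals.
    return repr(node)


tree = ast.parse('spam(eggs, "and cheese")')
text = sketch_dump(tree)
print(text)
```

Every branch depends on a convention of ast.AST (the _fields tuple, lists as containers, repr() for constants), which is why the same function would not carry over to a differently shaped tree without changes.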

What was interesting in Raymond's first implementation is that some simple nodes are written on one line. Its code can't be used with ast.AST because of the different tree structure and the different meaning of simple nodes (nodes with a single child are also very rare), but PR 15631 has been updated to produce more compact output by using other heuristics:

>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(id='spam', ctx=Load()),
            args=[
               Name(id='eggs', ctx=Load()),
               Constant(value='and cheese', kind=None)],
            keywords=[]))],
   type_ignores=[])
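
The heuristics are not spelled out in this message. One rule that would produce the output above, assumed here purely for illustration, is: a node is written on one line when it has at most three fields and every field value is "simple" (a primitive, an empty list, or a node with no fields). The function name compact_dump and the threshold of three are assumptions, not details taken from the PR.

```python
import ast


def compact_dump(node, indent="   ", level=0):
    """Return (text, is_simple); hypothetical sketch of a compacting rule."""
    pad = indent * (level + 1)
    sep = ",\n" + pad
    if isinstance(node, ast.AST):
        parts = []
        all_simple = True
        for name in node._fields:
            try:
                value = getattr(node, name)
            except AttributeError:
                continue  # skip fields missing on this instance
            text, simple = compact_dump(value, indent, level + 1)
            all_simple = all_simple and simple
            parts.append(f"{name}={text}")
        cls = type(node).__name__
        if all_simple and len(parts) <= 3:
            # One-line form; only a node without any fields is itself simple.
            return f"{cls}({', '.join(parts)})", not parts
        return f"{cls}(\n{pad}{sep.join(parts)})", False
    if isinstance(node, list):
        if not node:
            return "[]", True
        items = [compact_dump(item, indent, level + 1)[0] for item in node]
        return f"[\n{pad}{sep.join(items)}]", False
    # Primitives are always simple.
    return repr(node), True


tree = ast.parse('spam(eggs, "and cheese")')
compact_text, _ = compact_dump(tree)
print(compact_text)
```

Under this rule Name(id='spam', ctx=Load()) collapses to one line because both of its values are simple, while Call stays expanded because its func child has fields of its own.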
msg351523 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-09-09 16:33
New changeset 850573b836d5b82d1a1ebe75a635aaa0a3dff997 by Serhiy Storchaka in branch 'master':
bpo-37995: Add an option to ast.dump() to produce a multiline output. (GH-15631)
https://github.com/python/cpython/commit/850573b836d5b82d1a1ebe75a635aaa0a3dff997
History
Date User Action Args
2019-12-01 18:58:56 serhiy.storchaka link issue19541 superseder
2019-09-09 20:42:49 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2019-09-09 16:33:15 serhiy.storchaka set messages: + msg351523
2019-09-07 09:38:48 serhiy.storchaka link issue38049 dependencies
2019-09-07 08:57:26 serhiy.storchaka set messages: + msg351296
2019-09-06 22:07:48 terry.reedy set nosy: + terry.reedy; messages: + msg351271
2019-09-02 03:56:09 rhettinger set files: + ast_bloop.py
2019-09-02 03:55:42 rhettinger set files: - ast_bloop.py
2019-09-01 20:26:03 rhettinger set messages: + msg350963
2019-09-01 20:24:08 rhettinger set files: + ast_bloop.py; messages: + msg350962
2019-09-01 01:51:19 Anthony Sottile set nosy: + Anthony Sottile; messages: + msg350934
2019-08-31 12:23:37 xtreak set nosy: + xtreak
2019-08-31 10:37:37 serhiy.storchaka set keywords: + patch; stage: patch review; pull_requests: + pull_request15299
2019-08-31 09:30:46 serhiy.storchaka set messages: + msg350914
2019-08-31 09:29:28 serhiy.storchaka create