ast.dump() is mainly useful for debugging. Unfortunately, its output is too long and complex even for simple examples: it contains too many nested calls and lists.
>>> import ast
>>> node = ast.parse('spam(eggs, "and cheese")')
>>> print(ast.dump(node))
Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load()), args=[Name(id='eggs', ctx=Load()), Constant(value='and cheese', kind=None)], keywords=[]))], type_ignores=[])
It gets worse if you include more information:
>>> print(ast.dump(node, include_attributes=True))
Module(body=[Expr(value=Call(func=Name(id='spam', ctx=Load(), lineno=1, col_offset=0, end_lineno=1, end_col_offset=4), args=[Name(id='eggs', ctx=Load(), lineno=1, col_offset=5, end_lineno=1, end_col_offset=9), Constant(value='and cheese', kind=None, lineno=1, col_offset=11, end_lineno=1, end_col_offset=23)], keywords=[], lineno=1, col_offset=0, end_lineno=1, end_col_offset=24), lineno=1, col_offset=0, end_lineno=1, end_col_offset=24)], type_ignores=[])
And for larger examples it is almost unusable.
I propose making ast.dump() produce multiline, indented output by adding an optional "indent" parameter. If it is a non-negative integer or a string, the output is formatted with the specified indentation. If it is None (the default), the output is a single line.
>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(
               id='spam',
               ctx=Load()),
            args=[
               Name(
                  id='eggs',
                  ctx=Load()),
               Constant(
                  value='and cheese',
                  kind=None)],
            keywords=[]))],
   type_ignores=[])
Looks better, no?
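For reference, here is a small sketch of how the proposed parameter is meant to be called. It assumes a Python version in which "indent" is available (3.9 or later); the variable names are only illustrative.

```python
import ast

node = ast.parse('spam(eggs, "and cheese")')

# indent=None (the default): everything on a single line.
flat = ast.dump(node)

# A non-negative integer: that many spaces per nesting level.
spaces = ast.dump(node, indent=3)

# A string: used verbatim as the indentation for each level.
tabs = ast.dump(node, indent="\t")

print(spaces)
```

Passing a string is handy when the dump is pasted somewhere that expects tabs or a visible marker such as "| " for each level.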
I am not sure about the closing parentheses. Should they be attached to the last item (as above) or placed on a separate line (as below)? Or should some heuristic be used to make the output more readable and compact?
>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(
               id='spam',
               ctx=Load()
            ),
            args=[
               Name(
                  id='eggs',
                  ctx=Load()
               ),
               Constant(
                  value='and cheese',
                  kind=None
               )
            ],
            keywords=[]
         )
      )
   ],
   type_ignores=[]
)
|
FWIW, I wrote a generic AST pretty printer for a personal compiler project (see attached file). Perhaps it can be adapted to the Python AST.
## Example input ###############################
Program(procs=[Procedure(name='FACTORIAL', params=['N'], is_test=False, body=Block(blocknum=0, stmts=[Assign(name=Output(is_bool=False), value=Number(x=1)), Assign(name=Cell(i=0), value=Number(x=1)), Loop(times=Id(name='N'), fixed=False, body=Block(blocknum=1, stmts=[Assign(name=Output(is_bool=False), value=BinOp(value1=Output(is_bool=False), op='x', value2=Cell(i=0))), Assign(name=Cell(i=0), value=BinOp(value1=Cell(i=0), op='+', value2=Number(x=1)))]))]))], calls=[])
## Example output ###############################
Program(
    procs = [
        Procedure(
            name = 'FACTORIAL',
            params = [
                'N'
            ],
            is_test = False,
            body = Block(
                blocknum = 0,
                stmts = [
                    Assign(
                        name = Output(is_bool=False),
                        value = Number(x=1)
                    ),
                    Assign(
                        name = Cell(i=0),
                        value = Number(x=1)
                    ),
                    Loop(
                        times = Id(name='N'),
                        fixed = False,
                        body = Block(
                            blocknum = 1,
                            stmts = [
                                Assign(
                                    name = Output(is_bool=False),
                                    value = BinOp(
                                        value1 = Output(is_bool=False),
                                        op = 'x',
                                        value2 = Cell(i=0)
                                    )
                                ),
                                Assign(
                                    name = Cell(i=0),
                                    value = BinOp(
                                        value1 = Cell(i=0),
                                        op = '+',
                                        value2 = Number(x=1)
                                    )
                                )
                            ]
                        )
                    )
                ]
            )
        )
    ],
    calls = []
)
|
> It would be great if the tool wasn't tightly bound to our particular AST and could be used for any hand-rolled AST.
I do not think it is possible. Every tree has its specifics: which types of nodes are supported (AST, list and primitive immutable values), which children a node can have (node._fields and node._attributes), how to get a child (getattr(node, name), although the attribute can be absent), what the node name is (node.__class__.__name__), how to represent nodes (constructor call, list display and literals for primitives), which options are supported (annotate_fields, include_attributes and indent), and which nodes are simple and which are complex. All these details differ in every AST implementation. Recursively walking the tree is trivial in general; handling the specifics is what makes every formatting function unique.
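To make the list of specifics concrete, here is a minimal sketch of a dumper for ast.AST only. It is not the ast.dump() implementation; the function name dump_indented and its parameters are made up for illustration, and it ignores annotate_fields and include_attributes.

```python
import ast


def dump_indented(node, indent="   ", level=0):
    """Sketch of a multiline dumper; closing brackets stay on the last item."""
    pad = "\n" + indent * (level + 1)
    sep = "," + pad
    if isinstance(node, ast.AST):
        parts = []
        # Child names come from _fields; a field may be absent on the instance.
        for name in node._fields:
            try:
                value = getattr(node, name)
            except AttributeError:
                continue
            parts.append(f"{name}={dump_indented(value, indent, level + 1)}")
        cls = node.__class__.__name__
        if not parts:
            return f"{cls}()"
        return f"{cls}({pad}{sep.join(parts)})"
    if isinstance(node, list):
        if not node:
            return "[]"
        items = [dump_indented(item, indent, level + 1) for item in node]
        return f"[{pad}{sep.join(items)}]"
    # Primitive immutable values are shown as literals.
    return repr(node)


print(dump_indented(ast.parse("spam(eggs)")))
```

Almost every line depends on an ast.AST convention (_fields, the three kinds of values, the constructor-style representation), which is why the same function would not work unchanged for another tree.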
What was interesting in Raymond's first implementation is that some simple nodes are written on one line. Its code can't be used with ast.AST because of the different tree structure and the different meaning of simple nodes (also, nodes with a single child are very rare), but PR 15631 has been updated to produce more compact output by using other heuristics:
>>> print(ast.dump(node, indent=3))
Module(
   body=[
      Expr(
         value=Call(
            func=Name(id='spam', ctx=Load()),
            args=[
               Name(id='eggs', ctx=Load()),
               Constant(value='and cheese', kind=None)],
            keywords=[]))],
   type_ignores=[])
|