classification
Title: add dump_json to ast module
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, benjamin.peterson, pablogsal, sparverius
Priority: normal Keywords: patch

Created on 2020-02-19 06:23 by sparverius, last changed 2020-05-23 18:36 by cheryl.sabella. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 18558 closed sparverius, 2020-02-19 06:34
Messages (7)
msg362256 - (view) Author: Richard K (sparverius) * Date: 2020-02-19 06:23
Currently, within the ast module, `dump` generates a string representation of the AST. For example:

>>> ast.dump(node)
'Module(body=[], type_ignores=[])'


The proposed enhancement would provide a complementary function, `dump_json`, that produces a JSON representation of the AST.
This would be useful for those who would like to benefit from the utilities of the json module for formatting, pretty-printing, and the like.
It would also be useful for those who want to serialize the AST or export it in a form that can be consumed in another programming language.
A simplified example:


>>> import ast
>>> node = ast.parse('')
>>> ast.dump_json(node)
{'Module': {'body': [], 'type_ignores': []}}


A simplified example of using `ast.dump_json` with the json module:

>>> import json
>>> json.dumps(ast.dump_json(node))
'{"Module": {"body": [], "type_ignores": []}}'
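For readers who want this behaviour outside the standard library, the shape shown above can be approximated in a few lines. This is only a minimal sketch, not the code from PR 18558: `node_to_dict` is a hypothetical helper name, and passing `default=repr` to `json.dumps` is an assumption made here so that constants the json module cannot encode (bytes, complex numbers, `Ellipsis`) do not raise.

```python
import ast
import json


def node_to_dict(node):
    """Recursively convert an AST node into plain dicts, lists, and scalars.

    Hypothetical helper for illustration; not the code from PR 18558.
    """
    if isinstance(node, ast.AST):
        # One key per node: the class name, mapping to that node's fields.
        return {
            type(node).__name__: {
                name: node_to_dict(value) for name, value in ast.iter_fields(node)
            }
        }
    if isinstance(node, list):
        return [node_to_dict(item) for item in node]
    # Leaf values (str, int, None, ...) pass through unchanged.
    return node


tree = ast.parse("x = 1")
as_dict = node_to_dict(tree)
# default=repr is a fallback for constants that json cannot encode.
as_json = json.dumps(as_dict, indent=2, default=repr)
print(as_json)
```

Because the function is recursive, it is subject to the interpreter's recursion limit on very deeply nested trees.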
msg362273 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-02-19 14:00
> The proposed enhancement would provide a complementary function, `dump_json`, that produces a JSON representation of the AST.

IMHO this is not a feature with general usage. As far as I can see, there are already some packages on PyPI for doing that if you want it. Also, the patch looks small, so you can just add it to the project that requires it.

> This would be useful for those who would like to benefit from the utilities of the json module for formatting, pretty-printing, and the like.  

ast.dump can now produce pretty-printed output.
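As a brief illustration of that point, `ast.dump` accepts an `indent` argument starting with Python 3.9. The snippet below is a small sketch; the exact text printed differs between Python versions, so no output is shown.

```python
import ast

tree = ast.parse("x = [1, 2]")

# Single-line form, as in the original report.
compact = ast.dump(tree)

# Multi-line, indented form (the indent parameter requires Python 3.9+).
pretty = ast.dump(tree, indent=2)

print(pretty)
```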

> It would also be useful for those who want to serialize the AST or export it in a form that can be consumed in another programming language.

As I said, serializing/exporting the AST isn't a common use case. It is a very specific goal.

In general, I'm -1 on this.
msg362274 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-02-19 14:19
Thanks, Richard, for your proposal. I concur with Batuhan: I am -1 as well on this addition. Echoing some of the same ideas, I think this is specialized enough that it does not make sense to have it in the standard library, especially if a PyPI package already exists. Additionally, this is straightforward to implement for very simple cases, but PR 18558 will fail for very generic ASTs if they are deep enough (it uses recursion).
msg362284 - (view) Author: Richard K (sparverius) * Date: 2020-02-19 18:02
Batuhan & Pablo, thank you for your thoughts! I just wanted to reply to a few of the comments to clarify my position on the issue.


> IMHO this is not a feature with general usage. As far as I can see, there are already some packages on PyPI for doing that if you want it. Also, the patch looks small, so you can just add it to the project that requires it.


There seems to be movement towards general usage. For instance, take a look at clang, in particular the flag '-ast-dump=json'.

$ clang -cc1 -ast-dump=json foo.cc


> ast.dump can now produce pretty-printed output.

Indeed; however, there is not much more one can do with the output of ast.dump. With ast.dump_json one would benefit from programmer-centric functionality.

-- 

> Thanks, Richard, for your proposal. I concur with Batuhan: I am -1 as well on this addition. Echoing some of the same ideas, I think this is specialized enough that it does not make sense to have it in the standard library, especially if a PyPI package already exists.


After just browsing the PyPI package(s) you may be referring to, it appears that they do so in non-standard ways.


> Additionally, this is straightforward to implement for very simple cases, but PR 18558 will fail for very generic ASTs if they are deep enough (it uses recursion).


The implementation of ast.dump also uses recursion. I have tested ast.dump_json on sufficiently large source files and have not run into recursion depth exceeded issues.


Thanks again for your perspective!
msg362285 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-02-19 18:15
> There seems to be movement towards general usage. For instance, take a look at clang, in particular the flag '-ast-dump=json'.

I don't think the clang argument holds, because clang is a command-line tool after all, and it makes sense that it can produce several outputs, while the ast module exposes APIs that you can further process inside the language. Getting JSON from clang's output would require more than one tool if clang did not support it, whereas doing it in Python only requires Python.

> it appears that they do so in non-standard ways.

Can you clarify what you mean by that?

> The implementation of ast.dump also uses recursion. I have tested ast.dump_json on sufficiently large source files and have not run into recursion depth exceeded issues.

This is not the primary argument, and by itself it is a weaker one because this is an edge case, but for instance, here is an example of ast.dump succeeding and your tool failing:


>>> x = ast.List()
>>> for _ in range(1010):
...     x = ast.List(x)
...

>>> ast.dump(x)
'List(elts=List(elts=List(elts=List(elts=List(elts=L......

>>> dump_json(x)
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-22-fadef4fb6a0d> in <module>
RecursionError: maximum recursion depth exceeded while calling a Python object
msg362286 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-02-19 18:17
Just to clarify: ast.dump *will* fail with a deeper object as well. I am not claiming that ast.dump will handle everything, because of course it suffers from the same problem.
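One common way to sidestep the recursion limit during the traversal itself is to replace recursion with an explicit stack. The sketch below is an illustration under that assumption, not something proposed in the PR or in this thread; `node_to_dict_iterative` is a hypothetical name, and the output shape mirrors the `{'Module': {...}}` form from the original report. Note that `json.dumps` and `repr` are themselves recursive, so serializing or printing an extremely deep result may still raise RecursionError.

```python
import ast


def node_to_dict_iterative(root):
    """Convert an AST to nested dicts and lists using an explicit stack.

    Hypothetical helper for illustration; avoids Python-level recursion.
    """
    holder = [None]
    # Each work item means: convert `value` and store it at container[key].
    stack = [(root, holder, 0)]
    while stack:
        value, container, key = stack.pop()
        if isinstance(value, ast.AST):
            fields = {}
            container[key] = {type(value).__name__: fields}
            for name, child in ast.iter_fields(value):
                fields[name] = None  # reserve the key to keep field order
                stack.append((child, fields, name))
        elif isinstance(value, list):
            items = [None] * len(value)
            container[key] = items
            for index, child in enumerate(value):
                stack.append((child, items, index))
        else:
            # Leaf values (str, int, None, ...) are stored unchanged.
            container[key] = value
    return holder[0]


# Build a tree nested far deeper than the default recursion limit.
deep = ast.List(elts=[], ctx=ast.Load())
for _ in range(5000):
    deep = ast.List(elts=[deep], ctx=ast.Load())

converted = node_to_dict_iterative(deep)

# Measure the nesting with a loop rather than recursion.
depth = 0
current = converted
while current["List"]["elts"]:
    current = current["List"]["elts"][0]
    depth += 1
print(depth)
```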
msg362289 - (view) Author: Richard K (sparverius) * Date: 2020-02-19 19:46
> I don't think the clang argument holds, because clang is a command-line tool after all, and it makes sense that it can produce several outputs, while the ast module exposes APIs that you can further process inside the language. Getting JSON from clang's output would require more than one tool if clang did not support it, whereas doing it in Python only requires Python.

I see what you mean. I was just trying to illustrate that such a feature is desired by some. 

Perhaps 'Python only requires Python' means that Python _could_ be the first widely used language with such a superior meta-programming feature with respect to AST analysis/code generation. 


> > it appears that they do so in non-standard ways.

> Can you clarify what you mean by that?

By non-standard I mean that the resulting json does not follow the structure of the tree explicitly. For example with ast2json, '"_type": "Print"' includes a (somewhat misleading) key that is not represented in the actual AST. 

Example of ast2json output (example found here, https://github.com/YoloSwagTeam/ast2json#example),

{
    "body": [
        {
            "_type": "Print",
            "nl": true,
            "col_offset": 0,
            "dest": null,
            "values": [
                {
                    "s": "Hello World!",
                    "_type": "Str",
                    "lineno": 1,
                    "col_offset": 6
                }
            ],
            "lineno": 1
        }
    ],
    "_type": "Module"
}
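To make the contrast concrete, here is a rough sketch of a converter that produces the flat, `_type`-tagged shape shown above, as opposed to the nested `{'Module': {...}}` shape proposed in this issue. It is not ast2json's actual code; `to_type_tagged` is a hypothetical name, and copying position attributes such as `lineno` and `col_offset` into the same dict is an assumption based only on the sample output.

```python
import ast


def to_type_tagged(node):
    """Convert an AST node to a flat dict with a synthetic "_type" key.

    Hypothetical helper for illustration; not taken from ast2json.
    """
    if isinstance(node, ast.AST):
        result = {"_type": type(node).__name__}
        for name, value in ast.iter_fields(node):
            result[name] = to_type_tagged(value)
        # Position attributes sit alongside the fields in this shape.
        for name in node._attributes:
            if hasattr(node, name):
                result[name] = getattr(node, name)
        return result
    if isinstance(node, list):
        return [to_type_tagged(item) for item in node]
    return node


tagged = to_type_tagged(ast.parse("x"))
print(tagged["_type"], tagged["body"][0]["value"]["_type"])
```

In this shape the node type is an extra key sitting next to the real fields, which is the structural difference described above.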


> Just to clarify: ast.dump *will* fail with a deeper object as well. I am not claiming that ast.dump will handle everything, because of course it suffers from the same problem.

Makes sense. As you mentioned, these are edge cases, which I assume will not be an issue for those seeking to gain the benefits of 'ast.dump_json'.
History
Date User Action Args
2020-05-23 18:36:51 cheryl.sabella set status: open -> closed
resolution: rejected
stage: patch review -> resolved
2020-02-19 19:46:38 sparverius set messages: + msg362289
2020-02-19 18:17:10 pablogsal set messages: + msg362286
2020-02-19 18:15:31 pablogsal set messages: + msg362285
2020-02-19 18:02:17 sparverius set messages: + msg362284
2020-02-19 14:19:42 pablogsal set messages: + msg362274
2020-02-19 14:00:14 BTaskaya set nosy: + benjamin.peterson, pablogsal
messages: + msg362273
2020-02-19 12:28:50 BTaskaya set nosy: + BTaskaya
2020-02-19 06:34:29 sparverius set keywords: + patch
stage: patch review
pull_requests: + pull_request17938
2020-02-19 06:23:22 sparverius create