Title: Provide a way to compare AST nodes for equality recursively
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.8
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, Julian, Philip Dye, anthony shaw, anthonypjshaw, benjamin.peterson, levkivskyi, louielu, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-09-20 18:43 by Julian, last changed 2020-03-28 18:12 by BTaskaya.

Pull Requests
URL Status Linked Edit
PR 1368 closed louielu, 2017-05-01 14:47
PR 1375 closed louielu, 2017-05-02 08:03
PR 14970 open flavianhautbois, 2019-07-26 20:49
PR 19211 open BTaskaya, 2020-03-28 18:12
Messages (17)
msg170826 - (view) Author: Julian Berman (Julian) * Date: 2012-09-20 18:43
As is, as far as I can tell, there's no way to easily compare two AST nodes to see if they have the same children and same fields (recursively).

I'm writing some unit tests for a NodeTransformers, so I've settled for comparing `ast.dump()`s of each, which is kind of dirty, but 1) works and 2) produces reasonable failure messages. (As a side note of what else I've tried, comparing, say, a list of the `iter_child_nodes` is not a good alternative, since the tests I'm writing are making assertions that a given node was not modified, which means they deepcopy the node and then want to assert that the "transformed" node is unchanged.)

I don't know the global implications of changing ast.AST.__eq__ to know if that's feasible (hopefully someone will comment on that), but if it isn't, another provided way would be nice.
msg170907 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-09-21 17:42
This is a reasonable request. Should comparison include lineno/col_offset or not? If you implement, __eq__, you should also implement __hash__. Maybe, it would be best to start with a helper comparison function in the ast module.
msg170909 - (view) Author: Julian Berman (Julian) * Date: 2012-09-21 18:12
I'd say yes (to both lineno/col_offset). And yeah that sounds like what I had in mind (a helper function).

If I'm specific for a moment about implementation, perhaps something like `ast.diff`, which yielded tuples of differing nodes (in say, breadth first order down the tree) from two given nodes, and took args for configuration, compare_lineno and compare_col_offset (both defaulted to True), and then __eq__ was just `next(ast.diff(self, other, compare_lineno=True, compare_col_offset=True), None) is None`.

Sound good to you?
msg171000 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-09-22 14:46
Yes, though some things like what to return if one has an entire subtree that the other doesn't have will have to be worked out.
msg185288 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-03-26 18:13
I have a use for this as well, but w/o the lineno/col_offset comparison and all I need is a True/False result.
msg185289 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-03-26 18:17
IOW I think a function that uses ast.walk() and has flags for specifying whether _attributes should also be checked and then uses a class check and then uses _fields to do all other checking.
msg292656 - (view) Author: Louie Lu (louielu) * Date: 2017-05-01 10:04
If we only need a binary True/False result, could we just return a compare of dump(a) == dump(b)?

This can also add the include_attributes flags for need.
msg292665 - (view) Author: Louie Lu (louielu) * Date: 2017-05-01 14:50
Provide a recursive way to compare AST nodes, it will compare with fields, but no attributes.

The performance compare with ast.dump methods in unittest

# Recursive compare
./python -m unittest test.test_ast.ASTCompareTest
Ran 6 tests in 0.669s


# ast.dump compare
./python -m unittest test.test_ast.ASTCompareTest
Ran 6 tests in 2.192s
msg292717 - (view) Author: Louie Lu (louielu) * Date: 2017-05-02 06:35
Update to AST base type richcompare.

the unittest finished time was better than python version:

    Ran 7 tests in 0.278s
msg341541 - (view) Author: anthony shaw (anthony shaw) Date: 2019-05-06 15:39
This discussion is inconclusive and targets an old version of CPython, can this issue be closed?
msg341546 - (view) Author: anthony shaw (anthonypjshaw) * (Python triager) Date: 2019-05-06 15:49
Closing issue, PR branch has since been removed and targets Python 3.4
msg341555 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-06 16:23
Please don't close an issue too fast.

The PR 1368 is still open and contains valuable code.

Moreover, I don't see anyone here saying that the feature is a bad idea. The feature has not been implemented, so the issue should remain open, even if PR 1368 is outdated.

> ...issue... targets Python 3.4

Well, it's just that the issue has been created a long time ago, but the feature request remain value for Python 3.8.
msg342118 - (view) Author: Ivan Levkivskyi (levkivskyi) * (Python committer) Date: 2019-05-10 18:44
Btw, I am +1 on this feature (preferably with an option to check line, column, end line, and end column). I always wanted this, but never had time to actually implement this.
msg348545 - (view) Author: Philip Dye (Philip Dye) Date: 2019-07-27 14:32
If consensus has been reached on this, I am willing to do the work.
msg349185 - (view) Author: Ivan Levkivskyi (levkivskyi) * (Python committer) Date: 2019-08-07 17:55
> If consensus has been reached on this, I am willing to do the work.

It looks like there is already an active PR, there are some non-implemented comments from a core review.
msg365213 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-28 14:46
I am not sure that implementing a rich comparison of AST nodes is the right way. There are too much options (compare attributes or not, compare recursively or not, compare types or not) and __eq__ does not support options. In addition, it requires implementing compatible __hash__, but how do you implement a hash of mutable object? In addition, recursive comparison can introduce a regression in existing code -- instead of fast returning False the comparison will spent a nontrivial amount of time.

If implement a comparison of AST nodes, it should be a separate function which support multiple options. Would be nice also to have examples where this feature can be used before implementing it.

See also issue37792.
msg365215 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-03-28 15:24
The solution to hashing used in PR 14970 was just using the default hashing function which is kind of a workaround to the real problem. I now concur with your comments. Will try to draft out an `` function.
Date User Action Args
2020-03-28 18:12:54BTaskayasetpull_requests: + pull_request18574
2020-03-28 15:24:16BTaskayasetnosy: + BTaskaya
messages: + msg365215
2020-03-28 14:46:54serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg365213
2020-01-29 00:39:26brett.cannonsetnosy: - brett.cannon
2019-08-07 17:55:31levkivskyisetmessages: + msg349185
2019-07-29 15:46:44vstinnersetnosy: - vstinner
2019-07-27 14:32:06Philip Dyesetnosy: + Philip Dye
messages: + msg348545
2019-07-26 20:49:25flavianhautboissetkeywords: + patch
stage: resolved -> patch review
pull_requests: + pull_request14739
2019-05-10 18:44:54levkivskyisetnosy: + levkivskyi
messages: + msg342118
2019-05-10 01:57:24vstinnersetversions: + Python 3.8, - Python 3.4
2019-05-06 16:23:18vstinnersetstatus: closed -> open
resolution: out of date ->
messages: + msg341555
2019-05-06 15:49:53anthonypjshawsetstatus: open -> closed

nosy: + anthonypjshaw
messages: + msg341546

resolution: out of date
stage: test needed -> resolved
2019-05-06 15:39:40anthony shawsetnosy: + anthony shaw
messages: + msg341541
2017-05-02 08:03:22louielusetpull_requests: + pull_request1483
2017-05-02 06:35:21louielusetmessages: + msg292717
2017-05-01 14:50:15louielusetmessages: + msg292665
2017-05-01 14:47:38louielusetpull_requests: + pull_request1477
2017-05-01 10:04:21louielusetnosy: + louielu
messages: + msg292656
2013-03-26 18:17:47brett.cannonsetmessages: + msg185289
2013-03-26 18:13:23brett.cannonsetmessages: + msg185288
2012-09-22 14:46:03benjamin.petersonsetmessages: + msg171000
2012-09-22 01:30:40terry.reedysetstage: test needed
versions: - Python 3.3
2012-09-21 18:12:13Juliansetmessages: + msg170909
2012-09-21 17:42:12benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg170907
2012-09-21 12:42:10brett.cannonsetnosy: + brett.cannon
2012-09-20 21:30:11vstinnersetnosy: + vstinner
2012-09-20 18:43:56Juliancreate