Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ast.NodeVisitor no longer calls visit_Str #81098

Closed
asottile mannequin opened this issue May 14, 2019 · 17 comments
Closed

ast.NodeVisitor no longer calls visit_Str #81098

asottile mannequin opened this issue May 14, 2019 · 17 comments
Labels
3.8 only security fixes stdlib Python modules in the Lib dir

Comments

@asottile
Copy link
Mannequin

asottile mannequin commented May 14, 2019

BPO 36917
Nosy @serhiy-storchaka, @Carreau, @asottile, @isidentical
PRs
  • bpo-36917: Add default implementation of ast.NodeVisitor.visit_Constant(). #15490
  • [3.8] bpo-36917: Add default implementation of ast.NodeVisitor.visit_Constant(). (GH-15490) #15509
  • [3.7] bpo-36917: Backport basic test for ast.NodeVisitor. #15511
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-08-26.12:14:21.119>
    created_at = <Date 2019-05-14.14:21:22.553>
    labels = ['3.8', 'library']
    title = 'ast.NodeVisitor no longer calls visit_Str'
    updated_at = <Date 2019-08-26.12:14:21.117>
    user = 'https://github.com/asottile'

    bugs.python.org fields:

    activity = <Date 2019-08-26.12:14:21.117>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-08-26.12:14:21.119>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2019-05-14.14:21:22.553>
    creator = 'Anthony Sottile'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36917
    keywords = ['patch']
    message_count = 17.0
    messages = ['342463', '342474', '342494', '342497', '342501', '342572', '342578', '342583', '342584', '342595', '342597', '342827', '346562', '350496', '350508', '350527', '350528']
    nosy_count = 6.0
    nosy_names = ['serhiy.storchaka', 'mbussonn', 'Anthony Sottile', 'BTaskaya', 'dhilst', 'Tyler Wince']
    pr_nums = ['15490', '15509', '15511']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue36917'
    versions = ['Python 3.8']

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 14, 2019

    More fallout from the Constant change in https://bugs.python.org/issue32892

    minimal reproduction:

    import ast
    
    
    class V(ast.NodeVisitor):
        def visit_Str(self, node):
            print(node.s)
    
    
    def main():
        V().visit(ast.parse('x = "hi"'))
    
    
    if __name__ == '__main__':
        exit(main())
    $ python3.7 t.py
    hi
    $ python3.8 t.py
    $ python3.8 --version --version
    Python 3.8.0a4 (default, May  8 2019, 01:43:53) 
    [GCC 7.4.0]

    @asottile asottile mannequin added 3.8 only security fixes stdlib Python modules in the Lib dir labels May 14, 2019
    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 14, 2019

    wrong bpo, this is the right one: https://bugs.python.org/issue32892

    @Carreau
    Copy link
    Mannequin

    Carreau mannequin commented May 14, 2019

    Would it be useful to add a default implementation of visit_Constant(self, node) on NodeVisitor that go through all the isinstance() check and call the appropriate backward compatible method ?

    We would still have the simplification without having any breakage in backward compatibility.

    That feels like best of both worlds ?

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 14, 2019

    There would still be a breakage for that if someone was defining py36+ visit_Constant (which would clobber the ast.NodeVisitor.visit_Constant if we were to add it)

    @Carreau
    Copy link
    Mannequin

    Carreau mannequin commented May 14, 2019

    There would still be a breakage for that if someone was defining py36+ visit_Constant (which would clobber the ast.NodeVisitor.visit_Constant if we were to add it)

    Ok, it would still break in some cases, but that would still be a net improvement for codebase that do not override visit_Constant; right ?

    We could also push the patch further if we really want and call explicitly NodeVisitor.visit_Constant before the overridden method in NodeVisitor.visit.

    Not advocating for the second part, but minimal breakage and code repetition in libraries when we can.

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 15, 2019

    spent some more time thinking about this and I think we should strongly consider reverting. simplification in the core interpreter should not be weighed lightly against complexity and breaking changes for users.

    the change is also unfortunate because it reduces the expressivity of the ast

    as we discuss shims, I believe the intention is to remove the Num / Str / etc. layer eventually so it's really just delaying that.

    so my vote is to revert, keep the complexity in the compiler and out of user code

    @Carreau
    Copy link
    Mannequin

    Carreau mannequin commented May 15, 2019

    so my vote is to revert, keep the complexity in the compiler and out of user code

    Playing Devil advocate here, but the complexity depends on what transformations user want to do; with the current state of 3.8, a user that wants to visit all immutable types can do with a single visit_Constant; once 32892 reverted; they would need to implement all of visit_Str,NamedConstant,Num,Bytes, and have it called the same visit_Constant.

    @serhiy-storchaka
    Copy link
    Member

    This is not a new case. This is exactly the same problem which has already been fixed in multiple other projects.

    Could you show your real code Anthony? In all known opensource examples (astroid, pyflakes, genshi, chameleon, mako) the compatibility fix is as simple as

        def visit_Constant(self, node):
            if node.value is Ellipsis:
                self._write('...')
            else:
                self._write(repr(node.value))

    In return, you can remove several methods, such as visit_Str, visit_Num, etc, once you drop the support of Python versions <3.8.

    AST is not stable. Virtually in every version the new nodes are added: for asynchronous constructions, for f-strings, etc, and the structure and meaning of existing nodes can be changed.

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 15, 2019

    The simplest case is just the addition of an isinstance check:

    https://github.com/asottile/dead/blob/85f5edbb84b5e118beab4be3346a630e41418a02/dead.py#L165-L170

    class V(ast.NodeVisitor):
        def visit_Str(self, node):
            ...
    
        def visit_Constant(self, node):
            if isinstance(node.value, str):
                self.visit_Str(node)
            else:
                self.generic_visit(node)

    But I have another case where there's much more complex (sorry this one's private, I'll give the pseudo-code of it though which ends up in a whole mess of slow isinstance(...) calls

    class V(ast.NodeVisitor):
        def visit_Str(self, node):
            ...
    
        def visit_Bytes(self, node):
            ...
    
        def visit_Num(self, node):
            ...
    
        def visit_Constant(self, node):
            if isinstance(node.value, str):
                 return self.visit_Str(node)
            # Annoying special case, bools are ints, before this wouldn't call
            # `visit_Num` because there was a separate `visit_NameConstant` for `True` / `False`
            elif isinstance(node.value, int) and not isinstance(node.value, bool):
                return self.visit_Num(node)
            elif isinstance(node.value, bytes):
                return self.visit_Bytes(node)
            else:
                return self.generic_visit(node)

    Whereas the opposite case (where handling all constants the same) is much much easier to simplify the code and not need the Constant mess (if required)

    class V(ast.NodeVisitor):
        def _visit_constant(self, node):
            # generic handling of constant _if you want it_
    
        visit_Str = visit_Num = visit_Bytes = visit_NameConstant = visit_Ellipsis = _visit_constant

    Maybe I haven't been in the community long enough but this is the first *removal* of the ast I've seen, and it makes all my uses of these functions objectively worse, and none of the cases get simpler (whereas there's an existing easy way to simplify constants already, without breaking the ast)

    github search isn't a great measure of this but it's currently showing 10k+ usages of visit_{Str,Num,Bytes} that would presumably be broken by this change: https://github.com/search?q=visit_Num+visit_Str+visit_Bytes&type=Code -- backward compatibility is very valuable, I would hope it isn't squandered for an internal cleanup

    @serhiy-storchaka
    Copy link
    Member

    You can not use the same implementation of the visitor for Num, Str, NameConstant and Ellipsis, because all these classes use different attribute for saving the value (Ellipsis does not have it at all). For the same reasons you can not just pass the argument of visit_Constant() to visit_Str() -- they have different types.

    No need to call generic_visit() from visit_Constant() -- Constant nodes should not contain AST nodes.

    @asottile
    Copy link
    Mannequin Author

    asottile mannequin commented May 15, 2019

    You can not use the same implementation of the visitor for Num, Str, NameConstant and Ellipsis, because all these classes use different attribute for saving the value

    ah yes, this is true -- maybe the better change would be to just add @property value to each of those instead of collapsing the classes? (If that's the actual convenience we're trying to achieve)

    No need to call generic_visit() from visit_Constant() -- Constant nodes should not contain AST nodes.

    correct there's no need, but it's a best practice to always call generic_visit in all visit_* methods, I have a linter I haven't finished up yet that checks this. And who knows if that'll be true in the future!

    @isidentical
    Copy link
    Sponsor Member

    What about adding visit_Constant to NodeVisitor for at least one relase period and call visit_Str, visit_Num etc?

    @TylerWince
    Copy link
    Mannequin

    TylerWince mannequin commented Jun 25, 2019

    Another example of the breaking change here is in the security linter PyCQA/bandit. It may be a simple addition of a check in the meta parser for the ast but I would tend to agree with Anthony around whether or not it is worth it.

    Anyone else have any thoughts on this since the conversation ended last month?

    @serhiy-storchaka
    Copy link
    Member

    New changeset c3ea41e by Serhiy Storchaka in branch 'master':
    bpo-36917: Add default implementation of ast.NodeVisitor.visit_Constant(). (GH-15490)
    c3ea41e

    @serhiy-storchaka
    Copy link
    Member

    New changeset 522a394 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.8':
    [3.8] bpo-36917: Add default implementation of ast.NodeVisitor.visit_Constant(). (GH-15490) (GH-15509)
    522a394

    @serhiy-storchaka
    Copy link
    Member

    New changeset c238862 by Serhiy Storchaka in branch '3.7':
    bpo-36917: Backport basic test for ast.NodeVisitor. (GH-15511)
    c238862

    @serhiy-storchaka
    Copy link
    Member

    Thank you for your report and discussion Anthony.

    Added the default implementation of visit_Constant which calls corresponding visitor for old constant nodes. It emits a deprecation warning (PendingDeprecationWarning in 3.8 and DeprecationWarning in 3.9) as a reminder that you finally will need to implement your own visit_Constant. You can temporary ignore them.

    Note that the implementation in the stdlib is not future proof (and it should not, because visit_Constant will likely be removed at the same time or before removing classes Num and Str and removing properties "n" and "s" from Constant). In long term solution you should write something like:

    class V(ast.NodeVisitor):
        def _visit_string(self, node, value):
            ...
    
        def visit_Str(self, node):
            return self._visit_string(node, node.s)
    
        def visit_Constant(self, node):
            if isinstance(node.value, str):
                return self._visit_string(node, node.value)
            ...

    Otherwise you can get other deprecation warnings or errors in future releases.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants