Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove specific constant AST types in favor of ast.Constant #77073

Closed
serhiy-storchaka opened this issue Feb 21, 2018 · 31 comments
Closed

Remove specific constant AST types in favor of ast.Constant #77073

serhiy-storchaka opened this issue Feb 21, 2018 · 31 comments
Labels
3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 32892
Nosy @brettcannon, @nascheme, @ncoghlan, @vstinner, @benjaminp, @jwilk, @njsmith, @serhiy-storchaka, @1st1, @Carreau, @asottile, @hroncok, @thautwarm, @miss-islington
PRs
  • bpo-32892: Use ast.Constant instead of specific constant AST types. #9445
  • bpo-32892: Support subclasses of base types in isinstance checks for AST constants. #9934
  • bpo-32892: Update the documentation for handling constants in AST. #18514
  • [3.8] bpo-32892: Update the documentation for handling constants in AST. (GH-18514) #18534
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-11-21.07:58:48.572>
    created_at = <Date 2018-02-21.09:16:37.160>
    labels = ['interpreter-core', 'type-feature', '3.8']
    title = 'Remove specific constant AST types in favor of ast.Constant'
    updated_at = <Date 2020-02-17.09:09:53.661>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2020-02-17.09:09:53.661>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-11-21.07:58:48.572>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2018-02-21.09:16:37.160>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32892
    keywords = ['patch']
    message_count = 31.0
    messages = ['312482', '312487', '312490', '312491', '325882', '325884', '325904', '326006', '326030', '326041', '326043', '326078', '326087', '326130', '326209', '326295', '326565', '326566', '326568', '326711', '326712', '326767', '326873', '327902', '327903', '328685', '330179', '340664', '342486', '362121', '362123']
    nosy_count = 14.0
    nosy_names = ['brett.cannon', 'nascheme', 'ncoghlan', 'vstinner', 'benjamin.peterson', 'jwilk', 'njs', 'serhiy.storchaka', 'yselivanov', 'mbussonn', 'Anthony Sottile', 'hroncok', 'thautwarm', 'miss-islington']
    pr_nums = ['9445', '9934', '18514', '18534']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue32892'
    versions = ['Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    Currently different literals are represented by different types in AST:

    Num -- for int, float and complex
    Str -- for str
    Bytes -- for bytes
    Ellipsis -- for Ellipsis
    NameConstant -- for True, False and None

    And Constant (added in 3.6, bpo-26146) can represent any immutable value, including tuples and frozensets of constants. Instances of Constant are not created by the AST parser, they are created by the builtin AST optimizer and can be created manually.

    These AST types don't have one-to-one correspondence with Python types, since Num represents three numeric types, NameConstant represents bool and NoneType, and any constant type can be represented as Constant.

    I propose to get rid of Num, Str, Bytes, Ellipsis and NameConstant and always use Constant. This will simplify the code which currently needs to repeat virtually identical code for all types.

    I have almost ready patch, the only question is whether it is worth to keep deprecated aliases Num, Str, Bytes, Ellipsis and NameConstant.

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Feb 21, 2018
    @ncoghlan
    Copy link
    Contributor

    As pure aliases, those wouldn't necessarily be that useful, as aside from NameConstant, the field names are different (Num.n, Str.s, Bytes.s).

    I do wonder whether it might be worth keeping "NameConstant", though, and use that for Ellipsis as well, so the singletons all use NameConstant, while regular constants use Constant.

    @serhiy-storchaka
    Copy link
    Member Author

    Right, they shouldn't be just aliases, but Constant subclasses with __new__ which return Constant and __instancecheck__ which checks the type of the value. And the Constant class should have writable properties n and s. All these operations should emit a deprecation warning at runtime. Even this doesn't preserve perfect compatibility. issubclass(type(node), Num) will not work, and compile(Num('123')) will raise en exception in the Num constructor instead of compile().

    @ncoghlan
    Copy link
    Contributor

    Ah, right - in that case, I think the subclasses would be worthwhile, as they shouldn't be too much hassle to maintain for a release, and will provide a more graceful migration for folks doing their own AST processing and generation.

    @serhiy-storchaka
    Copy link
    Member Author

    PR 9445 is an implementation of this idea. For simplicity and for reducing affect on tests no deprecation warning is raised, and there is no validation of the type of argument for old classes. Num('123') will just return Constant('123') without errors and warnings. isinstance(Constant('123'), Num) will return False, and isinstance(Constant('123'), Str) will return True.

    @vstinner
    Copy link
    Member

    +1. I wanted to propose this change while I was working on FAT Python, but I waited until I "completed" this project. Sadly, I abandonned the FAT Python. And I wasn't brave enough to propose to break the AST.

    @gvanrossum
    Copy link
    Member

    I don't feel confident reviewing the code, but I'm okay with the change. Can you describe what usages of the old API will continue to work, and what part will break? (It would seem that code that creates a tree using e.g. Num(42) will still work, but code inspecting a tree won't see any Num instances, only Constant instances.)

    @nascheme
    Copy link
    Member

    As someone who does AST manipulation (Quixote PTL template support), I'm interested in this patch. I think what is proposed sounds like a good change. I.e. having the extra node types is not useful and makes the compiler harder to understand. Providing subclasses for backwards compatibility would be nice. I will test my Quixote package with Serhiy's patch.

    @vstinner
    Copy link
    Member

    Hum, I'm not sure that I understood correctly: does "isinstance(node, ast.Num)" still work with the patch? If yes, I see no reason to not merge the change ;-)

    @serhiy-storchaka
    Copy link
    Member Author

    Can you describe what usages of the old API will continue to work, and what part will break?

    Very good question!

    What will continue to work:

    • Instantiating. Both Num(42) and Num(n=42) will continue to work, but they will return an instance of Constant: Constant(value=42).

    • Attribute access. Constant will get attributes "n" and "s" as aliases to the attribute "value". So Num(42).n will continue to return 42.

    • isinstance() check. Although isinstance(Num(42), Num) will continue to return True, and isinstance(Str('42'), Num) will continue to return False. Also isinstance(Constant(42), Num) will return True and isinstance(Constant('42'), Num) will return False.

    • Subclassing. Subclasses of these classes will continue to behave as normal classes. Instantiating them will return instances of that classes, not Constant, and the isinstance() check will not have any magic.

    What will no longer work:

    • issubclass() check and exact type check. issubclass(type(node), Num) and type(node) is Num will return False for node = Num(42). But seems isinstance() is a more common way of checking the type of a node.

    • If the user of NodeVisitor implemented visit_Num(), but not implemented visit_Constant(), his handler will not be triggered. This can be easily fixed by implementing visit_Constant(). Constant was introduced in 3.6.

    • isinstance() check for underinitialized node. isinstance(Num(), Num) will return False.

    @vstinner
    Copy link
    Member

    • If the user of NodeVisitor implemented visit_Num(), but not implemented visit_Constant(), his handler will not be triggered. This can be easily fixed by implementing visit_Constant(). Constant was introduced in 3.6.

    IHMO that's an acceptable tradeoff. We should just make sure that it's properly documented in the Porting section of What's New in Python 3.8.

    • issubclass() check and exact type check. issubclass(type(node), Num) and type(node) is Num will return False for node = Num(42). But seems isinstance() is a more common way of checking the type of a node.

    In the AST code that I read, isinstance() was the most popular way to check the type of a node. I don't recall any AST code using issubclass.

    isinstance() check for underinitialized node. isinstance(Num(), Num) will return False.

    I don't think that it's an issue.

    --

    In term of optimization, I'm curious, do you know if your PR optimize more cases than previously? Or do you expect futher optimizations later thanks to this change?

    @thautwarm
    Copy link
    Mannequin

    thautwarm mannequin commented Sep 22, 2018

    I'm in favor of this idea for prospective extensions that could be implemented through this brand-new ast.Constant.

    Actually I used to expect a more general ast.Constant when I was manipulating ASTs half a year ago, at that time my job is to make staging functions that take some user-defined types as constants(in these scenes, these types are definitely immutable and permanent) to gain performance improvements and avoid redundant allocations, and what I exactly wanted is such an ast.Constant.

    @serhiy-storchaka
    Copy link
    Member Author

    thautwarm, ast.Constant is not new. You can use it since 3.8. What is new in this issue -- ast.parse() will produce ast.Constant instead of ast.Num, ast.Str, etc.

    @thautwarm
    Copy link
    Mannequin

    thautwarm mannequin commented Sep 23, 2018

    Thank you, Serhiy.

    I learned python ast through ast.parse and pretty print, which prevented me from knowing this useful one.

    In fact, I wonder if we could support more species of constant values accepted by ast.Constant?

    @vstinner
    Copy link
    Member

    Serhiy Storchaka: "ast.Constant is not new. You can use it since 3.8."

    Since Python 3.6 ;-)

    @nascheme
    Copy link
    Member

    I've checked Quixote PTL module support. I will have to make some changes to support the removal of Str and other node types. E.g. I have to change visit_Str to visit_Constant. The fixes are pretty easy though and I think it is reasonable that this kind of AST manipulation is dependent on Python version and may get broken between releases.

    @serhiy-storchaka
    Copy link
    Member Author

    Since Python 3.6 ;-)

    Thank you for correction Victor. It is what I meant. Why I had written 3.8? :-?

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 3f22811 by Serhiy Storchaka in branch 'master':
    bpo-32892: Use ast.Constant instead of specific constant AST types. (GH-9445)
    3f22811

    @vstinner
    Copy link
    Member

    Thank you Serhiy :-D I really love ast.Constant, it simplify many things when writing code modifying the AST.

    @njsmith
    Copy link
    Contributor

    njsmith commented Sep 30, 2018

    As a point of information, this did in fact break pylint/astroid: pylint-dev/astroid#617

    @serhiy-storchaka
    Copy link
    Member Author

    Yesterday I submitted a PR that should fix astroid in Python 3.8: pylint-dev/astroid#616

    @jwilk
    Copy link
    Mannequin

    jwilk mannequin commented Oct 1, 2018

    Also broke pyflakes: PyCQA/pyflakes#367

    @serhiy-storchaka
    Copy link
    Member Author

    Submitted a fix for Pyflakes: PyCQA/pyflakes#369. It had more problems due to changes in line and column numbers in SyntaxError.

    @vstinner
    Copy link
    Member

    @vstinner
    Copy link
    Member

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 6015cc5 by Serhiy Storchaka in branch 'master':
    bpo-32892: Support subclasses of base types in isinstance checks for AST constants. (GH-9934)
    6015cc5

    @serhiy-storchaka
    Copy link
    Member Author

    Chameleon has been fixed. Genshi needs more work (seems it has issues with 3.7 too). There are other third-party projects broken after this change, but it is not hard to fix them.

    Closed this issue. Open new issues for new problems.

    @hroncok
    Copy link
    Mannequin

    hroncok mannequin commented Apr 22, 2019

    Mako (now fixed) sqlalchemy/mako#287

    @asottile
    Copy link
    Mannequin

    asottile mannequin commented May 14, 2019

    hitting this in https://bugs.python.org/issue36917?

    Is the simplification here really worth the breaking change to consumers?

    I now have to write something that's essentially this to work around this which feels more like the complexity has just been pushed to users instead of the stdlib:

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            self.visit_Str(node)
        elif isinstance(node.value, bytes):
            self.visit_Bytes(node)
        elif node.value in {True, False}:
            self.visit_NameConstant(node)
        elif node.value is Ellipsis:
            self.visit_Ellipsis(node)
        elif isinstance(node.value, (int, float)):
            self.visit_Num(node)
        else:
            # etc...

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 85a2eef by Serhiy Storchaka in branch 'master':
    bpo-32892: Update the documentation for handling constants in AST. (GH-18514)
    85a2eef

    @miss-islington
    Copy link
    Contributor

    New changeset 988aeba by Miss Islington (bot) in branch '3.8':
    bpo-32892: Update the documentation for handling constants in AST. (GH-18514)
    988aeba

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants