Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move folding tuples of constants into compiler.c from peephole.c #77499

Open
serhiy-storchaka opened this issue Apr 20, 2018 · 5 comments
Open
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 33318
Nosy @brettcannon, @rhettinger, @ncoghlan, @pitrou, @benjaminp, @methane, @serhiy-storchaka, @1st1
PRs
  • gh-77499: Move folding tuples of constants from peephole.c into compiler.c #6545
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-04-20.12:57:21.721>
    labels = ['interpreter-core', 'type-feature', '3.8']
    title = 'Move folding tuples of constants into compiler.c from peephole.c'
    updated_at = <Date 2018-04-29.18:40:00.886>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2018-04-29.18:40:00.886>
    actor = 'Mark.Shannon'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2018-04-20.12:57:21.721>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33318
    keywords = ['patch']
    message_count = 5.0
    messages = ['315508', '315529', '315530', '315549', '315552']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'rhettinger', 'ncoghlan', 'pitrou', 'benjamin.peterson', 'methane', 'serhiy.storchaka', 'yselivanov']
    pr_nums = ['6545']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue33318'
    versions = ['Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    Explicit tuples of constants are folded in the AST optimizer since 3.7. But peephole.c still contains the code for folding tuples of constants, because there are tuples created by the compiler. For example "[1, 2, *a]", "f(1, 2, *a)", "def f(a=1, b=2)" all create a tuple (1, 2).

    The following PR moves this code from peephole.c into compiler.c. This makes the code a tiny bit clearer, because it works on higher level than a bytecode. An obvious benefit -- the optimization is performed before calculating the depth of the stack, thus it will be more exact.

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Apr 20, 2018
    @rhettinger
    Copy link
    Contributor

    +0 It looks like there is a small net win. Conceptually, constant folding is better done upstream where more semantic information is available (rather than downstream when the meaning has to be deduced from the opcodes). The only drawback is that if unfolded tuples ever leak out of the AST step or get created by the peephole optimization step, then they won't get fixed-up downstream (as functionality gets moved out of peephole.c, it creates more obligations on the AST code generation step to always produce the best possible code because the finally neatening-up step no longer occurs).

    @pitrou
    Copy link
    Member

    pitrou commented Apr 20, 2018

    I'm not convinced by this. One of the points of separating the optimizer from the compiler is that people wanting to study/extend the compiler don't have to filter through optimization-related routines. Separating initial code generation from optimization passes is an established practice in compiler design.

    @ncoghlan
    Copy link
    Contributor

    If I recall Eugene Toder's previous AST-level compilation patches correctly, they handle this problem by putting the AST optimisation step *after* the output of the Py_CF_ONLY_AST step.

    This meant that if you're running the source->AST step explicitly, you still get the original unoptimised AST out, which is what you actually want for a lot of source-code-analysis-or-manipulation related use cases.

    The optimisation steps then all remained in the "AST -> executable opcodes" phase, they were just structured a bit differently due to the addition of an extra step (AST -> optimised AST -> executable opcodes -> optimised opcodes).

    This is more work initially (since it involves building out a completely new compiler pass for the "AST -> optimised AST" step), but means that:

    • existing source->AST use cases are unaffected by the compiler change
    • existing AST->opcode use cases continue to benefit from all of the optimisation passes, regardless of whether those optimisations are applied at the AST or opcode level

    @serhiy-storchaka
    Copy link
    Member Author

    For clarifying, this issue doesn't have any relations to the AST optimizer. Tuples of constants are already optimized at the AST level, but there are other tuples created in code, which don't correspond any AST node. They are worth to be optimized too. That is why we need (and have) two codes for folding tuples of constants -- at high level (in AST) and at low level (in generated bytecode).

    Different kinds of optimizations can be applied at different stages.

    1. On AST. Optimizations applied here are: constant folding (evaluating expressions with unary and binary operators and subscriptions on constants and folding tuples of constants), constant propagation of __debug__, converting "in list/set" into "in tuple/frozenset".

    2. At code generation stage. Optimizations applied here are: some special cases for "if" and "while" with a constant condition, optimizing short-circuit jumps in complex boolean expressions, avoiding generating some code (like for "if" without "else"), special case for literal dicts with constant keys, merging equal constants and names.

    3. On generated intermediate representation of bytecode (a graph of sequences of instructions). Some unreachable code is eliminated here.

    4. On raw bytecode (in peephole.c). Optimizations applied here are: other special cases for "if" and "while" with a constant condition, folding tuples of constants created at code generation stage, special cases for multitarget assignments (1-3 targets), optimizing more short-circuit jumps in complex boolean expressions, replacing jumps to unconditional jumps, removing unreachable code after RETURN_VALUE.

    5. In the code object constructor (it is applied also when unmarshal code). Names and identifier-like string constants (including nested in tuples and frozensets) are interned here.

    PR 6545 moves a particular optimization from stage 4 to stage 3 (to a tiny bit higher level). bpo-32477 moves other 2 or 3 optimizations from stage 4 to stage 3.

    It is possible to move this particular optimization even to stage 2, but I'm not sure that mixing code generation with optimization would look better. At stage 3 it is more isolated.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @erlend-aasland erlend-aasland removed the 3.8 only security fixes label Jan 12, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants