Move folding tuples of constants into compiler.c from peephole.c #77499

serhiy-storchaka · 2018-04-20T12:57:22Z

BPO	33318
Nosy	@brettcannon, @rhettinger, @ncoghlan, @pitrou, @benjaminp, @methane, @serhiy-storchaka, @1st1
PRs	gh-77499: Move folding tuples of constants from peephole.c into compiler.c #6545

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2018-04-20.12:57:21.721>
labels = ['interpreter-core', 'type-feature', '3.8']
title = 'Move folding tuples of constants into compiler.c from peephole.c'
updated_at = <Date 2018-04-29.18:40:00.886>
user = 'https://github.com/serhiy-storchaka'

bugs.python.org fields:

activity = <Date 2018-04-29.18:40:00.886>
actor = 'Mark.Shannon'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2018-04-20.12:57:21.721>
creator = 'serhiy.storchaka'
dependencies = []
files = []
hgrepos = []
issue_num = 33318
keywords = ['patch']
message_count = 5.0
messages = ['315508', '315529', '315530', '315549', '315552']
nosy_count = 8.0
nosy_names = ['brett.cannon', 'rhettinger', 'ncoghlan', 'pitrou', 'benjamin.peterson', 'methane', 'serhiy.storchaka', 'yselivanov']
pr_nums = ['6545']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue33318'
versions = ['Python 3.8']

serhiy-storchaka · 2018-04-20T12:57:22Z

Explicit tuples of constants are folded in the AST optimizer since 3.7. But peephole.c still contains the code for folding tuples of constants, because there are tuples created by the compiler. For example "[1, 2, *a]", "f(1, 2, *a)", "def f(a=1, b=2)" all create a tuple (1, 2).

The following PR moves this code from peephole.c into compiler.c. This makes the code a tiny bit clearer, because it works on higher level than a bytecode. An obvious benefit -- the optimization is performed before calculating the depth of the stack, thus it will be more exact.

rhettinger · 2018-04-20T19:45:09Z

+0 It looks like there is a small net win. Conceptually, constant folding is better done upstream where more semantic information is available (rather than downstream when the meaning has to be deduced from the opcodes). The only drawback is that if unfolded tuples ever leak out of the AST step or get created by the peephole optimization step, then they won't get fixed-up downstream (as functionality gets moved out of peephole.c, it creates more obligations on the AST code generation step to always produce the best possible code because the finally neatening-up step no longer occurs).

pitrou · 2018-04-20T19:54:44Z

I'm not convinced by this. One of the points of separating the optimizer from the compiler is that people wanting to study/extend the compiler don't have to filter through optimization-related routines. Separating initial code generation from optimization passes is an established practice in compiler design.

ncoghlan · 2018-04-21T05:29:51Z

If I recall Eugene Toder's previous AST-level compilation patches correctly, they handle this problem by putting the AST optimisation step *after* the output of the Py_CF_ONLY_AST step.

This meant that if you're running the source->AST step explicitly, you still get the original unoptimised AST out, which is what you actually want for a lot of source-code-analysis-or-manipulation related use cases.

The optimisation steps then all remained in the "AST -> executable opcodes" phase, they were just structured a bit differently due to the addition of an extra step (AST -> optimised AST -> executable opcodes -> optimised opcodes).

This is more work initially (since it involves building out a completely new compiler pass for the "AST -> optimised AST" step), but means that:

existing source->AST use cases are unaffected by the compiler change
existing AST->opcode use cases continue to benefit from all of the optimisation passes, regardless of whether those optimisations are applied at the AST or opcode level

serhiy-storchaka · 2018-04-21T07:17:10Z

For clarifying, this issue doesn't have any relations to the AST optimizer. Tuples of constants are already optimized at the AST level, but there are other tuples created in code, which don't correspond any AST node. They are worth to be optimized too. That is why we need (and have) two codes for folding tuples of constants -- at high level (in AST) and at low level (in generated bytecode).

Different kinds of optimizations can be applied at different stages.

On AST. Optimizations applied here are: constant folding (evaluating expressions with unary and binary operators and subscriptions on constants and folding tuples of constants), constant propagation of __debug__, converting "in list/set" into "in tuple/frozenset".
At code generation stage. Optimizations applied here are: some special cases for "if" and "while" with a constant condition, optimizing short-circuit jumps in complex boolean expressions, avoiding generating some code (like for "if" without "else"), special case for literal dicts with constant keys, merging equal constants and names.
On generated intermediate representation of bytecode (a graph of sequences of instructions). Some unreachable code is eliminated here.
On raw bytecode (in peephole.c). Optimizations applied here are: other special cases for "if" and "while" with a constant condition, folding tuples of constants created at code generation stage, special cases for multitarget assignments (1-3 targets), optimizing more short-circuit jumps in complex boolean expressions, replacing jumps to unconditional jumps, removing unreachable code after RETURN_VALUE.
In the code object constructor (it is applied also when unmarshal code). Names and identifier-like string constants (including nested in tuples and frozensets) are interned here.

PR 6545 moves a particular optimization from stage 4 to stage 3 (to a tiny bit higher level). bpo-32477 moves other 2 or 3 optimizations from stage 4 to stage 3.

It is possible to move this particular optimization even to stage 2, but I'm not sure that mixing code generation with optimization would look better. At stage 3 it is more isolated.

serhiy-storchaka added 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Apr 20, 2018

ezio-melotti transferred this issue from another repository Apr 10, 2022

bedevere-app bot mentioned this issue Jan 12, 2024

gh-77499: Move folding tuples of constants from peephole.c into compiler.c #6545

Open

erlend-aasland removed the 3.8 only security fixes label Jan 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move folding tuples of constants into compiler.c from peephole.c #77499

Move folding tuples of constants into compiler.c from peephole.c #77499

serhiy-storchaka commented Apr 20, 2018

serhiy-storchaka commented Apr 20, 2018

rhettinger commented Apr 20, 2018

pitrou commented Apr 20, 2018

ncoghlan commented Apr 21, 2018

serhiy-storchaka commented Apr 21, 2018

Move folding tuples of constants into compiler.c from peephole.c #77499

Move folding tuples of constants into compiler.c from peephole.c #77499

Comments

serhiy-storchaka commented Apr 20, 2018

serhiy-storchaka commented Apr 20, 2018

rhettinger commented Apr 20, 2018

pitrou commented Apr 20, 2018

ncoghlan commented Apr 21, 2018

serhiy-storchaka commented Apr 21, 2018