classification
Title: Move jumps optimization from the peepholer to the compiler
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, brett.cannon, inada.naoki, ncoghlan, pitrou, rhettinger, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2018-01-01 22:03 by serhiy.storchaka, last changed 2019-03-07 15:44 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 5077 open serhiy.storchaka, 2018-01-01 22:06
Messages (8)
msg309351 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-01-01 22:03
The proposed patch moves jumps optimization from the peepholer to the compiler. The optimization is performed for lists of instructions before generating concrete bytecode and is not restricted by the size of bytecode instructions. It allows to optimize more long chains of jumps.

This is a step to getting rid of the peepholer.
msg309432 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-01-03 21:03
> This is a step to getting rid of the peepholer.

There is no goal to get rid of the peepholer optimizer.  Please don't invent this as a requirement.

If some optimization are more easily performed upstream, then go ahead and move them upstream.  The goal is to have a net reduction in complexity, not just the elimination of peephole.c just because you decided it should disappear.
msg309439 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-01-03 21:37
Actually moving any of remaining optimizations (including two optimizations that are moved by this issue and two other that are left) slightly increases the complexity. But instead the optimizations become more general. At the end, after moving all optimizations, we could drop the auxiliary code in the peepholer (building the table of basic blocks, removing NOPs, fixing up jump targets) and the net complexity will be reduced. This can also reduce the time and memory consumption of compilation.

Thus getting rid of the peepholer optimizer working with the bytecode is my goal. But some peephole optimizations will work with the intermediate fixed-size representations used by the compiler, before generating the concrete bytecode.
msg309440 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-01-03 21:52
> But some peephole optimizations will work with the intermediate fixed-size representations used by the compiler, before generating the concrete bytecode.

Then perhaps the goal should be to make the peephole operate on that intermediate representation?

I think it's valuable to have a separate C module for optimizations, instead of cramming them in compiler.c.
msg309445 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-01-03 22:30
> I think it's valuable to have a separate C module for optimizations, instead of cramming them in compiler.c.

This would require major rewriting. PR 5077 adds just two functions in compile.c. But they use 4 structures defined locally in this file. If move these two functions in a separate file we will need to move these 4 structures and related definitions to the common header file. And maybe it will be in vain if once all these optimizations will be moved to other parts of the compiler. Some of jumps optimization already are performed on instructions emitting stage. Dead code elimination perhaps could be performed at the stage of assembling the bytecode from basic blocks. Constant tuples folding could be implemented by at least three ways besides the current implementation, I'm testing what of them is the simplest.
msg309446 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-01-03 22:33
Well, on the other hand, the change you're proposing is hardly necessary, unless it actually demonstrates a noticeable improvement on some workload.  So perhaps we should leave the peepholer alone until someone has a better idea.
msg330929 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-03 10:39
Simplified PR 5077. It uses now NEXT_BLOCK() to guarantee that a new block is always used after jumps. NEXT_BLOCK() was defined, but not used before.

bpo-35193 and bpo-9566 demonstrate that the code of peephole.c is complex and error-prone. Moving it to the upper level makes it more error-proof and general.
msg337397 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-07 15:44
After analyzing the difference between bytecodes generated by the current peepholer and the new optimizer I have found that he peepholer do not optimizes jumps in case of some multiline expressions. For example:

[x
 for x in a if x]

  1           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                12 (to 18)

  2           6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 POP_JUMP_IF_FALSE       16

  1          12 LOAD_FAST                1 (x)
             14 LIST_APPEND              2
        >>   16 JUMP_ABSOLUTE            4
        >>   18 RETURN_VALUE

if x:
    if (y and
        z):
        foo()
else:
    bar()

  1           0 LOAD_NAME                0 (x)
              2 POP_JUMP_IF_FALSE       20

  2           4 LOAD_NAME                1 (y)
              6 POP_JUMP_IF_FALSE       18

  3           8 LOAD_NAME                2 (z)

  2          10 POP_JUMP_IF_FALSE       18

  4          12 LOAD_NAME                3 (foo)
             14 CALL_FUNCTION            0
             16 POP_TOP
        >>   18 JUMP_FORWARD             6 (to 26)

  6     >>   20 LOAD_NAME                4 (bar)
             22 CALL_FUNCTION            0
             24 POP_TOP
        >>   26 LOAD_CONST               0 (None)
             28 RETURN_VALUE

It preserves jumps to jump instructions. I do not know the cause. All works with single-line expressions.

The new optimizer does not have this bug.
History
Date User Action Args
2019-03-07 15:44:58serhiy.storchakasetmessages: + msg337397
2018-12-03 10:39:42serhiy.storchakasetmessages: + msg330929
versions: + Python 3.8, - Python 3.7
2018-01-03 22:33:01pitrousetmessages: + msg309446
2018-01-03 22:30:04serhiy.storchakasetmessages: + msg309445
2018-01-03 21:52:19pitrousetmessages: + msg309440
2018-01-03 21:37:25serhiy.storchakasetmessages: + msg309439
2018-01-03 21:03:18rhettingersetnosy: + rhettinger
messages: + msg309432
2018-01-02 08:59:44serhiy.storchakasetnosy: + pitrou, inada.naoki
2018-01-01 22:06:04serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request4952
2018-01-01 22:03:27serhiy.storchakacreate