classification
Title: Produce optimized code for boolean conditions
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, brett.cannon, emilyemorehouse, ncoghlan, pitrou, rhettinger, serhiy.storchaka, yselivanov
Priority: normal Keywords:

Created on 2017-05-29 06:58 by serhiy.storchaka, last changed 2017-06-11 11:51 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1851 merged serhiy.storchaka, 2017-05-29 07:11
Messages (8)
msg294674 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-29 06:58
The peephole optimizer optimizes some boolean conditions. For example in "if not a:" it replaces UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE, and in "if a and b:" it makes checking the boolean value of a only once. But it is unable to optimize more complex conditions, like "if not a and b:".

Proposed patch makes the compiler producing optimized code for conditions. It supports expressions with arbitrary complexity containing the "not" operator, the "and" and "or" binary operators, the "if" ternary operator, and chained comparisons, used as conditions in the ternary operator, in the "if", "while" and "assert" statements, and in generator expressions and comprehensions.

It would be possible to remove the part of the peepholer optimizer, but it is needed also for optimizing the "and" and "or" operators in non-boolean context. Doing this optimization in the compiler is possible but too cumbersome, it requires 3 times more code that in the proposed patch. May be I'll find the more general solution in future.
msg294680 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-29 07:19
Some examples.

if not a and b: x

Unpatched:
  1           0 LOAD_NAME                0 (a)
              2 UNARY_NOT
              4 POP_JUMP_IF_FALSE       14
              6 LOAD_NAME                1 (b)
              8 POP_JUMP_IF_FALSE       14
             10 LOAD_NAME                2 (x)
             12 POP_TOP
        >>   14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

Patched:
  1           0 LOAD_NAME                0 (a)
              2 POP_JUMP_IF_TRUE        12
              4 LOAD_NAME                1 (b)
              6 POP_JUMP_IF_FALSE       12
              8 LOAD_NAME                2 (x)
             10 POP_TOP
        >>   12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

if a <= b < c: x

Unpatched:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 DUP_TOP
              6 ROT_THREE
              8 COMPARE_OP               1 (<=)
             10 JUMP_IF_FALSE_OR_POP    18
             12 LOAD_NAME                2 (c)
             14 COMPARE_OP               0 (<)
             16 JUMP_FORWARD             4 (to 22)
        >>   18 ROT_TWO
             20 POP_TOP
        >>   22 POP_JUMP_IF_FALSE       28
             24 LOAD_NAME                3 (x)
             26 POP_TOP
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

Patched:
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 DUP_TOP
              6 ROT_THREE
              8 COMPARE_OP               1 (<=)
             10 POP_JUMP_IF_FALSE       20
             12 LOAD_NAME                2 (c)
             14 COMPARE_OP               0 (<)
             16 POP_JUMP_IF_FALSE       28
             18 JUMP_FORWARD             4 (to 24)
        >>   20 POP_TOP
             22 JUMP_FORWARD             4 (to 28)
        >>   24 LOAD_NAME                3 (x)
             26 POP_TOP
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

if not (a and b) and c: x

Unpatched:
  1           0 LOAD_NAME                0 (a)
              2 JUMP_IF_FALSE_OR_POP     6
              4 LOAD_NAME                1 (b)
        >>    6 UNARY_NOT
              8 POP_JUMP_IF_FALSE       18
             10 LOAD_NAME                2 (c)
             12 POP_JUMP_IF_FALSE       18
             14 LOAD_NAME                3 (x)
             16 POP_TOP
        >>   18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

Patched:
  1           0 LOAD_NAME                0 (a)
              2 POP_JUMP_IF_FALSE        8
              4 LOAD_NAME                1 (b)
              6 POP_JUMP_IF_TRUE        16
        >>    8 LOAD_NAME                2 (c)
             10 POP_JUMP_IF_FALSE       16
             12 LOAD_NAME                3 (x)
             14 POP_TOP
        >>   16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

Note that the __bool__() method of every value is evaluated only once.
msg294681 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-29 07:43
It would be hard to get the same optimization in the peepholer.

For the first example, "if not a and b:", it is easy, just replace UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE. The more complex third example, "if not (a and b) and c:", would need multiple passes (and more complex examples can need more passes, having the quadratic complexity in corner cases). The second example, with chained comparisons, is the most complex. It uses the fact that after the conditional jump and some stack operations we got a false value on the stack. This is too complex for the peepholer.
msg294693 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-05-29 13:00
Any high-level benchmarks (i.e. other than pybench and similar programs) impacted by this?
msg294713 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-29 21:04
Will this negatively impact code coverage reporting or code tracing (by optimizing across basic blocks)?
msg294714 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-29 21:16
The improved code does look better.  I'm +1 on this proposal as long as we're sure it doesn't introduce any new problems.
msg294743 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-30 07:25
> Any high-level benchmarks (i.e. other than pybench and similar programs) impacted by this?

Sorry, I don't have any data. Currently I'm not able to run large benchmarks. I would be surprised if this increases the performance of a macrobenchmark even by 1%. But this change is a step in the direction of getting rid from the peephole optimizer.

> Will this negatively impact code coverage reporting or code tracing (by optimizing across basic blocks)?

I can't imagine the case. The compiler doesn't generate separate line numbers neither for JUMP instructions, nor for auxilary stack manipulating instructions in chained comparison, nor for UNARY_NOT. And these are the only things that are changed. Line number marks are left the same in optimized code, and they points to the same instructions. I think that coverage tools and debugger are not affected.
msg295705 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-06-11 11:50
New changeset 36ff451ebae41f09560bff582c95946474d898f8 by Serhiy Storchaka in branch 'master':
bpo-30501: Make the compiler producing optimized code for condition expressions. (#1851)
https://github.com/python/cpython/commit/36ff451ebae41f09560bff582c95946474d898f8
History
Date User Action Args
2017-06-11 11:51:06serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-06-11 11:50:24serhiy.storchakasetmessages: + msg295705
2017-05-30 07:25:48serhiy.storchakasetmessages: + msg294743
2017-05-29 23:51:13emilyemorehousesetnosy: + emilyemorehouse
2017-05-29 21:16:36rhettingersetmessages: + msg294714
2017-05-29 21:04:47rhettingersetmessages: + msg294713
2017-05-29 13:00:54pitrousetnosy: + pitrou
messages: + msg294693
2017-05-29 07:43:21serhiy.storchakasetmessages: + msg294681
2017-05-29 07:19:43serhiy.storchakasetmessages: + msg294680
2017-05-29 07:11:40serhiy.storchakasetpull_requests: + pull_request1934
2017-05-29 06:58:36serhiy.storchakacreate