New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Produce optimized code for boolean conditions #74686
Comments
The peephole optimizer optimizes some boolean conditions. For example in "if not a:" it replaces UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE, and in "if a and b:" it makes checking the boolean value of a only once. But it is unable to optimize more complex conditions, like "if not a and b:". Proposed patch makes the compiler producing optimized code for conditions. It supports expressions with arbitrary complexity containing the "not" operator, the "and" and "or" binary operators, the "if" ternary operator, and chained comparisons, used as conditions in the ternary operator, in the "if", "while" and "assert" statements, and in generator expressions and comprehensions. It would be possible to remove the part of the peepholer optimizer, but it is needed also for optimizing the "and" and "or" operators in non-boolean context. Doing this optimization in the compiler is possible but too cumbersome, it requires 3 times more code that in the proposed patch. May be I'll find the more general solution in future. |
Some examples. if not a and b: x Unpatched: Patched: if a <= b < c: x Unpatched: Patched: if not (a and b) and c: x Unpatched: Patched: Note that the __bool__() method of every value is evaluated only once. |
It would be hard to get the same optimization in the peepholer. For the first example, "if not a and b:", it is easy, just replace UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE. The more complex third example, "if not (a and b) and c:", would need multiple passes (and more complex examples can need more passes, having the quadratic complexity in corner cases). The second example, with chained comparisons, is the most complex. It uses the fact that after the conditional jump and some stack operations we got a false value on the stack. This is too complex for the peepholer. |
Any high-level benchmarks (i.e. other than pybench and similar programs) impacted by this? |
Will this negatively impact code coverage reporting or code tracing (by optimizing across basic blocks)? |
The improved code does look better. I'm +1 on this proposal as long as we're sure it doesn't introduce any new problems. |
Sorry, I don't have any data. Currently I'm not able to run large benchmarks. I would be surprised if this increases the performance of a macrobenchmark even by 1%. But this change is a step in the direction of getting rid from the peephole optimizer.
I can't imagine the case. The compiler doesn't generate separate line numbers neither for JUMP instructions, nor for auxilary stack manipulating instructions in chained comparison, nor for UNARY_NOT. And these are the only things that are changed. Line number marks are left the same in optimized code, and they points to the same instructions. I think that coverage tools and debugger are not affected. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: