Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce optimized code for boolean conditions #74686

Closed
serhiy-storchaka opened this issue May 29, 2017 · 8 comments
Closed

Produce optimized code for boolean conditions #74686

serhiy-storchaka opened this issue May 29, 2017 · 8 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@serhiy-storchaka
Copy link
Member

BPO 30501
Nosy @brettcannon, @rhettinger, @ncoghlan, @pitrou, @benjaminp, @serhiy-storchaka, @1st1, @emilyemorehouse
PRs
  • bpo-30501: Make the compiler producing optimized code for condition expressions. #1851
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-06-11.11:51:06.123>
    created_at = <Date 2017-05-29.06:58:36.457>
    labels = ['interpreter-core', '3.7', 'performance']
    title = 'Produce optimized code for boolean conditions'
    updated_at = <Date 2017-06-11.11:51:06.122>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2017-06-11.11:51:06.122>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-06-11.11:51:06.123>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2017-05-29.06:58:36.457>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30501
    keywords = []
    message_count = 8.0
    messages = ['294674', '294680', '294681', '294693', '294713', '294714', '294743', '295705']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'rhettinger', 'ncoghlan', 'pitrou', 'benjamin.peterson', 'serhiy.storchaka', 'yselivanov', 'emilyemorehouse']
    pr_nums = ['1851']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue30501'
    versions = ['Python 3.7']

    @serhiy-storchaka
    Copy link
    Member Author

    The peephole optimizer optimizes some boolean conditions. For example in "if not a:" it replaces UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE, and in "if a and b:" it makes checking the boolean value of a only once. But it is unable to optimize more complex conditions, like "if not a and b:".

    Proposed patch makes the compiler producing optimized code for conditions. It supports expressions with arbitrary complexity containing the "not" operator, the "and" and "or" binary operators, the "if" ternary operator, and chained comparisons, used as conditions in the ternary operator, in the "if", "while" and "assert" statements, and in generator expressions and comprehensions.

    It would be possible to remove the part of the peepholer optimizer, but it is needed also for optimizing the "and" and "or" operators in non-boolean context. Doing this optimization in the compiler is possible but too cumbersome, it requires 3 times more code that in the proposed patch. May be I'll find the more general solution in future.

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels May 29, 2017
    @serhiy-storchaka
    Copy link
    Member Author

    Some examples.

    if not a and b: x

    Unpatched:
    1 0 LOAD_NAME 0 (a)
    2 UNARY_NOT
    4 POP_JUMP_IF_FALSE 14
    6 LOAD_NAME 1 (b)
    8 POP_JUMP_IF_FALSE 14
    10 LOAD_NAME 2 (x)
    12 POP_TOP
    >> 14 LOAD_CONST 0 (None)
    16 RETURN_VALUE

    Patched:
    1 0 LOAD_NAME 0 (a)
    2 POP_JUMP_IF_TRUE 12
    4 LOAD_NAME 1 (b)
    6 POP_JUMP_IF_FALSE 12
    8 LOAD_NAME 2 (x)
    10 POP_TOP
    >> 12 LOAD_CONST 0 (None)
    14 RETURN_VALUE

    if a <= b < c: x

    Unpatched:
    1 0 LOAD_NAME 0 (a)
    2 LOAD_NAME 1 (b)
    4 DUP_TOP
    6 ROT_THREE
    8 COMPARE_OP 1 (<=)
    10 JUMP_IF_FALSE_OR_POP 18
    12 LOAD_NAME 2 (c)
    14 COMPARE_OP 0 (<)
    16 JUMP_FORWARD 4 (to 22)
    >> 18 ROT_TWO
    20 POP_TOP
    >> 22 POP_JUMP_IF_FALSE 28
    24 LOAD_NAME 3 (x)
    26 POP_TOP
    >> 28 LOAD_CONST 0 (None)
    30 RETURN_VALUE

    Patched:
    1 0 LOAD_NAME 0 (a)
    2 LOAD_NAME 1 (b)
    4 DUP_TOP
    6 ROT_THREE
    8 COMPARE_OP 1 (<=)
    10 POP_JUMP_IF_FALSE 20
    12 LOAD_NAME 2 (c)
    14 COMPARE_OP 0 (<)
    16 POP_JUMP_IF_FALSE 28
    18 JUMP_FORWARD 4 (to 24)
    >> 20 POP_TOP
    22 JUMP_FORWARD 4 (to 28)
    >> 24 LOAD_NAME 3 (x)
    26 POP_TOP
    >> 28 LOAD_CONST 0 (None)
    30 RETURN_VALUE

    if not (a and b) and c: x

    Unpatched:
    1 0 LOAD_NAME 0 (a)
    2 JUMP_IF_FALSE_OR_POP 6
    4 LOAD_NAME 1 (b)
    >> 6 UNARY_NOT
    8 POP_JUMP_IF_FALSE 18
    10 LOAD_NAME 2 (c)
    12 POP_JUMP_IF_FALSE 18
    14 LOAD_NAME 3 (x)
    16 POP_TOP
    >> 18 LOAD_CONST 0 (None)
    20 RETURN_VALUE

    Patched:
    1 0 LOAD_NAME 0 (a)
    2 POP_JUMP_IF_FALSE 8
    4 LOAD_NAME 1 (b)
    6 POP_JUMP_IF_TRUE 16
    >> 8 LOAD_NAME 2 (c)
    10 POP_JUMP_IF_FALSE 16
    12 LOAD_NAME 3 (x)
    14 POP_TOP
    >> 16 LOAD_CONST 0 (None)
    18 RETURN_VALUE

    Note that the __bool__() method of every value is evaluated only once.

    @serhiy-storchaka
    Copy link
    Member Author

    It would be hard to get the same optimization in the peepholer.

    For the first example, "if not a and b:", it is easy, just replace UNARY_NOT+POP_JUMP_IF_FALSE with POP_JUMP_IF_TRUE. The more complex third example, "if not (a and b) and c:", would need multiple passes (and more complex examples can need more passes, having the quadratic complexity in corner cases). The second example, with chained comparisons, is the most complex. It uses the fact that after the conditional jump and some stack operations we got a false value on the stack. This is too complex for the peepholer.

    @pitrou
    Copy link
    Member

    pitrou commented May 29, 2017

    Any high-level benchmarks (i.e. other than pybench and similar programs) impacted by this?

    @rhettinger
    Copy link
    Contributor

    Will this negatively impact code coverage reporting or code tracing (by optimizing across basic blocks)?

    @rhettinger
    Copy link
    Contributor

    The improved code does look better. I'm +1 on this proposal as long as we're sure it doesn't introduce any new problems.

    @serhiy-storchaka
    Copy link
    Member Author

    Any high-level benchmarks (i.e. other than pybench and similar programs) impacted by this?

    Sorry, I don't have any data. Currently I'm not able to run large benchmarks. I would be surprised if this increases the performance of a macrobenchmark even by 1%. But this change is a step in the direction of getting rid from the peephole optimizer.

    Will this negatively impact code coverage reporting or code tracing (by optimizing across basic blocks)?

    I can't imagine the case. The compiler doesn't generate separate line numbers neither for JUMP instructions, nor for auxilary stack manipulating instructions in chained comparison, nor for UNARY_NOT. And these are the only things that are changed. Line number marks are left the same in optimized code, and they points to the same instructions. I think that coverage tools and debugger are not affected.

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 36ff451 by Serhiy Storchaka in branch 'master':
    bpo-30501: Make the compiler producing optimized code for condition expressions. (bpo-1851)
    36ff451

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants