Add mechanism to disable optimizations #46758

Closed
nedbat opened this issue Mar 29, 2008 · 54 comments
Labels: 3.8 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (A feature request or enhancement)

Comments

@nedbat (Member) commented Mar 29, 2008

BPO 2506
Nosy @warsaw, @brettcannon, @rhettinger, @terryjreedy, @amauryfa, @abalkin, @vstinner, @devdanzin, @nedbat, @merwok, @alex, @florentx, @ethanfurman, @markshannon, @ericsnowcurrently, @serhiy-storchaka, @1st1, @dianaclarke, @skirpichev, @artgoldberg, @pablogsal
PRs
  • bpo-2506: Experiment with adding a "-X noopt" flag #9693
  • bpo-2506: Add -X noopt command line option #13600
  • bpo-2506: Add mechanism to disable optimizations #22027
Files
  • continue.py: Test file to trace to see problem.
  • traceme.py

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-01-13.15:54:30.125>
    created_at = <Date 2008-03-29.13:38:40.439>
    labels = ['interpreter-core', 'type-feature', '3.8']
    title = 'Add mechanism to disable optimizations'
    updated_at = <Date 2021-01-13.15:54:30.124>
    user = 'https://github.com/nedbat'

    bugs.python.org fields:

    activity = <Date 2021-01-13.15:54:30.124>
    actor = 'Mark.Shannon'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-01-13.15:54:30.125>
    closer = 'Mark.Shannon'
    components = ['Interpreter Core']
    creation = <Date 2008-03-29.13:38:40.439>
    creator = 'nedbat'
    dependencies = []
    files = ['9888', '48371']
    hgrepos = []
    issue_num = 2506
    keywords = ['patch']
    message_count = 54.0
    messages = ['64692', '64699', '64713', '64716', '64720', '64725', '64726', '64729', '64754', '64764', '64765', '64768', '64774', '64775', '64809', '64957', '140281', '140290', '140303', '218811', '218814', '218816', '218829', '218869', '218891', '218892', '218893', '218894', '218923', '221250', '253345', '279493', '279494', '304195', '316928', '316930', '316935', '318510', '324296', '327021', '327022', '327036', '333185', '333245', '335013', '343717', '343752', '350448', '350560', '350576', '363926', '376147', '383615', '385044']
    nosy_count = 25.0
    nosy_names = ['barry', 'brett.cannon', 'rhettinger', 'terry.reedy', 'amaury.forgeotdarc', 'belopolsky', 'vstinner', 'ajaksu2', 'nedbat', 'eric.araujo', 'alex', 'flox', 'THRlWiTi', 'ethan.furman', 'Mark.Shannon', 'tshepang', 'eric.snow', 'serhiy.storchaka', 'yselivanov', 'diana', 'Trip.Volpe', 'pgimeno', 'Sergey.Kirpichev', 'ArthurGoldberg', 'pablogsal']
    pr_nums = ['9693', '13600', '22027']
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue2506'
    versions = ['Python 3.8']

    @nedbat (Member Author) commented Mar 29, 2008

    When tracing line execution with sys.settrace, a particular code
    structure fails to report an executed line. The line is a continue
    statement after an if condition in which the if condition is true every
    time it is executed.

    Attached is a file with two copies of the same code, except in the first
    the if condition is always true, and in the second it is sometimes true.
    In the first, trace.py reports that the continue is never executed,
    even though it is (as evidenced from the values of a, b, and c after
    execution).

    In the second code, the continue is properly reported.

    This bug has been present since version 2.3. 2.2 does not exhibit it
    (trace.py didn't exist in 2.2, but coverage.py shows the problem also).

    To see the problem, execute "trace.py -c -m continue.py". Then
    continue.py.cover will show:

        1: a = b = c = 0
      101: for n in range(100):
      100:     if n % 2:
       50:         if n % 4:
       50:             a += 1
    >>>>>>         continue
               else:
       50:         b += 1
       50:     c += 1
        1: assert a == 50 and b == 50 and c == 50
        1: a = b = c = 0
      101: for n in range(100):
      100:     if n % 2:
       50:         if n % 3:
       33:             a += 1
       17:         continue
               else:
       50:         b += 1
       50:     c += 1
        1: assert a == 33 and b == 50 and c == 50
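    For readers following along, the tracing machinery involved is sys.settrace, which trace.py and coverage.py build on: the interpreter invokes a callback for every "line" event it generates. A minimal sketch (not from the original report; standard library only) that prints each line event:

        import sys

        def show_lines(frame, event, arg):
            # Print every "line" event the interpreter reports; a line whose
            # bytecode was jumped over by the optimizer never produces one.
            if event == "line":
                print(f"line {frame.f_lineno} in {frame.f_code.co_name}")
            return show_lines

        def f(x):
            for i in range(2):
                if x:
                    pass
                continue

        sys.settrace(show_lines)
        f(True)   # on versions affected by the optimization discussed here,
        f(False)  # the `continue` line event is missing from the f(True) run
        sys.settrace(None)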

    @nedbat added the type-bug (An unexpected behavior, bug, or error) and interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Mar 29, 2008
    @amauryfa (Member)

    This is because of a "peephole" optimization of the generated bytecode: a
    jump instruction whose target is another jump instruction can be modified
    to target the final location.

    You gain a few opcodes, but tracing is confusing...
    Not sure how to fix this, though.
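    To see the jump threading Amaury describes on your own interpreter, you can scan a function's bytecode for jumps that land on another unconditional jump. A rough sketch (not from this thread; opcode names vary across CPython versions, so treat the set below as illustrative):

        import dis

        UNCONDITIONAL = {"JUMP_FORWARD", "JUMP_ABSOLUTE", "JUMP_BACKWARD"}

        def jumps_to_jumps(func):
            """Yield (offset, target_offset) pairs where a jump lands on a jump."""
            by_offset = {i.offset: i for i in dis.get_instructions(func)}
            for instr in by_offset.values():
                if "JUMP" in instr.opname:
                    # dis resolves a jump's argval to its target bytecode offset.
                    target = by_offset.get(instr.argval)
                    if target is not None and target.opname in UNCONDITIONAL:
                        yield instr.offset, target.offset

    On an optimized interpreter this typically finds nothing, precisely because the peepholer has already collapsed such chains -- and that collapse is what hides the continue line from the tracer.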

    @abalkin (Member) commented Mar 29, 2008

    I think this is not a bug. Here is a simpler way to illustrate the
    issue:

    def f(x): 
        for i in range(10): 
            if x: 
                pass 
            continue 
    f(True)
    f(False)

    If you run the code above under trace, you get the following coverage:

     1: def f(x):
    22:     for i in range(10):
    20:         if x:
    10:             pass
    10:         continue
     1: f(True)
     1: f(False)

    Note that the 'continue' line is executed 10 times instead of the
    expected 20. This happens exactly as Amaury explained. If you
    disassemble f, you'll see

      2           0 SETUP_LOOP              34 (to 37)
                  3 LOAD_GLOBAL              0 (range)
                  6 LOAD_CONST               1 (10)
                  9 CALL_FUNCTION            1
                 12 GET_ITER
            >>   13 FOR_ITER                20 (to 36)
                 16 STORE_FAST               1 (i)

      3          19 LOAD_FAST                0 (x)
                 22 JUMP_IF_FALSE            4 (to 29)
                 25 POP_TOP

      4          26 JUMP_ABSOLUTE           13
            >>   29 POP_TOP

      5          30 JUMP_ABSOLUTE           13
                 33 JUMP_ABSOLUTE           13
            >>   36 POP_BLOCK
            >>   37 LOAD_CONST               0 (None)
                 40 RETURN_VALUE

    Note how the peephole optimizer replaced the jump from the 'pass' line (4)
    to the 'continue' line (5) with a jump to the 'for' line, replacing

      4          26 JUMP_FORWARD             1 (to 30)

    with

      4          26 JUMP_ABSOLUTE           13

    I say this is not a bug because trace is correct in showing that the
    continue statement is never reached when executing f(True).

    @nedbat (Member Author) commented Mar 29, 2008

    I see that the cause of the problem is the peephole optimizer. That
    doesn't mean this isn't a problem.

    I am measuring the code coverage of a set of tests, and one of my lines
    is being marked as not executed. This is not the fault of the tests,
    because in fact, without the optimization, the line would be executed.
    Conceptually, the line has been executed (the loop is restarted, rather
    than execution continuing).

    I don't know what the solution to this is. Some options include fixing
    the line tracing code to somehow indicate that the continue was
    executed; or providing a way to disable peephole optimization for times
    when accurate execution tracing is more important than speed.

    @abalkin (Member) commented Mar 29, 2008

    On Sat, Mar 29, 2008 at 2:51 PM, Ned Batchelder <report@bugs.python.org> wrote:

    Ned Batchelder <nedbat@users.sourceforge.net> added the comment:

    > I am measuring the code coverage of a set of tests, and one of my lines
    > is being marked as not executed. This is not the fault of the tests,
    > because in fact, without the optimization, the line would be executed.
    > Conceptually, the line has been executed (the loop is restarted, rather
    > than execution continuing).

    .. but the continue statement on line 5 is NOT executed in the x == True
    case. Note that without optimization, the if statement + the
    continue line translate to

      3          19 LOAD_FAST                0 (x)
                 22 JUMP_IF_FALSE            4 (to 29)
                 25 POP_TOP

      4          26 JUMP_FORWARD             1 (to 30)
            >>   29 POP_TOP

      5     >>   30 JUMP_ABSOLUTE           13

    where the second jump is to the continue statement. The peephole
    optimizer recognizes that the jump target is an unconditional jump and
    changes the code to jump directly to the final target, bypassing the
    continue line. The optimized code is

      3          19 LOAD_FAST                0 (x)
                 22 JUMP_IF_FALSE            4 (to 29)
                 25 POP_TOP

      4          26 JUMP_ABSOLUTE           13
            >>   29 POP_TOP

      5          30 JUMP_ABSOLUTE           13

    If x is true, line five is NOT executed.

    > I don't know what the solution to this is. Some options include fixing
    > the line tracing code to somehow indicate that the continue was
    > executed; or providing a way to disable peephole optimization for times
    > when accurate execution tracing is more important than speed.

    I think it is a good idea to provide a way to disable the peephole
    optimizer. In fact, I recently proposed exactly that in msg64638. My
    only problem is that I would like to follow the gcc tradition and make
    the -O option take an optional numeric argument, with 0 meaning no
    optimization and increasingly aggressive optimization as the argument
    increases. Unfortunately, -O0 will be confusingly similar to -OO.
    Since -OO is not really optimization but rather a "strip" option, it
    should probably migrate to -s or something. In any case, such drastic
    changes to command line options are not acceptable for 2.x, but maybe
    possible for 3.0.

    I can easily implement a -N (no optimization) or -g (debug) option that
    will disable the peephole optimizer if there is support for such a
    feature.

    @amauryfa (Member)

    > Unfortunately -O0 will be confusingly similar to -OO.

    On my browser, both are shown identically at the pixel level.

    Microsoft compilers use -Od to disable optimization...

    @rhettinger (Contributor)

    This has basically almost never been a problem in the real world. No
    need to complicate the world further by adding yet another option and
    the accompanying implementation-specific knowledge of why you would
    ever want to use it.

    Also, when the peepholer is moved (after the AST is created, but before
    the opcodes), then little oddities like this will go away.

    Recommend closing as "won't fix".

    @nedbat (Member Author) commented Mar 29, 2008

    I recognize that this is an unusual case, but it did come up in the real
    world. I found this while measuring test coverage, and the continue
    line was marked as not executed, when it was.

    I don't understand "when the peepholer is moved", so maybe you are right
    that this will no longer be an issue. But it seems to me to be endemic
    to code optimization to lose the one-to-one correspondence between
    source lines and ranges of bytecodes. And as the compiler becomes more
    complex and performs more optimizations, problems like this will likely
    increase, no?

    In any case, I'd like to know more about the changes planned for the AST
    and compiler...

    @abalkin (Member) commented Mar 30, 2008

    On Sat, Mar 29, 2008 at 4:58 PM, Raymond Hettinger
    <report@bugs.python.org> wrote:

    > This has basically almost never been a problem in the real world.

    I believe Ned gave an important use case. In coverage testing,
    optimized runs can show false gaps in coverage. In addition, a
    no-optimize option would provide a valuable learning tool. Python has an
    excellent, simple VM, very suitable for a case study in introductory CS
    courses. Unfortunately, the inability to disable the peephole optimizer
    makes understanding the resulting bytecode more difficult, particularly
    given some arbitrary choices made by the optimizer (such as 2*3+1 =>
    7, but 1+2*3 => 1+6). Furthermore, as Raymond suggested in another
    thread, the peephole optimizer was deliberately kept to a bare minimum
    out of concern about compilation time. Given that most Python code is
    pre-compiled, I think it is a rare case in which code size/speed
    improvements would not be worth the increased compilation time. In the
    rare case when compilation time is an issue, users can consider
    disabling optimization. Finally, an easy way to disable the optimizer
    would help in developing the optimizer itself, by providing an easy way
    to measure improvements and to debug.

    > No need to complicate the world further by adding yet another option and
    > the accompanying implementation-specific knowledge of why you would
    > ever want to use it.

    This would not really be a new option. Most users expect varying
    levels of optimization with the -O option, and Python already has 3
    levels: plain, -O, and -OO (Py_OptimizeFlag = 0, 1, and 2). In fact,
    Py_OptimizeFlag can be set to an arbitrary positive integer using the
    undocumented -OOO... option. I don't see how anyone would consider
    adding, say, -G with Py_OptimizeFlag = -1 to disable all
    optimization as "complicating the world."

    > Also, when the peepholer is moved (after the AST is created, but before
    > the opcodes), then little oddities like this will go away.

    I don't see how moving optimization up the chain will help with this
    particular issue. Note that the problem is not the peepholer emitting
    erroneous line number information, but the fact that the continue
    statement is optimized away by replacing the if statement's jump to
    continue with a direct jump to the start of the loop. As I stated in
    my first comment, the trace output is correct, and as long as the
    compiler avoids redundant double jumps, the continue statement will not
    show up in the trace regardless of where in the compilation chain it is
    optimized. The only way to get correct coverage information is to
    disable double-jump optimization.

    @rhettinger (Contributor)

    Weigh the cost/benefit carefully before pushing further. I don't doubt
    the legitimacy of the use case, but do think it affects far fewer than
    one percent of Python programmers. In contrast, introducing new
    command line options is a big deal and will cause its own issues
    (possibly needing its own buildbot runs to exercise the non-optimized
    version, having optimized code possibly have subtle differences from
    the code being traced/debugged/profiled, and more importantly the
    mental overhead of having to learn what it is, why it's there, and when
    to use it).

    My feeling is that adding a new compiler option is using a cannon to kill
    a mosquito. If you decide to press the case for this one, it should go
    to python-dev, since command line options affect everyone.

    This little buglet has been around since Py2.3. That we're only
    hearing about it now is a pretty good indicator that this is a very
    minor issue in the Python world and doesn't warrant a heavy-weight
    solution.

    It would be *much* more useful to direct effort toward improving the
    misreporting of the number of arguments given versus those required for
    instance methods:
       >>> a.f(1, 2)
       TypeError: f() takes exactly 1 argument (3 given)

    @nedbat (Member Author) commented Mar 30, 2008

    Raymond, do you have a cannon-less recommendation of how to kill this
    particular mosquito?

    @abalkin (Member) commented Mar 30, 2008

    On Sun, Mar 30, 2008 at 5:01 PM, Raymond Hettinger
    <report@bugs.python.org> wrote:
    ..
    >  It would be *much* more useful to direct effort improving the mis-
    >  reporting of the number of arguments given versus those required for
    >  instance methods:
    >    >>> a.f(1, 2)
    >    TypeError: f() takes exactly 1 argument (3 given)

    Please see bpo-2516.

    @abalkin (Member) commented Mar 31, 2008

    On Sun, Mar 30, 2008 at 5:01 PM, Raymond Hettinger
    <report@bugs.python.org> wrote:
    ..

    > Weigh the cost/benefit carefully before pushing further. I don't doubt
    > the legitimacy of the use case, but do think it affects far fewer than
    > one percent of Python programmers.

    I agree with you, but only because fewer than 1% of Python programmers
    have complete test coverage for their code. :-) On the other hand, I
    wanted a no-optimize option regardless of the trace issue. Once it is
    there, I am sure everyone interested in how python compiler works will
    use it. (I am not sure what % of Python programmers would fall into
    that category.)

    I don't know how big of a deal an extra buildbot is, but I don't think
    it will be necessary. It is hard to imagine an optimization that would
    fix (mask) errors in non-optimized code. Therefore, a non-optimized
    buildbot is unlikely to flag errors that are not present in optimized
    runs. On the other hand, errors introduced by the optimizer will be
    easier to diagnose if they disappear when the code runs without
    optimization.

    Mental overhead is important, but I think it will be easier to explain
    the effect of a no-optimize option than to explain what -O does in the
    current version. As far as I can tell, -O has nothing to do with
    peephole optimization and only removes assert statements and replaces
    __debug__ with 0. I am sure most Python users are not aware of the
    fact that peephole optimization is performed without the -O option.

    > My feeling is that adding a new compiler option is using a cannon to
    > kill a mosquito. If you decide to press the case for this one, it
    > should go to python-dev since command line options affect everyone.

    As an alternative to the command line option, what would you say to
    making sys.flags.optimize writable and disabling the peepholer if
    Py_OptimizeFlag < 0? This would allow Python tracing tools to disable
    optimization from within Python code. The fact that setting
    sys.flags.optimize will not affect modules that are already loaded is
    probably a good thing, because the tracing code itself will run
    optimized. Such tracing tools may also need to use a custom importer
    that would ignore precompiled code and effectively set the
    dont_write_bytecode flag.

    > This little buglet has been around since Py2.3. That we're only
    > hearing about it now is a pretty good indicator that this is a very
    > minor issue in the Python world and doesn't warrant a heavy-weight
    > solution.

    I still maintain that this is not a bug. Not hearing about it before
    is probably an indication that users sophisticated enough to try to
    achieve full test coverage for their code were able to recognize false
    coverage gaps as such.
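
    abalkin's observation above that -O is unrelated to the peephole pass can be checked from Python itself: the optimize parameter of the built-in compile() mirrors the -O levels, and the jump optimization is applied at every level. A quick illustrative check (not from the thread; exact bytecode differs by version):

        import dis

        SRC = """\
        for n in range(10):
            if n % 2:
                pass
            continue
        """

        # optimize=0 is the default level, 1 corresponds to -O, 2 to -OO. The
        # levels differ only in assert/docstring handling; the emitted jumps
        # look the same at all three, showing that the peephole pass is not
        # controlled by this knob.
        import textwrap
        for level in (0, 1, 2):
            print(f"--- optimize={level} ---")
            dis.dis(compile(textwrap.dedent(SRC), "<demo>", "exec", optimize=level))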

    @rhettinger (Contributor)

    Marking this one as closed.

    Also, rejecting the various ways to disable peephole optimization.
    This was discussed with Guido long ago and the decision essentially
    recognized that for most practical purposes the output of the peepholer
    is the generated code and no good would come from exposing upstream
    intermediate steps.

    Since then, I believe Neal got Guido's approval for either the -O or
    -OO option to generate new optimizations that potentially change
    semantics. In that situation, there is a worthwhile reason for the
    enable/disable option.

    @nedbat (Member Author) commented Apr 1, 2008

    It's hard for me to agree with your assessment that no practical good
    would come from disabling the optimizer. Broadly speaking, there
    are two types of code execution: the vast majority of the time, you
    execute the code so that it can do its work. In this case, speed is
    most important, and the peephole optimizer is a good thing. But another
    important case is when you need to reason about the code. This second
    case includes coverage testing, debugging, and other types of analysis.

    Compiled languages have long recognized the need for both types of
    compilation, which is why they support disabling optimization entirely.

    As Python becomes more complex, and more broadly deployed, the needs of
    the two types of execution will diverge more and more. More complex
    optimizations will be attempted in order to squeeze out every last drop
    of performance. And more complex tools to reason about the code will be
    developed to provide rich support to those using Python for complex
    development.

    I see discussion here of moving the optimizer to the AST level instead
    of the bytecode level. This won't change the situation. The optimizer
    will still interfere with analysis tools.

    As a developer of analysis tools, what should I tell my users when their
    code behaves mysteriously?

    @terryjreedy (Member)

    While I agree with Raymond that the interpreter should be left alone,
    this could be reclassified (and reopened) as a doc issue. The current
    trace doc (Lib Ref 25.10) says rather tersely "The trace module allows
    you to trace program execution, generate annotated statement coverage
    listings, print caller/callee relationships and list functions executed
    during a program run." This could be augmented with a general statement
    that the effect of certain statements may get computed during
    compilation and not appear in the runtime trace -- or a more specific
    statement about continue, break, and whatever else.

    As for continue.py, it seems that the apparent non-execution of a
    continue line indicates one of two possible problems.

    1. The if statement is equivalent to 'if True:', at least for the
      intended domain of input, hence redundant, and hence could/should be
      removed.
    2. Otherwise, the inputs are incomplete as far as testing the effect of
      not taking the if-branch, and hence could/should be augmented.

    Either way, it seems to me that the lack of runtime execution of
    continue, coupled with better documentation, could usefully point to
    possible action.

    @exarkun (Mannequin) commented Jul 13, 2011

    Since the main argument for not fixing this bug seems to be that it doesn't affect many users, it seems like I should comment here that the issue is affecting me. A recently proposed addition to Twisted gets bitten by this case, resulting in a report of less than full test coverage when in fact the tests do exercise every line and branch of the change.

    Perhaps it is too hard to add and maintain a no-optimizations feature for Python (although I agree with Ned that this would be a useful feature for many reasons, not just to fix this bug). There are other possible solutions to the issue of inaccurate coverage reports though.

    For example, Python could provide an API for determining which lines have code that might be executed. coverage.py (and the stdlib trace.py) currently use the code object's lnotab to decide which lines might be executable. Maybe that should omit "continue" lines that get jumped over. If the line will never execute, it seems there is no need to have it in the lnotab.

    Using the lnotab is something of a hack though, so it might also make sense to leave it alone but introduce an API to get the same information, but corrected for whatever peephole optimizations the interpreter happens to have.

    As far as the "not a bug" arguments go, I don't think it matters much whether you ultimately decide to call it a bug or a feature request. It *is* clearly a useful feature to some people though, and rejecting the requested behavior as "not a bug" doesn't help anyone. So call it a feature request if that makes it more palatable. :)
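
    The line-table approach Jean-Paul mentions is visible through the dis module: dis.findlinestarts() yields the (offset, lineno) pairs that tools like coverage.py and trace.py treat as possibly-executable lines. A sketch (mine, not Jean-Paul's) of how a tool derives that set:

        import dis
        import types

        def executable_lines(code):
            """Line numbers recorded in a code object's line table, recursively."""
            lines = {ln for _, ln in dis.findlinestarts(code) if ln is not None}
            for const in code.co_consts:
                if isinstance(const, types.CodeType):  # nested functions, classes
                    lines |= executable_lines(const)
            return lines

        src = "def f(x):\n    for i in range(10):\n        if x:\n            pass\n        continue\n"
        print(sorted(executable_lines(compile(src, "<demo>", "exec"))))

    The mismatch in this issue is that the continue line stays in this set, so tools expect a line event for it, while the threaded jump means the event never fires on the always-true path.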

    @exarkun (Mannequin) reopened this Jul 13, 2011
    @terryjreedy (Member)

    I think supporters of this feature request should take discussion to python-ideas to try to gather more support. The initial post should summarize reasons for the request, possible implementations, and the counter-arguments of Raymond.

    @terryjreedy changed the title from "Line tracing of continue after always-taken if is incorrect" to "Add mechanism to diasable optimizations" Jul 13, 2011
    @terryjreedy added the type-feature (A feature request or enhancement) label and removed the type-bug (An unexpected behavior, bug, or error) label Jul 13, 2011
    @terryjreedy (Member)

    Choose pydev if you want. Discussion there is *usually* (but definitely not always) more focused on implementation of uncontroversial changes. I am pretty much +-0 on the issue, though Jean-Paul's post seems to add to the + side arguments that might be persuasive to others also.

    @merwok changed the title from "Add mechanism to diasable optimizations" to "Add mechanism to disable optimizations" Jul 15, 2011
    @rhettinger (Contributor)

    There has been no activity on this for several years. Marking as rejected for the reasons originally listed.

    @nedbat (Member Author) commented May 19, 2014

    Raymond, thanks for keeping us honest!

    I am still hoping to convince people that this is a good idea. I think Guido's +1 (https://mail.python.org/pipermail/python-dev/2012-December/123099.html) should help in that regard.

    Part of your reason for today's rejection is the lack of activity. Can I assume that with a patch you would be supportive?

    @pgimeno (Mannequin) commented Jun 22, 2014

    I consider performing peephole optimization when no optimization was requested to be a bug.

    Documentation for -O says it "Turns on basic optimizations". Peephole optimization is a basic optimization, yet it is performed even when no basic optimizations were requested.

    No need to add a switch. Just don't optimize if not requested.

    @brettcannon (Member)

    I believe the python-ideas thread on this topic came to the conclusion that a -X flag -- e.g., -X DisableOptimizations -- would be a good way to turn off all optimizations. The flag could then either blindly set sys.dont_write_bytecode to True, or set sys.flags.optimize to -1, in which case a bytecode file named e.g. foo.cpython-36.opt--1.pyc would be written, which won't lead to any conflicts (I wish we could use False for sys.flags.optimize, but that has the same value as 0, which is the default optimization level).

    Does one of those proposals seem acceptable to everyone? Do people like Ned who asked for this feature have a preference as to whether the bytecode is or is not written out to a .pyc file?

    @vstinner (Member)

    I would suggest -X noopt and use "noopt" in .pyc filenames. That's what I proposed in my PEP-511.
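
    For context, -X options land in the CPython-specific sys._xoptions mapping, so a tool could detect such a flag as below. The "noopt" key is hypothetical, since the flag was never merged:

        import sys

        # Hypothetical check: any -X name passed on the command line appears in
        # sys._xoptions, e.g. `python -X noopt app.py` yields {"noopt": True}.
        if sys._xoptions.get("noopt"):
            print("bytecode optimizations disabled (hypothetical -X noopt flag)")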

    @vstinner (Member)

    Since the discussion has restarted, I am reopening the issue and assigning it to Python 3.6. Maybe it's too late for such a tiny change?

    @vstinner reopened this Oct 26, 2016
    @alex (Member) commented Oct 12, 2017

    If anyone has needed a workaround in the past 9 years and hasn't yet found one: pyca/cryptography@3b585f8

    @serhiy-storchaka (Member)

    > If anyone has needed a workaround in the past 9 years and hasn't yet found one:

    This no longer works in 3.7 due to folding constants at the AST level. :-)

    @serhiy-storchaka added the 3.8 (only security fixes) label May 17, 2018
    @nedbat (Member Author) commented May 17, 2018

    Folding constants won't affect control flow. The important thing here is to disable optimizing away jumps to jumps.

    @serhiy-storchaka (Member)

    A few different optimizations work together here. Folding constants at the AST level makes it possible to eliminate the constant expression statement in the code generation stage. This makes 'continue' the first statement in the 'if' body. Boolean expression optimization (performed in the code generation stage now) creates a conditional jump to the start of the 'if' body (which is now 'continue'). If 'continue' is not nested in 'try' or 'with' blocks, it is compiled to an unconditional jump. And finally, the jump optimization in the peepholer retargets the conditional jump from the unconditional jump to the start of the loop.
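
    A small annotated example of the interaction described above (the annotation is illustrative; the exact stages vary by CPython version):

        def f(items):
            for x in items:
                if x:
                    1 + 1     # (1) folded to a constant at the AST level, and
                              #     the constant expression statement is then
                              #     dropped during code generation...
                    continue  # (2) ...making `continue` the first statement of
                              #     the `if` body; with no enclosing try/with it
                              #     becomes an unconditional jump, and (3) the
                              #     peepholer retargets the `if x` conditional
                              #     jump straight to the loop header, so no line
                              #     event fires here
                print(x)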

    @nedbat (Member Author) commented Jun 2, 2018

    Serhiy: thanks for the fuller story about the optimizations in place now. I'll repeat my earlier plea: providing a way to disable optimization is beneficial for tooling like coverage.py, and also for helping us to ensure that the optimizer is working correctly.

    I doubt anyone commenting here would be happy with a C compiler that optimized with no off-switch.

    @artgoldberg (Mannequin) commented Aug 28, 2018

    I'm another user of Ned's coverage tool. Our team at the Mount Sinai School of Medicine is building tools to model the dynamics of biochemistry inside individual cells. Our short term aims are to better understand microbiology and to model microorganisms so they can be engineered to more effectively produce drugs and do other industrial tasks. Long term, we seek to build genetically personalized models of human cells which can be used to improve the medical care of cancer and other illnesses. We're funded by several agencies of the federal government. Our field is called whole-cell modeling.

    We use Python because it provides a wide array of powerful tools we can reuse to reduce our development time, enables us to rapidly prototype software to test and advance our modeling ideas, and is fun to program. Using git, pip, coverage, GitHub, CircleCI, Docker and other tools we've built a robust development environment that enables multiple people to contribute to advancing our tools for whole-cell modeling. We strongly emphasize software engineering because the data we use is large, incomplete and inconsistent, and our models are complex and difficult to train, verify and validate. We want to have a high level of confidence in our tested code so that if we have trouble with a model we can focus on checking the data and understanding the model design. Coverage testing is an important part of our software engineering. We test both line and branch coverage.

    While working today on our simulator I found code that should have been fully covered except for a # pragma no cover, but was not fully covered. I reported it to Ned (nedbat/coveragepy#697) who reproduced it in a simpler example and pointed out that this "Add mechanism to disable optimizations" issue contributed to the problem.

    I realize that approximately 0.0% of Python users work on whole-cell modeling, which diminishes the importance of this use case. But Python is widely used in computational biomedicine, which represents many more users. Case in point -- I've created and teach a course in Biomedical Software Engineering which uses Python and teaches coverage testing to masters, PhD, and MD/PhD students.

    We'd appreciate your help improving Ned's coverage tool. You can learn more about us at http://www.karrlab.org/ and https://github.com/KarrLab.

    Regards
    Arthur

    @1st1 (Member) commented Oct 4, 2018

    Having properly working coverage tooling is simply invaluable to pretty much every serious Python user. I support Ned's idea of adding an option to disable the peephole optimizer (and other similar optimization passes). Why is this even debated?

    @1st1 (Member) commented Oct 4, 2018

    > I would suggest -X noopt and use "noopt" in .pyc filenames. That's what I proposed in my PEP-511.

    Sounds good to me.

    @serhiy-storchaka (Member)

    Note that disabling bytecode optimizations will not help to solve other problems with the coverage tool in 3.8: bpo-34705 and bpo-34876.

    @artgoldberg (Mannequin) commented Jan 7, 2019

    This issue is well into its second decade of debate.

    Who has the power to effect this change?

    Happy New Year
    Arthur

    @brettcannon (Member)

    If someone can get a PR into a state that is acceptable, then this can be resolved, Arthur. But at this point that hasn't occurred.

    @nedbat (Member Author) commented Feb 7, 2019

    FWIW, Yury started a pull request: #9693

    @vstinner (Member)

    I proposed PR 13600, which is based on PR 9693, but more complete and up to date.

    @vstinner (Member)

    My PR 13600 works as expected.

    I simplified the attached continue.py: see the attached traceme.py.

    Output with optimizations
    ---

    $ ./python traceme.py
      6           0 LOAD_CONST               1 (0)
                  2 STORE_FAST               0 (a)

      7           4 LOAD_GLOBAL              0 (range)
                  6 LOAD_CONST               2 (3)
                  8 LOAD_CONST               3 (4)
                 10 CALL_FUNCTION            2
                 12 GET_ITER
            >>   14 FOR_ITER                30 (to 46)
                 16 STORE_FAST               1 (n)

      8          18 LOAD_FAST                1 (n)
                 20 LOAD_CONST               4 (2)
                 22 BINARY_MODULO
                 24 POP_JUMP_IF_FALSE       14

      9          26 LOAD_FAST                1 (n)
                 28 LOAD_CONST               2 (3)
                 30 BINARY_MODULO
                 32 POP_JUMP_IF_FALSE       14

     10          34 LOAD_FAST                0 (a)
                 36 LOAD_CONST               5 (1)
                 38 INPLACE_ADD
                 40 STORE_FAST               0 (a)

     11          42 JUMP_ABSOLUTE           14
                 44 JUMP_ABSOLUTE           14
            >>   46 LOAD_CONST               0 (None)
                 48 RETURN_VALUE
    --- modulename: traceme, funcname: func
    traceme.py(6): a = 0
    traceme.py(7): for n in range(3, 4):
    traceme.py(8): if n % 2:
    traceme.py(9): if n % 3:
    traceme.py(7): for n in range(3, 4):
    ---

    Output without optimizations (-X noopt):
    ---

    $ ./python -X noopt traceme.py
      6           0 LOAD_CONST               1 (0)
                  2 STORE_FAST               0 (a)

      7           4 LOAD_GLOBAL              0 (range)
                  6 LOAD_CONST               2 (3)
                  8 LOAD_CONST               3 (4)
                 10 CALL_FUNCTION            2
                 12 GET_ITER
            >>   14 FOR_ITER                30 (to 46)
                 16 STORE_FAST               1 (n)

      8          18 LOAD_FAST                1 (n)
                 20 LOAD_CONST               4 (2)
                 22 BINARY_MODULO
                 24 POP_JUMP_IF_FALSE       44

      9          26 LOAD_FAST                1 (n)
                 28 LOAD_CONST               2 (3)
                 30 BINARY_MODULO
                 32 POP_JUMP_IF_FALSE       42

     10          34 LOAD_FAST                0 (a)
                 36 LOAD_CONST               5 (1)
                 38 INPLACE_ADD
                 40 STORE_FAST               0 (a)

     11     >>   42 JUMP_ABSOLUTE           14
            >>   44 JUMP_ABSOLUTE           14
            >>   46 LOAD_CONST               0 (None)
                 48 RETURN_VALUE
    --- modulename: traceme, funcname: func
    traceme.py(6): a = 0
    traceme.py(7): for n in range(3, 4):
    traceme.py(8): if n % 2:
    traceme.py(9): if n % 3:
    traceme.py(11): continue
    traceme.py(7): for n in range(3, 4):
    ---

    The difference in the trace is that with -X noopt, the "traceme.py(11): continue" line is traced as expected.

    The difference in the bytecode is that jumps are no longer optimized with -X noopt:

    • Optimized:

      "32 POP_JUMP_IF_FALSE 14"

    • Not optimized:

      "32 POP_JUMP_IF_FALSE 42"
      ">> 42 JUMP_ABSOLUTE 14"

    The peephole optimizer replaces a jump to an unconditional jump with a jump directly to the target of the unconditional jump.

    I documented peephole optimizations in my reimplementation in pure Python:
    https://bytecode.readthedocs.io/en/latest/peephole.html#optimizations
    (I'm not sure that it's up to date, but it should give you an idea of which kinds of optimizations are implemented.)

    @serhiy-storchaka (Member)

    There are different optimizations at different levels (AST, bytecode generation, peepholer); it would be nice to control them separately. This means that we should pass a bitset to the compiler.
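
    Nothing like this exists in CPython, but purely as a sketch of what such a per-stage bitset could look like:

        from enum import IntFlag

        class Opt(IntFlag):
            # Hypothetical per-stage optimization flags; CPython has no such API.
            NONE     = 0
            AST_FOLD = 1 << 0   # constant folding on the AST
            CODEGEN  = 1 << 1   # optimizations during bytecode generation
            PEEPHOLE = 1 << 2   # post-generation peephole pass
            ALL      = AST_FOLD | CODEGEN | PEEPHOLE

        # e.g. a compile() variant could accept optimizations=Opt.ALL & ~Opt.PEEPHOLE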

    @artgoldberg (Mannequin) commented Aug 26, 2019

    Appreciate you working on this Serhiy and Victor!

    @vstinner (Member)

    > There are different optimizations at different levels (AST, bytecode generation, peepholer); it would be nice to control them separately. This means that we should pass a bitset to the compiler.

    What's the use case for enabling some AST optimizations but disabling bytecode generation optimizations?

    @pablogsal (Member)

    I will pick this up from Victor's last patch.

    @nedbat (Member Author) commented Aug 31, 2020

    If there is any doubt that this affects people, it was reported *again* against coverage.py today: nedbat/coveragepy#1025

    @markshannon (Member)

    I think this can finally be closed.
    A mere 12 years after it was opened :)

    PEP-626 specifies what the correct behavior is, regardless of whether optimizations are turned on or off, so there is no point in a no-optimize option.
    The compiler is fast enough that it is never worth turning off, even for iterative development.

    If the bytecode optimizer produces incorrect or inefficient code for a particular example, please file a bug report for that case, and assign me.

    @markshannon (Member)

    In general, it is hard to define what is an optimization, and what is part of the compiler.

    The original request was to disable optimizations that changed observable behavior w.r.t. line numbers.

    All optimizations now respect line numbers, so the proposed mechanism would be pointless.
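
    For reference, PEP 626 also added codeobject.co_lines() (Python 3.10+), which exposes the bytecode-to-line mapping those guarantees rest on:

        def f(items):
            for x in items:
                if x:
                    continue

        # co_lines() yields (start, end, lineno) triples covering the bytecode;
        # under PEP 626 every line with executable code must appear here and
        # must generate trace events when executed.
        for start, end, lineno in f.__code__.co_lines():
            print(f"offsets [{start}, {end}) -> line {lineno}")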

    @ezio-melotti transferred this issue from another repository Apr 10, 2022