Author nedbat
Recipients Mark.Shannon, nedbat, rhettinger, serhiy.storchaka
Date 2020-12-22.12:21:48
Mark said:

> An optimization (CS not math) is a change to the program such that it has the same effect, according to the language spec, but improves some aspect of the behavior, such as run time or memory use.
> 
> Any transformation that changes the effect of the program is not an optimization.
> 
> You shouldn't be able to tell, without timing the program (or measuring memory use, observing race conditions, etc.) whether optimizations are turned on or not.

It's not that simple. Many aspects of the program can be observed, and coverage.py observes them and reports on them.

Coverage.py measures branch coverage by tracking pairs of line numbers: the trace function remembers the last line number, then pairs it with the current line number to record how execution moved from line to line. These pairs are an observable behavior of the program, and the optimization of removing jumps to jumps changes them.

Here is a bug report against coverage.py that relates to this: https://github.com/nedbat/coveragepy/issues/1025

To reproduce this in the small, here is bug1025.py:

    nums = [1,2,3,4,5,6,7,8]        # line 1
    for num in nums:                # line 2
        if num % 2 == 0:            # line 3
            if num % 4 == 0:        # line 4
                print(num)          # line 5
            continue                # line 6
        print(-num)                 # line 7

Here is branch_trace.py:

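    import sys

    pairs = set()   # every (previous line, current line) transition seen
    last = -1       # -1 means no line has executed yet

    def trace(frame, event, arg):
        global last
        if event == "line":
            this = frame.f_lineno
            pairs.add((last, this))
            last = this
        return trace    # keep receiving events for this frame

    # Read the target program before tracing starts.
    with open(sys.argv[1]) as f:
        code = f.read()
    sys.settrace(trace)
    exec(code)
    print(sorted(pairs))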

Running "python branch_trace.py bug1025.py" produces:

    -1
    -3
    4
    -5
    -7
    8
    [(-1, 1), (1, 2), (2, 3), (3, 4), (3, 7), (4, 2), (4, 5), (5, 6), (6, 2), (7, 2)]

Conceptually, executing bug1025.py should sometimes jump from line 4 to line 6. When the condition on line 4 is false, execution moves to the continue on line 6 and then to the top of the for loop on line 2. But CPython optimizes away the jump to a jump, so execution goes straight from line 4 to line 2: the pair (4, 2) appears in our trace output and the pair (4, 6) never does. The result is that coverage.py thinks there is a branch that could have occurred but was never observed during the run. It reports this as a missed branch.
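One way to look at where the jump from line 4 actually lands is to inspect the compiled bytecode. The following is a rough sketch using the standard dis module; jump_table and line_at are names made up for this example, and the opcodes and targets it prints differ between CPython versions, so no output is shown here.

```python
import dis

# The source of bug1025.py, without the line-number comments.
SOURCE = """\
nums = [1,2,3,4,5,6,7,8]
for num in nums:
    if num % 2 == 0:
        if num % 4 == 0:
            print(num)
        continue
    print(-num)
"""

code = compile(SOURCE, "bug1025.py", "exec")

# Map bytecode offsets to source lines using the code object's line table.
starts = sorted(
    (offset, line)
    for offset, line in dis.findlinestarts(code)
    if line is not None
)

def line_at(offset):
    """Return the source line that the instruction at `offset` belongs to."""
    line = None
    for start, start_line in starts:
        if start > offset:
            break
        line = start_line
    return line

def jump_table(code):
    """List (source line, opcode name, target line) for every jump."""
    jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)
    rows = []
    for instr in dis.get_instructions(code):
        if instr.opcode in jump_opcodes:
            # For jump instructions, argval is the target bytecode offset.
            rows.append((line_at(instr.offset), instr.opname, line_at(instr.argval)))
    return rows

for line, opname, target in jump_table(code):
    print(f"line {line}: {opname} -> line {target}")
```

On an interpreter that performs the jump-to-jump optimization, I would expect the conditional jump on line 4 to target line 2 rather than line 6, which matches the (4, 2) pair in the trace.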

Coverage.py currently deals with these sorts of issues by understanding the kinds of optimizations that can occur, and taking them into account when figuring out "what could have happened during execution". It does not yet understand the jump-to-jump optimizations, which is why bug 1025 happens.
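The comparison can be illustrated with plain sets. This is only a sketch of the idea, not coverage.py's implementation: the observed pairs are copied from the trace output above, while the possible set is written by hand for this example from a naive reading of bug1025.py.

```python
# Pairs recorded by branch_trace.py, copied from its output.
observed = {
    (-1, 1), (1, 2), (2, 3), (3, 4), (3, 7),
    (4, 2), (4, 5), (5, 6), (6, 2), (7, 2),
}

# Hand-written assumption: the pairs a reading of the source predicts,
# with no knowledge of the jump-to-jump optimization.
possible = {
    (-1, 1), (1, 2), (2, 3), (3, 4), (3, 7),
    (4, 5), (4, 6), (5, 6), (6, 2), (7, 2),
}

missing = sorted(possible - observed)      # predicted but never seen
unpredicted = sorted(observed - possible)  # seen but never predicted

print("missing:", missing)          # missing: [(4, 6)]
print("unpredicted:", unpredicted)  # unpredicted: [(4, 2)]
```

The (4, 6) entry is what gets reported as a missed branch, and (4, 2) is the pair the optimized bytecode produces in its place.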

This pairing of line numbers doesn't relate specifically to the "if 0:" optimizations that this issue is about, but this is where the observability point was raised, so I thought I would discuss it here. As I said earlier, this probably should be worked out in a better forum.

This is already long, so I'm not sure what else to say.  Optimizations complicate things for tools that want to analyze code and help people reason about code.  You can't simply say, "optimizations should not be observable."  They are observable.