classification
Title: "if 0:" lines are traced; they didn't use to be
Type: behavior Stage: resolved
Components: Versions: Python 3.10
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, Mark.Shannon, benjamin.peterson, nedbat, pablogsal, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2020-12-20 21:06 by nedbat, last changed 2021-01-15 13:40 by Mark.Shannon. This issue is now closed.

Messages (25)
msg383452 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-20 21:06
(Using CPython commit c95f8bc270.)

This program has an "if 0:" line that becomes a NOP bytecode.  It didn't used to in Python 3.9

print(1)
if 0:               # line 2
    print(3)
print(4)

Using a simple trace program (https://github.com/nedbat/coveragepy/blob/master/lab/run_trace.py), it produces this output:

call <string> 1 @-1
    line <string> 1 @0
1
    line <string> 2 @8
    line <string> 4 @10
4
    return <string> 4 @20

Using Python3.9 gives this output:

call <string> 1 @-1
    line <string> 1 @0
1
    line <string> 4 @8
4
    return <string> 4 @18

Is this change intentional?
msg383455 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-20 21:12
All NOP bytecodes should be removed. If any is left it is a regression.
msg383459 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-20 21:25
Confirmed.  There is bogus NOP in the code.

=======================================================================

Python 3.10.0a3+ (heads/master:17ef4319a3, Dec 18 2020, 09:35:26)
[Clang 12.0.0 (clang-1200.0.32.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from textwrap import dedent
>>> from dis import dis
>>> s = '''
        x = 10
        if 0:
                print('hello')
        y = 20
'''
>>> code = compile(dedent(s), '', 'exec')
>>> list(code.co_code)
[100, 0, 90, 0, 9, 0, 100, 3, 90, 2, 100, 4, 83, 0]
>>> dis(code)
  2           0 LOAD_CONST               0 (10)
              2 STORE_NAME               0 (x)

  3           4 NOP

  5           6 LOAD_CONST               3 (20)
              8 STORE_NAME               2 (y)
             10 LOAD_CONST               4 (None)
             12 RETURN_VALUE
msg383505 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-21 10:06
Yes, this is change is deliberate.

PEP 626 states that *all* executed code gets traced.
Occasionally that will result in NOPs in the code, that are necessary 
for correctness, but cannot be removed by the optimizer.
Is this a problem in practice?

In fact this NOP can be removed by rewriting the bytecode to:

  2           0 LOAD_CONST               0 (10)
              2 STORE_NAME               0 (x)

  3           4 LOAD_CONST               3 (20)

  5           6 STORE_NAME               2 (y)
              8 LOAD_CONST               4 (None)
             10 RETURN_VALUE

but that is a relatively rare case where the proceeding or following instruction has no observable side-effects.
It is also non-trivial to implement, as the NOP and LOAD_CONST are in different basic blocks.
msg383507 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-21 10:16
Previous versions of Python completely removed the "if 0:" statement, because it would have no effect.  The condition was checked at compile time, effectively.  It seems odd for 3.10 to add it back in.
msg383509 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-21 11:01
But why remove it? It is in the source code.

Do we want tracing and profiling to depend on what transformations the optimizer does or does not make?

What lines are to be traced here? What is the last line executed?

def no_code():
    if 0:
        some_code


Consider the code:

def one():
    return 1

def two():
    return one() + 1

def const_test():
    if two() != 2:
       lots_of_code_here

What lines should be traced here?
If the runtime does speculatively inlining, should it change what code is to be traced?
msg383511 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-21 11:44
> PEP 626 states that *all* executed code gets traced.

Yes, but reading the PEP now, it's not clear what "executed" means. Do we mean "conceptually executed", or, "executed as compiled by the optimizing compiler"?

> In fact this NOP can be removed by rewriting the bytecode to:

That change looks both difficult and confusing.  If we want to keep an "if 0:" line in the bytecode, NOP is the best way to do it.

> But why remove it? It is in the source code.

I think the reason you are keeping it now isn't because it is in the source code.  Lines after returns are in the source code, but are now removed because they aren't in the path of execution.  "if 0:" is in the path of execution, so now you are keeping it. (I guess: I don't mean to put words in your mouth.)

As I mentioned, this is a change from previous versions, which removed the "if 0:" as uninteresting code.

> Do we want tracing and profiling to depend on what transformations the optimizer does or does not make?

Tracing has long depended on the optimizer's transformations.  When the optimizer removes a jump to a jump, the tracing never sees the removed intermediate jump.

The reason I am writing these bug reports is because coverage.py tries to track CPython's optimizations.  Coverage.py "knows" that "if 0:" lines won't appear in the trace, because previous versions removed them.

Unfortunately, it's not possible to make the match perfect. See bug 2506 for the original debate about this point.  As it stands now, CPython will keep statements in the byte code that are "conceptually" executed, but because of optimization, are never executed.  This creates a gap between the developer's mental model and the data produced by tools.

I can adjust coverage.py to reflect the decision either way from this bug report. Mostly I raise the issues to point out differences and to get a clear decision about whether it's desired behavior.

> If the runtime does speculatively inlining, should it change what code is to be traced?

I hope this is purely hypothetical. If CPython gets to that point we are going to have to revisit what kinds of controls there are to disable optimizations.
msg383521 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-21 13:55
You should never have to disable optimizations.

Let's be clear about what an optimization is.

An optimization (CS not math) is a change to the program such that it has the same effect, according to the language spec, but improves some aspect of the behavior, such as run time or memory use.

Any transformation that changes the effect of the program is not an optimization.

You shouldn't be able to tell, without timing the program (or measuring memory use, observing race conditions, etc.) whether optimizations are turned on or not.

Removing bytecodes for dead code is a legal optimization. It can never run, so we can't tell by observation whether it is there or not.

PEP 626 makes the lines number traced part of the language specification.
So the optimizer *must* respect line numbers.

Thus, removing all trace of `if 0:` is not legal since we could observe that change.
Hypothetically, we could change the line number table to allow multiple lines per bytecode and remove the NOP. The important point is that is coverage.py shouldn't care, or even know, if we do so.
msg383527 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-21 14:53
This feels like a recurring debate.  I will need some time to lay out my case.  Perhaps it needs a wider audience than a bpo issue?
msg383569 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-22 03:04
> But why remove it? It is in the source code.

Because traditionally, "if 0: <suite>" has been used as a way to comment out code while leaving syntax highlighting in place.  The expectation is all the code effectively ceases to exist, just like a comment.
msg383576 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-22 08:39
I just noticed that "while True:" is now compiled to NOP. It is clearly a regression which affects performance.
msg383584 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 11:41
What is the problem with NOPs?
If there are performance regressions in real code, then that needs to be addressed. 
If only in a few toy examples, then that is (IMO) a price worth paying for predictable debugging and profiling.

Raymond:
   Effectively it does cease to exist. A NOP in the bytecode, except maybe in the tightest of loops, is going to be impossible to detect without inspecting the bytecode.

Serhiy:
   The NOP is outside of the loop, so won't slow down the loop.
msg383586 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 12:21
Mark said:

> An optimization (CS not math) is a change to the program such that it has the same effect, according to the language spec, but improves some aspect of the behavior, such as run time or memory use.
> 
> Any transformation that changes the effect of the program is not an optimization.
> 
> You shouldn't be able to tell, without timing the program (or measuring memory use, observing race conditions, etc.) whether optimizations are turned on or not.

It's not that simple. Many aspects of the program can be observed, and coverage.py observes them and reports on them.

Coverage.py reports on branch coverage by tracking pairs of line numbers: in the trace function, the last line number is remembered, then paired with the current line number to note how execution moved from line to line.  This is an observable behavior of the program.  The optimization of removing jumps to jumps changes this observable behavior.

Here is a bug report against coverage.py that relates to this: https://github.com/nedbat/coveragepy/issues/1025

To reproduce this in the small, here is bug1025.py:

    nums = [1,2,3,4,5,6,7,8]        # line 1
    for num in nums:                # line 2
        if num % 2 == 0:            # line 3
            if num % 4 == 0:        # line 4
                print(num)          # line 5
            continue                # line 6
        print(-num)                 # line 7

Here is branch_trace.py:

    import sys

    pairs = set()
    last = -1

    def trace(frame, event, arg):
        global last
        if event == "line":
            this = frame.f_lineno
            pairs.add((last, this))
            last = this
        return trace

    code = open(sys.argv[1]).read()
    sys.settrace(trace)
    exec(code)
    print(sorted(pairs))

Running "python branch_trace.py bug1025.py" produces:

    -1
    -3
    4
    -5
    -7
    8
    [(-1, 1), (1, 2), (2, 3), (3, 4), (3, 7), (4, 2), (4, 5), (5, 6), (6, 2), (7, 2)]

Conceptually, executing bug1025.py should sometimes jump from line 4 to line 6. When line 4 is false, execution moves to the continue and then to the top of the for loop.  But CPython optimizes away the jump to a jump, so the pair (4, 6) never appears in our trace output.  The result is that coverage.py thinks there is a branch that could have occurred, but was never observed during the run.  It reports this as a missed branch.

Coverage.py currently deals with these sorts of issues by understanding the kinds of optimizations that can occur, and taking them into account when figuring "what could have happened during execution". Currently, it does not understand the jump-to-jump optimizations, which is why bug 1025 happens.

This pairing of line numbers doesn't relate specifically to the  "if 0:" optimizations that this issue is about, but this is where the observability point was raised, so I thought I would discuss it here.  As I said earlier, this probably should be worked out in a better forum.

This is already long, so I'm not sure what else to say.  Optimizations complicate things for tools that want to analyze code and help people reason about code.  You can't simply say, "optimizations should not be observable."  They are observable.
msg383593 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 16:14
Ned, I agree that up until 3.9, it is wasn't that simple.
But from 3.10 onward, it should be that simple.
That's the point of PEP 626.

If a transformation changes observable behavior within the scope of language specification, then it is not an optimization but a bug.

Up to 3.9, the language did not specify the behavior of sys.settrace w.r.t. line numbers, so coverage.py had to second guess what the bytecode optimizer was doing. 

In 3.10 onwards, the behavior of sys.settrace w.r.t. line numbers is part of the language spec. The optimizer cannot change it, and coverage.py should not need care what the optimizer does.

If you can observe the effect of the optimizer, w.r.t. line numbers, then it's a bug.

(According to that definition, there are still bugs in the optimizer relating to jumps-to-jumps. I plan to fix them)
msg383594 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 16:24
> According to that definition, there are still bugs in the optimizer relating to jumps-to-jumps. I plan to fix them

Can you say more about that? What is the bug? How will you fix it?
msg383614 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 21:00
The fix, which explains the bug:
https://github.com/python/cpython/pull/23896
msg383616 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 21:21
Mark, I'm trying to follow along with your ideas, but it's hard to piece together the implications.  Is there a way to have a discussion about where you are headed?

https://github.com/python/cpython/pull/23896 "fixes" the jump to jump problems by no longer optimizing away jumps to jumps.  https://bugs.python.org/issue42719 mentions creating zero-width entries in the line number table.  Does that let us get rid of jumps to jumps but still produce two trace events?  I'd like to understand more.
msg383618 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 21:57
Are you interested in the "what" or the "how"?

The "what":
    PEP 626 defines what CPython must do in terms of tracing lines executed.

The "how":
    Obviously, we want to execute Python as efficiently as possible.
To do so, we use a series of "peephole" optimizations that make the bytecode more efficient but preserve PEP 626 semantics.

The front-end of the compiler produces code that conforms to PEP 626.
The optimizer should make the bytecode faster, while still conforming to PEP 626. Since the optimizer is inherited from 3.9, it doesn't always respect PEP 626. I am in the process of fixing that.

If you need to know what the optimizer does, in order to correctly implement coverage.py, then that is a bug in the optimizer.

Zero-width entries will, in theory, allow some additional optimizations.
But probably not that much, in practice.
msg383619 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 22:01
Specifically, for jumps to jumps, what is the plan?
msg383621 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 22:25
There isn't much of a plan.

https://github.com/python/cpython/pull/23896 makes the optimizer respect PEP 626 w.r.t. jumps-to-jumps. 

From the point of view of optimization, there is nothing special about jumps-to-jumps.
Any optimization that offers a speed up is good, regardless of which language feature it relates to.

To be honest, bytecode optimizations are going to win us a few percent max, whereas speeding up the interpreter could gain us 50-100%. So I'm not going to worry about trivial differences.
msg383622 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 22:42
This seems like a perspective that needs a wider audience.  PEP 626 says there will be no performance slowdown:

> Performance Implications
> 
> In general, there should be no change in performance. When tracing,
> programs should run a little faster as the new table format can be
> designed with line number calculation speed in mind. Code with long
> sequences of pass statements will probably become a bit slower.

Now you are saying that is fine to take a small performance hit in order to conform to PEP 626.  

I am usually on the side of good tracing, but I imagine other people feel differently, that it is unacceptable to slow down all execution for sake of accurate tracing.  I would love to have a clear agreement before moving forward so that we don't have to backtrack.
msg383631 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2020-12-23 03:42
Maybe sys.settrace() is not ultimately the best tool for coverage reporting? If the bytecode compiler natively supported coverage instrumentation, source semantics would be easier to respect. A nice implementation could use Knuth & Stevenson "Optimal measurement of points for program frequency counts".
msg383632 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-23 06:02
If the NOP is outside of the loop, then what is the reason of adding it? You cannot set a breakpoint on "while True:" which stops on every iteration.

The goal "tracing every line of code" is incompatible with code optimization. It would be better to add a special option to produce slower bytecode for tracing if you want to trace every line.
msg383642 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-23 11:30
I'll briefly answer the above questions, but we could we close this issue?

If there are specific issues regarding PEP 626, please make a new issue. Feel free to +nosy me on those issues.

I intend to review all the recent changes to the compiler in the new year,
in case there are any obvious inefficiencies.

Ned,
The bytecode for 3.10a should be more efficient than 3.9.
IMO, it is not worth spending too much time on making the bytecode look pretty, and that if we care about performance it would be better to spend our time elsewhere.

Serhiy,
The bytecode for the test in the while statement is duplicated; one in the loop header, one in the body. Try it under pdb, you'll see that it breaks at two different places in the bytecode. 
See https://github.com/python/cpython/pull/23743

Benjamin,
Eventually, yes we will want to implement something like the JVMTI for the JVM. However, for that to be useful we still need precise line number tracing.
msg385041 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-13 15:49
Unless anyone objects, I'm going to close this issue.
History
Date User Action Args
2021-01-15 13:40:05Mark.Shannonsetstatus: open -> closed
resolution: not a bug
stage: resolved
2021-01-13 15:49:33Mark.Shannonsetmessages: + msg385041
2020-12-23 11:30:33Mark.Shannonsetmessages: + msg383642
2020-12-23 06:02:26serhiy.storchakasetmessages: + msg383632
2020-12-23 03:42:04benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg383631
2020-12-22 22:42:00nedbatsetmessages: + msg383622
2020-12-22 22:25:24Mark.Shannonsetmessages: + msg383621
2020-12-22 22:01:29nedbatsetmessages: + msg383619
2020-12-22 21:57:34Mark.Shannonsetmessages: + msg383618
2020-12-22 21:53:49BTaskayasetnosy: + BTaskaya
2020-12-22 21:21:30nedbatsetmessages: + msg383616
2020-12-22 21:00:05Mark.Shannonsetmessages: + msg383614
2020-12-22 16:24:09nedbatsetmessages: + msg383594
2020-12-22 16:14:23Mark.Shannonsetmessages: + msg383593
2020-12-22 12:25:56nedbatsetnosy: + pablogsal
2020-12-22 12:21:49nedbatsetmessages: + msg383586
2020-12-22 11:41:53Mark.Shannonsetmessages: + msg383584
2020-12-22 08:39:45serhiy.storchakasetmessages: + msg383576
2020-12-22 03:04:50rhettingersetmessages: + msg383569
2020-12-21 14:53:37nedbatsetmessages: + msg383527
2020-12-21 13:55:22Mark.Shannonsetmessages: + msg383521
2020-12-21 11:44:02nedbatsetmessages: + msg383511
2020-12-21 11:01:26Mark.Shannonsetmessages: + msg383509
2020-12-21 10:16:14nedbatsetmessages: + msg383507
2020-12-21 10:06:17Mark.Shannonsetmessages: + msg383505
2020-12-20 21:25:46rhettingersetnosy: + rhettinger
messages: + msg383459
2020-12-20 21:12:09serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg383455
2020-12-20 21:06:08nedbatcreate