classification
Title: Duplicated unused bytecodes at end of function
Type: Stage:
Components: Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Mark.Shannon Nosy List: Mark.Shannon, nedbat, rhettinger
Priority: normal Keywords:

Created on 2020-12-20 21:29 by nedbat, last changed 2020-12-22 20:46 by Mark.Shannon.

Messages (6)
msg383460 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-20 21:29
(Using CPython commit c95f8bc270.)

This program has extra bytecodes:

    def f():
        for i in range(10):
            break
        return 17

The dis output is:

      1           0 LOAD_CONST               0 (<code object f at 0x109cf7450, file "afterbreak.py", line 1>)
                  2 LOAD_CONST               1 ('f')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (f)
                  8 LOAD_CONST               2 (None)
                 10 RETURN_VALUE

    Disassembly of <code object f at 0x109cf7450, file "afterbreak.py", line 1>:
      2           0 LOAD_GLOBAL              0 (range)
                  2 LOAD_CONST               1 (10)
                  4 CALL_FUNCTION            1
                  6 GET_ITER
                  8 FOR_ITER                 8 (to 18)
                 10 STORE_FAST               0 (i)

      3          12 POP_TOP

      4          14 LOAD_CONST               2 (17)
                 16 RETURN_VALUE
            >>   18 LOAD_CONST               2 (17)
                 20 RETURN_VALUE

The break has something to do with it, because if I change the Python to:

    def f():
        for i in range(10):
            a = 1
        return 17

then the dis output is:

      1           0 LOAD_CONST               0 (<code object f at 0x10bf8f450, file "afterbreak.py", line 1>)
                  2 LOAD_CONST               1 ('f')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (f)
                  8 LOAD_CONST               2 (None)
                 10 RETURN_VALUE

    Disassembly of <code object f at 0x10bf8f450, file "afterbreak.py", line 1>:
      2           0 LOAD_GLOBAL              0 (range)
                  2 LOAD_CONST               1 (10)
                  4 CALL_FUNCTION            1
                  6 GET_ITER
            >>    8 FOR_ITER                 8 (to 18)
                 10 STORE_FAST               0 (i)

      3          12 LOAD_CONST               2 (1)
                 14 STORE_FAST               1 (a)
                 16 JUMP_ABSOLUTE            8

      4     >>   18 LOAD_CONST               3 (17)
                 20 RETURN_VALUE
msg383510 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-21 11:09
My guess is that we are changing the CFG after computing reachability leaving unreachable blocks marked as reachable.
msg383512 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-21 11:45
Our reachability analysis is correct (in this case at least).
There is no unreachable code here.

In theory we could merge the two basic blocks at the end, but searching for identical blocks and merging them is potentially quite expensive (and fiddly).

What might be better is to be change the order of optimizations, so that we duplicate BBs after we have performed jump-to-jump elimination.

Is this a problem in practice?
msg383598 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2020-12-22 18:50
This isn't a problem for me.  I noticed it and figured I'd mention it.
msg383602 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-22 19:16
> Is this a problem in practice?

It's just a hint that code generation has imperfections.

From a teacher's or author's point of view, it is something we have to explain away when demonstrating dis().  It leaves the impression that Python is kludgy.
msg383613 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-12-22 20:46
In what way is it "kludgy"?
There is a mathematical theorem that states that generated code will be imperfect, 
so we will have to live with that :)
https://en.wikipedia.org/wiki/Full_employment_theorem

3.10a produces more efficient bytecode than 3.9.

3.9:

  2           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (10)
              4 CALL_FUNCTION            1
              6 GET_ITER
        >>    8 FOR_ITER                 8 (to 18)
             10 STORE_FAST               0 (i)

  3          12 POP_TOP
             14 JUMP_ABSOLUTE           18
             16 JUMP_ABSOLUTE            8

  4     >>   18 LOAD_CONST               2 (17)
             20 RETURN_VALUE

3.10a:

  2           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (10)
              4 CALL_FUNCTION            1
              6 GET_ITER
              8 FOR_ITER                 8 (to 18)
             10 STORE_FAST               0 (i)

  3          12 POP_TOP

  4          14 LOAD_CONST               2 (17)
             16 RETURN_VALUE
        >>   18 LOAD_CONST               2 (17)
             20 RETURN_VALUE
History
Date User Action Args
2020-12-22 20:46:01Mark.Shannonsetmessages: + msg383613
2020-12-22 19:16:28rhettingersetmessages: + msg383602
2020-12-22 18:50:35nedbatsetmessages: + msg383598
2020-12-21 11:45:14Mark.Shannonsetmessages: + msg383512
2020-12-21 11:09:32Mark.Shannonsetassignee: Mark.Shannon
messages: + msg383510
2020-12-21 04:39:22rhettingersetnosy: + rhettinger
2020-12-20 21:29:18nedbatcreate