This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Wrong line information in bytecode generated by compiler module
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: arnau, benjamin.peterson, brett.cannon, georg.brandl, nascheme, ncoghlan, pitrou
Priority: normal Keywords: patch

Created on 2014-02-13 03:17 by arnau, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
compiler_wrong_ordering_block_test.py arnau, 2014-02-13 03:17 Test program
fix_compiler_module_issue_20613.patch arnau, 2015-03-10 08:38
Messages (4)
msg211124 - (view) Author: Arnaud Fontaine (arnau) Date: 2014-02-13 03:17
I know that compiler module is deprecated since 2.6, but I found a regression introduced in version 2.7 by the following cleanup commit (replacing pyassem.py modified in this commit by the one just before does not trigger this bug):

http://hg.python.org/releasing/2.7.6/rev/42faa8054c3d

Actually, there is nothing wrong when executing the generated bytecode, it's just that tracebacks and anything that display source code from the generated bytecode object gets sometimes completely wrong when there is if/elif/else because of compiler.pyassem.order_blocks().

This issue was found when noticing that tracebacks in Zope Python Script (using RestrictedPython which in turn uses compiler module) were displaying completely wrong line information and stepping with pdb into the source code was displaying wrong source code and line information too.

After investigating, I found out that the problem is actually in compiler module and wrote a small program triggering the bug (I know that some variables are undefined but it does not matter here):

if if1:
  if1_entered = 1

if if2:
  if subif1:
    subif1_entered = 1
  elif subif2:
    subif2_entered = 1

  subif_ended = 1

Using compiler module to generate the bytecode with the program attached returns two different bytecodes:

$ python2.7 compiler_wrong_ordering_block_test.py 
====================================
  1           0 LOAD_NAME                0 (if1)
              3 POP_JUMP_IF_FALSE       30

  2           6 LOAD_CONST               1 (1)
              9 STORE_NAME               1 (if1_entered)
             12 JUMP_FORWARD            15 (to 30)

  7     >>   15 LOAD_NAME                2 (subif2)
             18 POP_JUMP_IF_FALSE       51

  8          21 LOAD_CONST               1 (1)
             24 STORE_NAME               3 (subif2_entered)
             27 JUMP_FORWARD            21 (to 51)
        >>   30 LOAD_NAME                4 (if2)
             33 POP_JUMP_IF_FALSE       60
             36 LOAD_NAME                5 (subif1)
             39 POP_JUMP_IF_FALSE       15
             42 LOAD_CONST               1 (1)
             45 STORE_NAME               6 (subif1_entered)
             48 JUMP_FORWARD             0 (to 51)

 10     >>   51 LOAD_CONST               1 (1)
             54 STORE_NAME               7 (subif_ended)
             57 JUMP_FORWARD             0 (to 60)
        >>   60 LOAD_CONST               0 (None)
             63 RETURN_VALUE        
====================================
  1           0 LOAD_NAME                0 (if1)
              3 POP_JUMP_IF_FALSE       15

  2           6 LOAD_CONST               1 (1)
              9 STORE_NAME               1 (if1_entered)
             12 JUMP_FORWARD             0 (to 15)

  4     >>   15 LOAD_NAME                2 (if2)
             18 POP_JUMP_IF_FALSE       60

  5          21 LOAD_NAME                3 (subif1)
             24 POP_JUMP_IF_FALSE       36

  6          27 LOAD_CONST               1 (1)
             30 STORE_NAME               4 (subif1_entered)
             33 JUMP_FORWARD            15 (to 51)

  7     >>   36 LOAD_NAME                5 (subif2)
             39 POP_JUMP_IF_FALSE       51

  8          42 LOAD_CONST               1 (1)
             45 STORE_NAME               6 (subif2_entered)
             48 JUMP_FORWARD             0 (to 51)

 10     >>   51 LOAD_CONST               1 (1)
             54 STORE_NAME               7 (subif_ended)
             57 JUMP_FORWARD             0 (to 60)
        >>   60 LOAD_CONST               0 (None)
             63 RETURN_VALUE        

As you can see from the output above, the first generated bytecode gets all the line numbers (lnotab) messed up because 'subif2' block (and its children too) is emitted *before* its 'dominator' block, namely 'if2'.

Now, compiler.pyassem.order_blocks() arranges blocks to be emitted within a 'dominators' dict where the key is a block and the value a list of blocks which must be emitted *before* that one. However, for at least 'if/elif' blocks (created by compiler.pycodegen.CodeGenerator().visitIf()), the 'elif' block (subif2) does not have prev/next (where prev and next define respectively the parent and children blocks) sets to its parent (if2) as it is only set for the first 'if' (subif1) and furthermore for some reasons I don't know the code asserts that next list length must be equal to 1 anyway (anyone knows why BTW?)...

Thus, when creating 'dominators' dict using next/prev block attributes, the 'elif' block (subif2) ends up being considered kind of an orphan block because next/prev is not set and thus could be emitted anytime... Enabling debug mode in compiler.pyassem and adding a print call in order_blocks() to display 'dominators' dict confirms this (eg 'subif2' block (<block id=9>) dominators value list is empty):

=> Blocks:
        ('SET_LINENO', 0)
        ('SET_LINENO', 1)
        ('LOAD_NAME', 'if1')
        ('POP_JUMP_IF_FALSE', <block id=3>)
end <block id=0>
    next [<block id=4>]
    prev []
    [<block id=3>, <block id=4>]
<block id=4>
        ('SET_LINENO', 2)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'if1_entered')
        ('JUMP_FORWARD', <block id=2>)
end <block id=4>
    next []
    prev [<block id=0>]
    [<block id=2>]
<block id=3>
end <block id=3>
    next [<block id=2>]
    prev []
    [<block id=2>]
<block id=2>
        ('SET_LINENO', 4)
        ('LOAD_NAME', 'if2')
        ('POP_JUMP_IF_FALSE', <block id=6>)
end <block id=2>
    next [<block id=7>]
    prev [<block id=3>]
    [<block id=6>, <block id=7>]
end <block id=7>
    next [<block id=10>]
    prev [<block id=2>]
    [<block id=9>, <block id=10>]
<block id=10>
        ('SET_LINENO', 6)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif1_entered')
        ('JUMP_FORWARD', <block id=8>)
end <block id=10>
    next []
    prev [<block id=7>]
    [<block id=8>]
<block id=9>
        ('SET_LINENO', 7)
        ('LOAD_NAME', 'subif2')
        ('POP_JUMP_IF_FALSE', <block id=11>)
end <block id=9>
    next [<block id=12>]
    prev []
    [<block id=11>, <block id=12>]
<block id=12>
        ('SET_LINENO', 8)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif2_entered')
        ('JUMP_FORWARD', <block id=8>)
end <block id=12>
    next []
    prev [<block id=9>]
    [<block id=8>]
<block id=11>
end <block id=11>
    next [<block id=8>]
    prev []
    [<block id=8>]
<block id=8>
        ('SET_LINENO', 10)
        ('LOAD_CONST', 1)
        ('STORE_NAME', 'subif_ended')
        ('JUMP_FORWARD', <block id=5>)
end <block id=8>
    next []
    prev [<block id=11>]
    [<block id=5>]
<block id=6>
end <block id=6>
    next [<block id=5>]
    prev []
    [<block id=5>]
<block id=5>
        ('LOAD_CONST', None)
        ('RETURN_VALUE',)

=> Dominators:
{<block id=0>: set([]),
 <block id=2>: set([<block id=3>, <block id=4>]),
 <block id=3>: set([<block id=4>]),
 <block id=4>: set([<block id=0>]),
 <block id=5>: set([<block id=6>, <block id=8>]),
 <block id=6>: set([<block id=8>]),
 <block id=7>: set([<block id=2>]),
 <block id=8>: set([<block id=10>, <block id=11>, <block id=12>]),
 <block id=9>: set([]),
 <block id=10>: set([<block id=7>]),
 <block id=11>: set([<block id=10>, <block id=12>]),
 <block id=12>: set([<block id=9>])}

I have tried to write a patch for this issue but to be honest I could not come up with something nice, any idea about implementing a proepr fix is welcome?
msg237569 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-03-08 21:16
Am I correct in assuming that we'd need to fix a regression in 2.7 owing to its extended life span, or would the compiler being deprecated make this not worth the trouble, or what?
msg237735 - (view) Author: Arnaud Fontaine (arnau) Date: 2015-03-10 08:38
compiler module being deprecated does not mean that it's not used. For example, it is used by RestrictedPython as part of Zope2.

As I could not figure out a way to fix this bug properly, I reverted the following commit and have not had any problem so far (nor my colleagues):
https://hg.python.org/cpython/raw-rev/42faa8054c3d

I have attached the reverted patch to this bug report in case of.
msg364555 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2020-03-18 18:41
Since 2.7 development is done there is no compiler packages anymore to worry about. :)
History
Date User Action Args
2022-04-11 14:57:58adminsetgithub: 64812
2020-03-18 18:41:35brett.cannonsetstatus: open -> closed
resolution: out of date
messages: + msg364555

stage: resolved
2019-03-15 22:27:14BreamoreBoysetnosy: - BreamoreBoy
2015-03-10 14:17:53BreamoreBoysetnosy: + brett.cannon, georg.brandl, ncoghlan, benjamin.peterson
2015-03-10 08:38:35arnausetfiles: + fix_compiler_module_issue_20613.patch
keywords: + patch
messages: + msg237735
2015-03-08 21:16:52BreamoreBoysetnosy: + BreamoreBoy
messages: + msg237569
2014-02-13 03:17:09arnaucreate