classification
Title: Python compiles dead code
Type: performance Stage:
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: loewis Nosy List: abbeyj, collinwinter, jyasskin, loewis, pitrou, rhettinger
Priority: low Keywords: patch

Created on 2009-06-10 03:59 by abbeyj, last changed 2009-06-15 21:34 by loewis. This issue is now closed.

Files
File name Uploaded Description Edit
deadcode.patch abbeyj, 2009-06-10 03:59
Messages (12)
msg89183 - (view) Author: James Abbatiello (abbeyj) Date: 2009-06-10 03:59
Python currently emits bytecode for code that is unreachable (e.g.
following a return statement).  This doesn't hurt anything but it takes
up space doing nothing.

This patch attempts to avoid generating any bytecode in this situation.
 There's an optional warning, enabled with the -r command line switch,
which notifies you if any unreachable code is found.  Running
regrtest.py with this switch produces a bit of noise but also revealed
issue6227 which looks like a real bug.
msg89186 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-10 06:38
We've rejected dead-code elimination in the past (too much effort for
nearly zero benefit).  Will take a look at your patch though.
msg89213 - (view) Author: Collin Winter (collinwinter) * (Python committer) Date: 2009-06-10 16:26
As another data point, Unladen Swallow had to take explicit steps to
deal with this dead code when compiling bytecode to machine code. Since
Python's compiler isn't smart enough to ignore code following a "return"
or "raise" in the same suite, support for that had to percolate into our
compiler.

For me, it's cleanliness issue, not a performance issue. That certainly
lowers the priority, though.

The warning James is adding for dead code detection may also be useful;
it looks to have already found one bug in the stdlib.
msg89216 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-10 16:37
It looks like the patch is extensive and well thought out.  I look
forward to going through it in detail.
msg89396 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-15 07:59
I'm curious to whether this needs to be embedded in the compiler or
whether a standalone tool can be offered.  See
http://code.activestate.com/recipes/277940/ for an example of processing
the bytecodes directly.  FWIW, deadcode can already be detected by some
of the existing code quality tools: 
http://www.doughellmann.com/articles/CompletelyDifferent-2008-03-linters/index.html
.  What does this patch offer beyond what we already have with those tools?
msg89406 - (view) Author: Collin Winter (collinwinter) * (Python committer) Date: 2009-06-15 18:28
Standalone bytecode-modifying tools almost never check that they're
outputting correct bytecode. http://code.activestate.com/recipes/277940/
makes no attempt to check what version of Python it's running under;
running it under Unladen Swallow 2009Q1 would have produced completely
incorrect code because we had modified how opcode arguments were stored.

I don't want to encourage people to write tools that operate over
CPython bytecode. As such code proliferates, the alternate
implementations have to chase it down and ask that it be removed. It
complicates the process of verifying that other implementations are
compatible with CPython.

Like I said earlier in this issue, for me this is more a compiler
cleanliness issue rather than a performance issue.
msg89407 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-15 18:37
Are you looking for dead code detection (like a lint utility) or
automatic dead code removal (modified code)?
msg89408 - (view) Author: James Abbatiello (abbeyj) Date: 2009-06-15 19:08
I should add that the patch doesn't only address dead user-code.  It
also eliminates code that the compiler generates itself but that would
be unreachable at runtime.  Consider the following function:

def foo(x):
  if x:
    raise Something
  else:
    raise SomethingElse


Without the patch this would compile to:
  2           0 LOAD_FAST                0 (x)
              3 POP_JUMP_IF_FALSE       15

  3           6 LOAD_GLOBAL              0 (Something)
              9 RAISE_VARARGS            1
             12 JUMP_FORWARD             6 (to 21)

  5     >>   15 LOAD_GLOBAL              1 (SomethingElse)
             18 RAISE_VARARGS            1
        >>   21 LOAD_CONST               0 (None)
             24 RETURN_VALUE


With the patch this slims down to:

  2           0 LOAD_FAST                0 (x)
              3 POP_JUMP_IF_FALSE       12

  3           6 LOAD_GLOBAL              0 (Something)
              9 RAISE_VARARGS            1

  5     >>   12 LOAD_GLOBAL              1 (SomethingElse)
             15 RAISE_VARARGS            1


So there are benefits even for normal code.

Also the END_FINALLY handling would be really hard to do in a pure
bytecode walker since context from the AST is needed (am I in a
try-finally or try-except or with statement?)
msg89409 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-15 19:48
James, the peepholer currently assumes that codestrings terminate with
RETURN_VALUE.  Is this assumption invalidated by your code?
msg89412 - (view) Author: James Abbatiello (abbeyj) Date: 2009-06-15 21:15
Raymond, I've updated peephole.c to allow code to terminate with any of
RETURN_VALUE, END_FINALLY or RAISE_VARARGS [0-3].  I also added tests to
test_peepholer.py to make sure the peepholer still works in these cases.
msg89413 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-06-15 21:28
Martin, am transferring this to you for adjudication.  I'm -0 on the patch.

On the plus side, the code is well thought-out and reasonably well
tested.  On the minus side, it complexifies the compiler in a way that
may make it more difficult to maintain.

The benefits to the end-user are almost invisible.  

Developers may possibly benefit from the -r command-line argument as a
way of detecting errors in their own code, but this functionality may be
better suited to lint utilities such as pylint which already supports
detection of unreachable code).

We've rejected previous, less sophisticated patches for dead code
elimination on the theory that there was too little pay-off. And at one
point, Guido expressed a disinclination to any efforts to neaten-up the
generated bytecode (he compared it to rearranging assembly language and
thought there was little point to it).
msg89414 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-06-15 21:34
I'm also happy with rejecting the patch, and agree with everything that
has been brought up against it: from an end-user point of view, there is
a negligible-if-any benefit (e.g. if you want to speed up importing, try
to reduce the number of stat() calls instead). For a developer, this
analysis is much better added to pylint.

Thanks for the contribution, anyway - don't get discouraged by this
rejection.
History
Date User Action Args
2009-06-15 21:34:04loewissetstatus: open -> closed
resolution: rejected
messages: + msg89414
2009-06-15 21:28:42rhettingersetassignee: rhettinger -> loewis

messages: + msg89413
nosy: + loewis
2009-06-15 21:15:44abbeyjsetmessages: + msg89412
2009-06-15 19:48:36rhettingersetmessages: + msg89409
2009-06-15 19:08:27abbeyjsetmessages: + msg89408
2009-06-15 18:37:44rhettingersetmessages: + msg89407
2009-06-15 18:28:15collinwintersetmessages: + msg89406
2009-06-15 07:59:07rhettingersetmessages: + msg89396
2009-06-10 16:37:18rhettingersetmessages: + msg89216
2009-06-10 16:26:29collinwintersetmessages: + msg89213
2009-06-10 06:38:47rhettingersetpriority: low

nosy: + rhettinger
messages: + msg89186

assignee: rhettinger
type: performance
2009-06-10 03:59:43abbeyjcreate