classification
Title: compiler: ignore constants used as statements (don't emit LOAD_CONST+POP_TOP)
Type: Stage: commit review
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Aaron.Meurer, Jim.Jewett, georg.brandl, jayvdb, python-dev, rhettinger, serhiy.storchaka, vstinner, yselivanov
Priority: normal Keywords: patch

Created on 2016-01-26 01:00 by vstinner, last changed 2017-02-08 10:57 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
compiler.patch vstinner, 2016-01-26 01:00 review
Messages (17)
msg258939 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-26 01:00
The bytecode compilers ignores ast.Str and ast.Num nodes:

----------------------------
>>> def func():
...     123
...     "test"
... 
>>> import dis; dis.dis(func)
  3           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE
----------------------------

But other ast nodes which are constant are not ignored:

----------------------------
>>> def func2():
...     b'bytes'
...     (1, 2)
... 
>>> import dis; dis.dis(func2)
  2           0 LOAD_CONST               1 (b'bytes')
              3 POP_TOP

  3           4 LOAD_CONST               4 ((1, 2))
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE
----------------------------

I don't understand the point of loading a constant and then unload it (POP_TOP).


Attached patch changes the compiler to not emit LOAD_CONST+POP_TOP anymore.


My patch only affects constants. Example with the patch:
----------------------------
>>> def f():
...  x
... 
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (x)
              3 POP_TOP
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE
----------------------------

The compiler still emits "LOAD_GLOBAL x" for the instruction "x".

Ignoring the Python instruction "x" would change the Python semantics, because the function would not raise a NameError anymore if x is not defined.

Note: I noticed this inefficient bytecode while working on the issue #26146 (add ast.Constant).
msg258946 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2016-01-26 04:51
The patch looks alright.

Will the following code compile OK?  What will it compile to?

   if 1:
      42
msg258968 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-26 19:25
LGTM.

It looks to me that this optimization was added to avoid spending executing time for docstrings. Other cases almost never occur in real code and are not worth to be optimized. But the patch makes the code cleaner (it would even more cleaner if collapse all kinds of constants in Constant).
msg258976 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-26 22:01
Serhiy: "It looks to me that this optimization was added to avoid spending executing time for docstrings."

Hum, let me dig Mercurial history.
----
changeset:   39364:8ef3f8cf90e1
branch:      legacy-trunk
user:        Neal Norwitz <nnorwitz@gmail.com>
date:        Fri Aug 04 05:09:28 2006 +0000
files:       Lib/test/test_code.py Misc/NEWS Python/compile.c
description:
Bug #1333982: string/number constants were inappropriately stored
in the byte code and co_consts even if they were not used, ie
immediately popped off the stack.
---

https://bugs.python.org/issue1333982#msg26659:
"For reference, an optimization that got lost: (...) 'a' is the docstring, but the 'b' previously did not show up anywhere in the code object.  Now there is the LOAD_CONST/POP_TOP pair."

Ah, it was a regression introduced by the new AST compiler. But the change introduced a new optimization: numbers are now also ignored. In Python 2.4 (before AST), numbers were not ignored:
---
>>> def f():
...  "a"
...  1
...  "b"
... 

>>> import dis; dis.dis(f)
  3           0 LOAD_CONST               1 (1)
              3 POP_TOP             
              4 LOAD_CONST               2 (None)
              7 RETURN_VALUE        
---


If we continue to dig deeper, before AST, I found:
---
changeset:   4991:8276916e1ea8
branch:      legacy-trunk
user:        Guido van Rossum <guido@python.org>
date:        Fri Jan 17 21:04:03 1997 +0000
files:       Python/compile.c
description:
Add co_stacksize field to codeobject structure, and stacksize argument
to PyCode_New() argument list.  Move MAXBLOCKS constant to conpile.h.

Added accurate calculation of the actual stack size needed by the
generated code.

Also commented out all fprintf statements (except for a new one to
diagnose stack underflow, and one in #ifdef'ed out code), and added
some new TO DO suggestions (now that the stacksize is taken of the TO
DO list).
---

This patch added the following code to com_expr_stmt() in Python/compile.c:

+       /* Forget it if we have just a doc string here */
+       if (NCH(n) == 1 && get_rawdocstring(n) != NULL)
+               return;

I'm unable to find the exact part of the compiler which ignores strings in statement expressions.
msg258977 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-26 22:05
"""
Will the following code compile OK?  What will it compile to?

   if 1:
      42
"""

compile OK: what do you mean? This code doesn't make any sense to me :-)


Currently text strings statements are ignored:
---
>>> def f():
...  if 1:
...   "abc"
... 
>>> import dis
>>> dis.dis(f)
  3           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE
---


But byte strings emit a LOAD_CONST+POP_TOP:
---
>>> def g():
...  if 1:
...   b'bytes'
... 
>>> dis.dis(g)
  3           0 LOAD_CONST               1 (b'bytes')
              3 POP_TOP
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE
---


With my patch, all constants statements will be ignored. Example with my patch:
---
>>> def h():
...  if 1:
...   b'bytes'
... 
>>> import dis; dis.dis(h)
  3           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE
---
msg258979 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-26 22:09
Serhiy: "It looks to me that this optimization was added to avoid spending executing time for docstrings. Other cases almost never occur in real code and are not worth to be optimized. But the patch makes the code cleaner (it would even more cleaner if collapse all kinds of constants in Constant)."

Oh, I don't really care of performance. The bytecode just doesn't make any sense to me. I don't understand why we load a constant.

Maybe the compiler should emit a warning to say that the code doesn't make sense at all and is ignored?

Example with GCC:

$ cat x.c 
int main()
{
    1;
}


$ gcc x.c -Wall -o x
x.c: In function 'main':
x.c:3:5: warning: statement with no effect [-Wunused-value]
     1;
     ^
msg258981 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-26 22:33
> Maybe the compiler should emit a warning to say that the code doesn't make sense at all and is ignored?

Oh ok, now I recall a similar issue that I posted 3 years ago: issue #17516, "Dead code should be removed".

Example of suspicious code:

def func():
   func2(),

func() calls func2() and then create a tuple of 1 item with the func2() result. See my changeset 33bdd0a985b9 for examples in the Python source code. The parser or compiler should maybe help to developer to catch such strange code :-)

In some cases, the code really contains code explicitly dead:

def test_func():
   return
   do_real_stuff()

do_real_stuff() is never called. Maybe it was a deliberate choice, maybe it was a mistake in a very old code base, bug introduced after multiple refactoring, and high turn over in a team? Again, we should emit a warning on such case?

Hum, these warnings have a larger scope than this specific issue (don't emit LOAD_CONST for constants in expressions).

See also the related thread on python-dev for the specific case of triple quoted strings ("""string"""): https://mail.python.org/pipermail/python-dev/2013-March/124925.html

It was admitted that it's a convenient way to insert a comment and it must not emit a warning (at least, not by default?)
msg259775 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-02-07 09:38
+1
msg259862 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-02-08 17:12
New changeset a0d053899ff8 by Victor Stinner in branch 'default':
Simplify main() of test_ast
https://hg.python.org/cpython/rev/a0d053899ff8

New changeset bcf27fa55632 by Victor Stinner in branch 'default':
Replace noop constant statement with expression
https://hg.python.org/cpython/rev/bcf27fa55632
msg259864 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-08 17:23
changeset:   100192:4bdb21380743
tag:         tip
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Mon Feb 08 18:17:58 2016 +0100
files:       Lib/test/test_ast.py Lib/test/test_code.py Lib/test/test_grammar.py Misc/NEWS Python/compile.c
description:
compiler now ignores constant statements

The compile ignores constant statements and emit a SyntaxWarning warning.

Don't emit the warning for string statement because triple quoted string is a
common syntax for multiline comments.

Don't emit the warning on ellipis neither: 'def f(): ...' is a legit syntax for
abstract functions.

Changes:

* test_ast: ignore SyntaxWarning when compiling test statements. Modify
  test_load_const() to use assignment expressions rather than constant
  expression.
* test_code: add more kinds of constant statements, ignore SyntaxWarning when
  testing that the compiler removes constant statements.
* test_grammar: ignore SyntaxWarning on the statement "1"
msg259865 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-08 17:27
I changed my patch to emit a SyntaxWarning. If too many users complain of the warning, maybe we can remove it. IMHO it's useful to detect bugs.
msg259876 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2016-02-08 19:14
Shouldn't the message be "constant statement ignored"?  The current wording reads strange to me.
msg259890 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-02-08 21:47
New changeset 15531b10976c by Victor Stinner in branch 'default':
compiler: don't emit SyntaxWarning on const stmt
https://hg.python.org/cpython/rev/15531b10976c
msg259895 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-08 22:20
> Shouldn't the message be "constant statement ignored"?  The current wording reads strange to me.

I removed the warning ;)
msg259959 - (view) Author: Jim Jewett (Jim.Jewett) (Python triager) Date: 2016-02-09 22:38
I think the warning was helpful; it just had confusing wording.

Instead of: """
>>> def f():
...  False
...
<stdin>:2: SyntaxWarning: ignore constant statement
"""

perhaps: """
>>> def f():
...  False
...
<stdin>:2: SyntaxWarning: ignoring constant statement
"""

or even: """
>>> def f():
...  False
...
<stdin>:2: SyntaxWarning: ignoring unused constant 'False'
"""
msg259989 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-10 07:11
Sorry you are late :-) I started a thread on python-dev and it was decided
to let linters handle this warning.
msg287297 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-02-08 10:57
FYI the thread was in February 2016:
https://mail.python.org/pipermail/python-dev/2016-February/143163.html
"[Python-Dev] Issue #26204: compiler now emits a SyntaxWarning on constant statement"
History
Date User Action Args
2017-02-08 10:57:31vstinnersetmessages: + msg287297
2016-02-10 07:11:26vstinnersetmessages: + msg259989
2016-02-09 22:38:05Jim.Jewettsetnosy: + Jim.Jewett
messages: + msg259959
2016-02-08 22:20:42vstinnersetmessages: + msg259895
2016-02-08 22:07:54Aaron.Meurersetnosy: + Aaron.Meurer
2016-02-08 21:47:54python-devsetmessages: + msg259890
2016-02-08 19:24:49jayvdbsetnosy: + jayvdb
2016-02-08 19:14:47georg.brandlsetnosy: + georg.brandl
messages: + msg259876
2016-02-08 17:27:38vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg259865

title: compiler: ignore constants used as statements? (don't emit LOAD_CONST+POP_TOP) -> compiler: ignore constants used as statements (don't emit LOAD_CONST+POP_TOP)
2016-02-08 17:23:07vstinnersetmessages: + msg259864
2016-02-08 17:12:07python-devsetnosy: + python-dev
messages: + msg259862
2016-02-07 09:38:47rhettingersetnosy: + rhettinger
messages: + msg259775
2016-01-26 22:39:17vstinnersettitle: compiler: don't emit LOAD_CONST instructions for constant statements? -> compiler: ignore constants used as statements? (don't emit LOAD_CONST+POP_TOP)
2016-01-26 22:33:02vstinnersetmessages: + msg258981
2016-01-26 22:09:08vstinnersetmessages: + msg258979
2016-01-26 22:05:19vstinnersetmessages: + msg258977
2016-01-26 22:01:33vstinnersetmessages: + msg258976
2016-01-26 19:25:56serhiy.storchakasetnosy: + serhiy.storchaka

messages: + msg258968
stage: commit review
2016-01-26 04:51:05yselivanovsetmessages: + msg258946
2016-01-26 01:01:06vstinnersettitle: compiler: don't emit LOAD_CONST instructions for constant expressions? -> compiler: don't emit LOAD_CONST instructions for constant statements?
2016-01-26 01:00:40vstinnercreate