classification
Title: LOAD_GLOBAL used to load `None` under certain circumstances
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: DragonFireCK, Horpner, alex, benjamin.peterson, brett.cannon, bruno.dupuis, cvrebert, daniel.urban, georg.brandl, haypo, ikelly, loewis, meador.inge, mrabarnett, ncoghlan, python-dev, steven.daprano, terry.reedy
Priority: normal Keywords: patch

Created on 2012-12-05 19:28 by bruno.dupuis, last changed 2012-12-14 16:39 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
16619-1.patch bruno.dupuis, 2012-12-05 21:47 quick and dirty patch. Just proves the point. It solves the issue for < 32700 bytes functions. review
const-node.patch benjamin.peterson, 2012-12-06 00:46 review
Messages (14)
msg176999 - (view) Author: Bruno Dupuis (bruno.dupuis) Date: 2012-12-05 19:28
We found some strange behaviour of the compiler in this discussion on python-list: http://mail.python.org/pipermail/python-list/2012-December/636104.html

The fact is, `return` and `return None` result in inconsistent bytecode depending on the context.

Consider :

>>> import dis
>>> def f(x):
...     return None
... 
>>> dis.dis(f)
  2           0 LOAD_CONST               0 (None) 
              3 RETURN_VALUE         
>>> def g(x):
...     return None
...     print(x)
... 
>>> dis.dis(g)
  2           0 LOAD_GLOBAL              0 (None) 
              3 RETURN_VALUE         

  3           4 LOAD_GLOBAL              1 (print) 
              7 LOAD_FAST                0 (x) 
             10 CALL_FUNCTION            1 (1 positional, 0 keyword pair) 
             13 POP_TOP              

`return None` statement results in LOAD_GLOBAL 0 if there is some unreachable code after it. I first saw that as an optimization issue, but Ian Kelly's message http://mail.python.org/pipermail/python-list/2012-December/636117.html gives an extensive analysis and some examples:

"""
  I think this should even be considered a bug, not just a missing
  optimization.  Consider:

  >>> globals()['None'] = 42
  >>> def f(x):
  ...     return None
  ...     print(x)
  ...
  >>> f('test')
  42

  The use of the LOAD_GLOBAL allows None to effectively be reassigned.

"""

Ian also points out in this message that `return` and `return None` don't result in the same bytecode when followed by trash code.
msg177007 - (view) Author: Matthew Barnett (mrabarnett) * Date: 2012-12-05 20:45
The same problem occurs with both `False` and `True`.
msg177008 - (view) Author: Bruno Dupuis (bruno.dupuis) Date: 2012-12-05 20:59
To me, the whole issue is at line ~ 480 in peehole.c in the LOAD_NAME/LOAD_GLOBAL switch case.

This explains the difference between `return` and `return None` as the former is actually compiled to LOAD_CONST. OTOH, `return None` has to pass the optim pass to be changed in LOAD_CONST. The real bug is sometime it doesn't.
msg177009 - (view) Author: Bruno Dupuis (bruno.dupuis) Date: 2012-12-05 21:10
line 426 in peehole.c :

if (codestr[codelen-1] != RETURN_VALUE)
    goto exitUnchanged;

before the optim. I'm quite sure it's here. i patch and see...
msg177011 - (view) Author: Bruno Dupuis (bruno.dupuis) Date: 2012-12-05 21:47
This first patch spots the issue, but doesn't solve it if the bytecode of the function is > 32700 (see PyCode_Optimize comment). However with this patch, we get the LOAD_CONST anytime for None, True and False.

There is two questions :

1- is it safe to strip all the code after RETURN_VALUE as the patch does?

2- to correct this bug, we will need a deep refactoring of PyCode_Optimize (so that it accepts any code length).

The other way, is not to rely on PyCode_Optimize to compile return None/True/False, but do modifictations in the compiler itself. This must be the right way to do this, but it's far beyond my C skills and python core knowledge.
msg177013 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-12-05 22:39
(2.6 is security fix only)

Stripping truly dead code after return is really tricky in general because
a) it might be in a conditional block, and 
b) unreachable yield and assignment can affect compilation.
Assignments that make names local are detected on a first pass, but I do not know about yield. Anyway, I believe the policy has been to not do it.

The final patch will need a cpython-only test based on g(x), with dead code.

Possibly intersecting issues are proposals to change where optimization is done and, for testing, to add a generator to dis so its output can be directly captured and analyzed instead of having to redirect, capture, and parse stdout.
msg177017 - (view) Author: Bruno Dupuis (bruno.dupuis) Date: 2012-12-06 00:37
We definitely need to put the code that loads constants with LOAD_CONST out of the optimization code. It's not optim, it's a language feature: None *is* a 'singleton' constant.

I'm trying to figure out how to change compile.c to achieve this, as it's my first dive into the compiler code, it's not that easy.

Another approch is to strip unreachable nodes in AST, but

a) it's quite complex, as Terry said

b) it solves only this particular bug, not the general assertion "None, True and False are reserved words bound to constants. It can not ever be loaded with LOAD_NAME or LOAD_GLOBAL"
msg177018 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-12-06 00:46
Here's what I think we should do for 3.4. Nick, care to commment?
msg177023 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-12-06 02:55
First reaction is +1 for finally switching to real constant nodes in the AST for 3.4+. This is an inherited behaviour from 2.x where these were ordinary names rather than true keywords, so we weren't able to completely disallow overwriting them.

As a smaller impact change for earlier versions, we should be able to do something in compiler_nameop [1] that picks up the 3 singleton names and allows only LOAD_CONST, erroring out otherwise.

http://hg.python.org/cpython/file/default/Python/compile.c#l2625
msg177024 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2012-12-06 02:56
Nick, None was a constant even in 2k
msg177026 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-12-06 04:09
True and False weren't, though (and even None wasn't a proper keyword)
msg177027 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-12-06 04:17
The difference in the errors below is the reason the systematic fix in Benjamin's patch simply wasn't practical in 2.x (as it would have required a complex deprecation dance to turn None, True and False into real keywords):

Python 2.7.3 (default, Jul 24 2012, 10:05:38) 
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class C: pass
... 
>>> C.None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: class C has no attribute 'None'
>>> 

Python 3.2.3 (default, Jun  8 2012, 05:36:09) 
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class C: pass
... 
>>> C.None
  File "<stdin>", line 1
    C.None
         ^
SyntaxError: invalid syntax
msg177046 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-12-06 15:37
I claim that this is not a bug in the existing versions. It is documented as "illegal" to assign to None:

http://docs.python.org/2.7/library/constants.html?highlight=none#None

So what exactly happens if you manage to bypass the existing checks is implementation-specific (or, in the C sense, "undefined behavior"). Therefore, the reaction of Python 2.7 (e.g.) is perfectly fine.
msg177062 - (view) Author: Roundup Robot (python-dev) Date: 2012-12-06 22:41
New changeset 03f92a9f0875 by Benjamin Peterson in branch 'default':
create NameConstant AST class for None, True, and False literals (closes #16619)
http://hg.python.org/cpython/rev/03f92a9f0875
History
Date User Action Args
2012-12-14 16:39:12hayposetnosy: + haypo
2012-12-06 22:41:12python-devsetstatus: open -> closed
resolution: fixed
messages: + msg177062

stage: patch review -> resolved
2012-12-06 15:37:02loewissetnosy: + loewis

messages: + msg177046
versions: - Python 2.7, Python 3.2, Python 3.3
2012-12-06 10:25:08daniel.urbansetnosy: + daniel.urban
2012-12-06 04:17:13ncoghlansetmessages: + msg177027
2012-12-06 04:09:40ncoghlansetmessages: + msg177026
2012-12-06 02:56:14alexsetmessages: + msg177024
2012-12-06 02:55:01ncoghlansetmessages: + msg177023
2012-12-06 00:46:34benjamin.petersonsetfiles: + const-node.patch

messages: + msg177018
2012-12-06 00:37:02bruno.dupuissetmessages: + msg177017
2012-12-06 00:18:32meador.ingesetnosy: + meador.inge
2012-12-05 22:39:06terry.reedysetversions: - Python 2.6
nosy: + brett.cannon, ncoghlan, benjamin.peterson, terry.reedy, georg.brandl

messages: + msg177013

type: behavior
stage: patch review
2012-12-05 21:47:46bruno.dupuissetfiles: + 16619-1.patch
keywords: + patch
messages: + msg177011
2012-12-05 21:35:44cvrebertsetnosy: + cvrebert
2012-12-05 21:24:53DragonFireCKsetnosy: + DragonFireCK

versions: + Python 2.6
2012-12-05 21:11:23alexsetnosy: + alex
2012-12-05 21:10:22bruno.dupuissetmessages: + msg177009
2012-12-05 20:59:26bruno.dupuissetmessages: + msg177008
2012-12-05 20:45:30mrabarnettsetnosy: + mrabarnett
messages: + msg177007
2012-12-05 19:28:31bruno.dupuiscreate