classification
Title: runpy.run_path doesn't set __package__ correctly
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Arfrever, Benjamin.S.Wolf, eric.araujo, ncoghlan, pitrou, python-dev
Priority: normal Keywords:

Created on 2012-07-01 00:21 by Benjamin.S.Wolf, last changed 2012-07-17 11:10 by ncoghlan. This issue is now closed.

Messages (11)
msg164437 - (view) Author: Benjamin S Wolf (Benjamin.S.Wolf) Date: 2012-07-01 00:21
(Python 3.2.3)

1. After discarding the module run_path used to run the code in, all references to variables from local scopes (even if they are references to global variables) are bound to None, preventing any code in functions from running properly.

/tmp/a.py --------------------------------------
FOO = 'bar'

def f():
    print(FOO)

f()
------------------------------------------------
/tmp/b.py --------------------------------------
# Hack needed for:
#  python3 /tmp/b.py,
#  python3 -m /tmp/b
#  runpy.run_path('/tmp/b.py')
from os.path import dirname
__path__ = [dirname(__file__)]
del dirname

# Hack needed for:
#  python3 -m /tmp/b
if __name__ == '__main__' and not __package__:
    __package__ = '__main__'

from . import a

def g():
    print(a.FOO)

g()
------------------------------------------------

~$ python3
>>> import runpy
>>> d = runpy.run_module('/tmp/a')
bar
>>> d2 = runpy.run_path('/tmp/a.py')
bar
>>> d['f']
<function f at 0xb7451b6c>
>>> d['FOO']
'bar'
>>> d['f']()
bar
>>> d2['f']
<function f at 0xb7451bac>
>>> d2['FOO']
'bar'
>>> d2['f']()
None
>>> d3 = runpy.run_path('/tmp/b.py')
bar
bar
>>> d3['g']
<function g at 0xb746e26c>
>>> d3['a']
<module '<run_path>.a' from '/tmp/a.py'>
>>> d3['g']()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/b.py", line 15, in g
    print(a.FOO)
AttributeError: 'NoneType' object has no attribute 'FOO'

Notice that run_module gets this right, as d['f']() prints 'bar' but d2['f']() and d3['g']() do not.


2. run_path pollutes the module namespace when running a module that uses relative imports. This prevents any code that imports the same module in the same manner from running.

Continuing from #1 without having closed the interpreter:
>>> d4 = runpy.run_path('/tmp/b.py')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/runpy.py", line 250, in run_path
    return _run_module_code(code, init_globals, run_name, path_name)
  File "/usr/lib/python3.2/runpy.py", line 83, in _run_module_code
    mod_name, mod_fname, mod_loader, pkg_name)
  File "/usr/lib/python3.2/runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "/tmp/b.py", line 12, in <module>
    from . import a
ImportError: cannot import name a
>>> '<run_path>' in sys.modules
False
>>> '<run_path>.a' in sys.modules
True
>>> d3['a'].f()
bar
>>> del sys.modules['<run_path>.a']
>>> d4 = runpy.run_path('/tmp/b.py')
bar
bar
>>>

run_module, on the other hand, also alters sys.modules, but this does not prevent the module from being run, only from the secondary module from being re-imported:
[Create an empty file /tmp/__init__.py]
>>> sys.path = ['/'] + sys.path
>>> d5 = runpy.run_module('tmp.b')
bar
bar
>>> d6 = runpy.run_module('tmp.b')
bar
>>> d7 = runpy.run_module('tmp.b')
bar
>>> 'tmp' in sys.modules
True
>>> 'tmp.b' in sys.modules
False
>>> 'tmp.a' in sys.modules
True

[This was the only way I could get run_module to run /tmp/b.py, regardless of the presence or lack of the path and __package__ hacks at the top of the file, or any other changes I've experimented with. runpy.run_module('/tmp/b'), runpy.run_module('b') [with '/tmp' in sys.path] would generally result in:
  ValueError: Attempted relative import in non-package
and setting run_name='__main__' alongside any of the other changes would result in:
  ImportError: cannot import name a

python3 /tmp/b.py and python3 -m /tmp/b run fine.]


3. And finally, an examination of the run_path code shows that it doesn't, as the docs indicate, set __package__ to be run_name.rpartition('.')[0], but either the empty string or None: http://hg.python.org/cpython/file/3.2/Lib/runpy.py#l269
msg164733 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-07-06 16:25
“python3 -m /tmp/b” is invalid IIUC.  -m takes a module name, not a path.
msg164817 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-07-07 10:47
Firstly, I think you've identified a real bug with __package__ not being set correctly when using runpy.run_path (and I have updated this issue title accordingly).

I have also created a separate bug report (#15272) for the bizarre behaviour you identified in runpy.run_module - names containing "/" characters should be rejected as invalid. 

One of the main reasons you're having trouble though is that you can only do relative imports when inside a package - at the top level (as both of your modules are) relative imports are illegal. By forcing package to "__main__" you are claiming a location in the package namespace of "__main__.__main__" which doesn't make any sense.

The module life cycle problem for functions is covered in #812369. The only reason you're not hitting it in the run_module case is that when "alter_sys" is False, no temporary module is created. For data attributes (the intended use case for runpy), this all works fine regardless, but functions (which retain a reference to the original module namespace) will only work properly with alter_sys turned off. There should probably be a general disclaimer in the module docs that functions and classes are not guaranteed to be valid after using runpy, and importlib.import_module should be used instead for such cases.

#9235 looks into ways the runpy module might be enhanced with a CodeRunner class for other reasons, but the same mechanism could be used to keep the temporary module alive without storing it in sys.modules.
msg164822 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-07-07 11:17
You may also have identified a bug with pkgutil's import emulation failing to clean up sys.modules correctly when an import fails.
msg164823 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-07-07 11:21
Sorry, that's not accurate - you have enough code in b.py to make the relative import work by convincing the interpreter it's actually being done in a package.

So what you're seeing in that regard is the fact that runpy is not any kind of sandbox - it shares process global state, including the import system, with all other modules. While the temporary module will be reverted automatically by runpy, any child imports will always remain visible in sys.modules, and any other side effects will remain in place (e.g. codec registrations).
msg165450 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-07-14 14:08
New changeset 3b05cf877124 by Nick Coghlan in branch '3.2':
Close #15230: runpy.run_path now sets __package__ correctly. Also refactored the runpy tests to use a more systematic approach
http://hg.python.org/cpython/rev/3b05cf877124

New changeset 8a44e7c0fa30 by Nick Coghlan in branch 'default':
Merge fix for #15230 from 3.2
http://hg.python.org/cpython/rev/8a44e7c0fa30
msg165452 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-07-14 14:38
New changeset 4880aac5c665 by Nick Coghlan in branch '3.2':
Issue #15230: Update runpy docs to clarify a couple of points that came up in this issue
http://hg.python.org/cpython/rev/4880aac5c665

New changeset 416cd57d38cf by Nick Coghlan in branch 'default':
Merge #15230 doc updates from 3.2
http://hg.python.org/cpython/rev/416cd57d38cf
msg165656 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-07-16 21:23
This broke some 3.2 buildbots, e.g.:
http://buildbot.python.org/all/builders/AMD64%20Lion%203.2/builds/25
msg165697 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-07-17 10:30
*grumble*grumble*os-x-and-its-crazy-symlink-as-tmp-dir*grumble*grumble*

I'm fairly sure I've tripped over this kind of thing before, so I believe I know how to fix it (add a realpath() call when figuring out the expected path value).

The buildbots will let me know if I'm right, of course :)
msg165701 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-07-17 10:43
New changeset 07ed744a47f6 by Nick Coghlan in branch '3.2':
Issue #15230: Attempt to make the OS X buildbots happy by resolving the tmp dir symlink in the test suite
http://hg.python.org/cpython/rev/07ed744a47f6

New changeset cb7e8ee489a1 by Nick Coghlan in branch 'default':
Merge Issue #15230 OS X buildbot fix from 3.2
http://hg.python.org/cpython/rev/cb7e8ee489a1
msg165705 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-07-17 11:10
Closing again, since the OS X buildbot was happy with the change on 3.2. If trunk fails (which it shouldn't), that should be filed as a new issue (as it would almost certainly relate to the move from the pkgutil emulation to importlib).
History
Date User Action Args
2012-07-17 11:10:09ncoghlansetstatus: open -> closed

messages: + msg165705
2012-07-17 10:43:53python-devsetmessages: + msg165701
2012-07-17 10:30:47ncoghlansetmessages: + msg165697
2012-07-16 21:23:27pitrousetstatus: closed -> open
nosy: + pitrou
messages: + msg165656

2012-07-14 14:38:58python-devsetmessages: + msg165452
2012-07-14 14:08:03python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg165450

resolution: fixed
stage: resolved
2012-07-11 03:20:29Arfreversetnosy: + Arfrever
2012-07-07 11:21:53ncoghlansetmessages: + msg164823
2012-07-07 11:17:08ncoghlansetmessages: + msg164822
2012-07-07 11:10:33ncoghlansetassignee: ncoghlan
2012-07-07 10:47:44ncoghlansetmessages: + msg164817
title: runpy.run_path broken: Breaks scoping; pollutes sys.modules; doesn't set __package__ -> runpy.run_path doesn't set __package__ correctly
2012-07-06 16:25:11eric.araujosetnosy: + eric.araujo, ncoghlan
messages: + msg164733
2012-07-01 00:21:25Benjamin.S.Wolfcreate