classification
Title: Different behaviours in script run directly and via runpy.run_module
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, eric.araujo, eric.snow, isaiah, jaraco, ncoghlan, serhiy.storchaka, vinay.sajip
Priority: normal
Keywords:

Created on 2012-12-20 13:37 by vinay.sajip, last changed 2020-01-29 00:36 by brett.cannon.

Messages (19)
msg177814 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-12-20 13:37
If a script is run directly, the value of __file__ in it is relative to the current directory. If run via runpy.run_module, the value of __file__ is an absolute path. This is a problem in certain scenarios - e.g. if the script is a distribution's setup.py, a lot of distributions (rightly or wrongly) assume that the __file__ in setup.py will be relative, and mess up if it's absolute.

Example:
# script.py
print(__file__, __name__)

#runscript.py
import runpy
runpy.run_module('script', run_name='__main__')

Example output (2.7):
$ python script.py
('script.py', '__main__')
$ python runscript.py
('/home/vinay/projects/scratch/script.py', '__main__')

Example output (3.2):
$ python3.2 script.py
script.py __main__
$ python3.2 runscript.py
/home/vinay/projects/scratch/script.py __main__
msg177818 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-20 13:53
$ ./python -m script
/home/serhiy/py/cpython/script.py __main__
$ ./python -c "import runpy; runpy.run_path('script.py', run_name='__main__')"
script.py __main__

This looks consistent.
msg177824 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-12-20 14:18
I'd use runpy.run_path if I could, but it's not available on Python 2.6 - that's why I'm using run_module.

A lot of setup.py files out there use __file__ to compute additional package names, package data locations etc. - this can lead to bogus package names computed blindly from paths assumed to be relative, when the setup.py file is run using run_module. I'm not sure what you mean when you say "looks consistent" - ISTM there is a difference, i.e. inconsistency, between a direct run and a run via run_module.
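
[Editorial illustration] The failure mode described in this message can be sketched as follows. This is not taken from any real setup.py; the helper name `package_name_for` and the `src/pkg` layout are hypothetical. The sketch derives a dotted package name from a directory measured against the script's own location, so the result is the same whether `__file__` arrives as `setup.py` or as an absolute path.

```python
import os


def package_name_for(directory, setup_file):
    """Return a dotted package name for `directory`.

    `setup_file` stands in for the value of __file__ inside setup.py; it may
    be relative or absolute. Hypothetical helper for illustration only.
    """
    # Anchor everything to the directory that contains the setup script.
    root = os.path.dirname(os.path.abspath(setup_file))
    # Express the package directory relative to that root, then dot it.
    relative = os.path.relpath(os.path.abspath(directory), root)
    return relative.replace(os.sep, ".")


# Same answer for a relative and an absolute __file__ value.
pkg_dir = os.path.join("src", "pkg")
print(package_name_for(pkg_dir, "setup.py"))
print(package_name_for(pkg_dir, os.path.join(os.getcwd(), "setup.py")))
```

The sketch assumes the current directory is the directory containing the setup script, which is the only layout distutils is said to support later in this thread.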
msg177825 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-20 14:27
I'm just pointing out that run_module() and run_path() differ in the same way as `python -m` and `python`. If you want to change the behavior of run_module(), then you should change the behavior of `python -m` too. And I'm not sure that this change would not break a lot of third-party code.

Try to backport run_path() to 2.6 if you need it.
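
[Editorial illustration] The comparison made in this message can be reproduced in-process with a short sketch. The module name `echo_demo` and the variable `FILE_SEEN` are made up for the example; only `runpy.run_module` and `runpy.run_path` come from the thread. The sketch prints whatever the local interpreter reports and does not claim any particular output for a given Python version.

```python
import os
import runpy
import sys
import tempfile


def compare_file_values():
    """Run one throwaway module both ways and return the two __file__ values."""
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module that just records the __file__ it was given.
        with open(os.path.join(tmp, "echo_demo.py"), "w") as f:
            f.write("FILE_SEEN = __file__\n")
        os.chdir(tmp)
        sys.path.insert(0, tmp)
        try:
            # Located through the import system, like `python -m echo_demo`.
            via_module = runpy.run_module("echo_demo", run_name="__main__")
            # Located by file path, like `python echo_demo.py`.
            via_path = runpy.run_path("echo_demo.py", run_name="__main__")
        finally:
            sys.path.remove(tmp)
            os.chdir(old_cwd)
    return via_module["FILE_SEEN"], via_path["FILE_SEEN"]


module_file, path_file = compare_file_values()
print("run_module:", module_file)
print("run_path:  ", path_file)
```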
msg177884 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-12-21 17:16
FTR, distutils only recommends and supports running “python setup.py”, i.e. relative path in the script’s directory.
msg177888 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-12-21 18:30
> FTR, distutils only recommends and supports running “python setup.py”, i.e. relative path in the script’s directory.

Right, but this behaviour is seen even when the script is in the current directory.
msg177932 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-12-22 10:12
Ah, some glorious (in)consistency here:

$ cat > echo_file.py
print(__file__)

(2.7, old import system)
$ python -c "import echo_file"
echo_file.pyc
$ python -m "echo_file"
/home/ncoghlan/devel/play/echo_file.py
$ python echo_file.py
echo_file.py

(3.2, cache directories)
$ python3 -c "import echo_file"
echo_file.py
$ python3 -m "echo_file"
/home/ncoghlan/devel/play/echo_file.py
$ python3 echo_file.py
echo_file.py

(3.3, new import system)
$ ../py33/python -c "import echo_file"
./echo_file.py
$ ../py33/python -m "echo_file"
/home/ncoghlan/devel/play/echo_file.py
$ ../py33/python echo_file.py
echo_file.py

However, if we change Python's behaviour here, it's more likely to be in the direction of making __file__ reliably absolute, as allowing relative paths in __file__ can cause problems if the current directory ever changes. (I do wonder if this could be the reason nosetests doesn't work with -m, though).
msg177933 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-12-22 10:16
So yes, any code that assumes __main__.__file__ is a relative path is just plain wrong, as Python provides no such guarantee. It may currently be either relative or absolute at the implementation's discretion.

If the status quo ever changes, it would be to switch to requiring that all module __file__ attributes be absolute paths (including in __main__).
msg177960 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-12-23 00:22
I would totally support tossing relative file paths in Python 3.4 as it has been nothing but backwards-compatibility headaches and is essentially wrong as the module's file is not relative to the current directory necessarily.
msg178834 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-01-02 18:28
Just to have this written down somewhere, site.py already goes through and changes __file__ to absolute for modules already imported before it is run, so there is some precedent to not caring about relative file paths.
msg282872 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2016-12-10 20:17
I've found some other inconsistencies with the use of python -m. One is the surprising behavior when running pip:

$ touch logging.py
$ pip --help > /dev/null
$ python -m pip --help > /dev/null
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/__init__.py", line 21, in <module>
    from pip._vendor.requests.packages.urllib3.exceptions import DependencyWarning
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/__init__.py", line 62, in <module>
    from .packages.urllib3.exceptions import DependencyWarning
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/__init__.py", line 27, in <module>
    from . import urllib3
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/urllib3/__init__.py", line 8, in <module>
    from .connectionpool import (
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/urllib3/connectionpool.py", line 35, in <module>
    from .connection import (
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/urllib3/connection.py", line 44, in <module>
    from .util.ssl_ import (
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/urllib3/util/__init__.py", line 20, in <module>
    from .retry import Retry
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pip/_vendor/requests/packages/urllib3/util/retry.py", line 15, in <module>
    log = logging.getLogger(__name__)
AttributeError: module 'logging' has no attribute 'getLogger'

Obviously this is a case of don't create any files that mask stdlib or other modules that your call stack might try to import. But the takeaway is that you can't in general rely on `python -m` to launch behavior comparable to launching a script.


Additionally, this inconsistency led to a [subtle bug in pytest when launched with -m](https://github.com/pytest-dev/pytest/issues/2104). In that ticket, the recommended solution (at least thus far) is "don't use -m". I imagine that pytest (and every other project that exposes a module for launching core behavior) could work around the issue by explicitly removing the cwd from sys.path, but that seems messy.

I imagine it could prove difficult to overcome the backward incompatibilities of changing this behavior now, so I don't have a strong recommendation, but I wanted to share these experiences and get feedback and recommendations.
msg282939 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-12-11 19:23
Maybe we just need to clarify some documentation at this point? All of the differences in semantics make total sense when you realize `-m pkg` is really conceptually shorthand for `import pkg.__main__` (w/ the appropriate __name__ flourishes). When you begin to view it that way then specifying the file path starts to look like the odd way by bypassing import and simply running open() on a file and passing the result to exec() (once again, with the appropriate __name__ flourishes). It also makes the point that -m isn't really shorthand for specifying a script which seems to be where people are getting tripped up by the differences.

So instead of selling `-m` as a way to run a module/package as a script, I say we change the message/documentation to say it does an import with __name__ changed and make specifying a script as the weird reading-a-file-directly thing.
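
[Editorial illustration] The two mental models in this message can be written down as a rough sketch. It is a simplification, not the interpreter's actual implementation: `run_like_script` and `run_like_dash_m` are hypothetical helpers, packages (`pkg.__main__`) are not handled, and `sys.argv` and `sys.modules` are left untouched.

```python
import importlib.util
import os
import sys
import tempfile


def run_like_script(path):
    """Bypass import: read the file, compile it, exec it as __main__."""
    with open(path, "rb") as f:
        code = compile(f.read(), path, "exec")
    namespace = {"__name__": "__main__", "__file__": path}
    exec(code, namespace)
    return namespace


def run_like_dash_m(name):
    """Use the import system to locate the code, then exec it as __main__."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    code = spec.loader.get_code(name)
    namespace = {
        "__name__": "__main__",      # the "__name__ flourish"
        "__file__": spec.origin,     # wherever the import system found it
        "__spec__": spec,
        "__loader__": spec.loader,
    }
    exec(code, namespace)
    return namespace


def demo():
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        # Throwaway module name, chosen only for this sketch.
        with open(os.path.join(tmp, "flourish_demo.py"), "w") as f:
            f.write("SEEN = (__name__, __file__)\n")
        os.chdir(tmp)
        sys.path.insert(0, tmp)
        try:
            return (run_like_script("flourish_demo.py")["SEEN"],
                    run_like_dash_m("flourish_demo")["SEEN"])
        finally:
            sys.path.remove(tmp)
            os.chdir(old_cwd)


script_seen, module_seen = demo()
print("script style:", script_seen)
print("-m style:    ", module_seen)
```

Both runs see `__name__ == "__main__"`; what differs is how the code was located, and therefore which `__file__` value it is handed.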
msg282956 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-12-12 01:42
Something we don't really have anywhere is a short summary of how sys.path[0] gets initialised - it's scattered through the descriptions of the various invocation options in https://docs.python.org/3/using/cmdline.html#interface-options

I figure it's also worth mentioning https://bugs.python.org/issue13475 where we discussed the idea of a "--nopath0" option so folks could easily disable imports from the current directory (-c, -m) or from the script directory (direct execution) without otherwise impacting startup behaviour. (Specifying --nopath0, and then requesting execution of a directory or zipfile would necessarily be an error, since modifying sys.path[0] is an essential part of making those work)
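
[Editorial illustration] In the absence of a short summary, one way to see how sys.path[0] is initialised on a given interpreter is to ask it directly. The sketch below launches child interpreters in the three invocation styles discussed here and reports what each one sees. The file name `probe_mod.py` is hypothetical, and the sketch removes `PYTHONSAFEPATH` from the child environment on the assumption that the default behaviour is what is of interest. No specific output is claimed, since it varies by version and platform.

```python
import ast
import os
import subprocess
import sys
import tempfile

PROBE = "import sys; print(repr(sys.path[0]))"


def probe_sys_path0():
    """Return (results, real_tmp): sys.path[0] as seen per invocation style."""
    # Assumption: we want default behaviour, so drop PYTHONSAFEPATH if set.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "probe_mod.py"), "w") as f:
            f.write(PROBE + "\n")

        def run(*args):
            proc = subprocess.run(
                [sys.executable] + list(args),
                cwd=tmp, env=env, check=True,
                stdout=subprocess.PIPE, universal_newlines=True,
            )
            # The probe prints a repr(); parse the last line back to a str.
            return ast.literal_eval(proc.stdout.strip().splitlines()[-1])

        results = {
            "python probe_mod.py": run("probe_mod.py"),
            "python -m probe_mod": run("-m", "probe_mod"),
            "python -c ...": run("-c", PROBE),
        }
        return results, os.path.realpath(tmp)


results, working_dir = probe_sys_path0()
print("working directory:", working_dir)
for how, value in results.items():
    print("%-22s sys.path[0] = %r" % (how, value))
```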
msg304662 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2017-10-20 16:07
> All of the differences in semantics make total sense when you realize `-m pkg` is really conceptually shorthand for `import pkg.__main__` (w/ the appropriate __name__ flourishes).
> So instead of selling `-m` as a way to run a module/package as a script, I say we change the message/documentation to say it does an import with __name__ changed and make specifying a script as the weird reading-a-file-directly thing.

I agree the explanation makes sense. But is that the design we want?

In my opinion, the most meaningful purpose of `-m` is its ability to run a module/package as a script. I've rarely seen it used as a convenience for executing a Python file, but I've seen it used extensively for providing entry to the canonical behavior for a library:

- pip: python -m pip
- pytest: python -m pytest
- setuptools: python -m easy_install
- virtualenv: python -m virtualenv

[rwt](https://pypi.org/project/rwt) takes advantage of this mechanism as the primary way to invoke a module (because scripts don't provide an API at the Python interpreter level):

    $ rwt -q paver -- -m paver --version
    Paver 1.2.4 

I began using this functionality extensively when on Windows using pylauncher, as it was the only reliable mechanism to invoke behavior with a desired Python interpreter:

    PS> py -3.3 -m pip

If the semantics of -m are by design and need to be retained, it seems to me there's a strong argument for a new parameter, one which works much as -m, but which has the semantics of launching a script (as much as possible). Consider "python -r module" (for Run). Alternatively, Python could distribute a new runner executable whose purpose is specifically for running modules. Consider "pyrun module".

I don't think it will be sufficient to simply add a --nopath0 option, as that would impose that usage on the user... and `python -m modulename` is just at the limit of what syntax I think I can reasonably expect my users to bear for launching a script.

My preference would be to alter the semantics of `-m` with the explicit intention to align it with script launch behavior, or to provide a new concise mechanism for achieving this goal.

Thoughts?
msg304698 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-10-21 08:18
Is there a relevant discrepancy other than __file__ sometimes being absolute?

If code wants to be certain that __file__ is relative to the current directory, it needs to run it through os.path.relpath(). There's no requirement for implementations one way or the other as to whether __file__ is absolute or relative.

If we changed anything in CPython, it would be to make __main__.__file__ always absolute, even for scripts - we already changed plain imports to work that way.
msg304703 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2017-10-21 13:08
The other major difference and the only one that's affected me is the presence of sys.path[0] == ''.
msg304707 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-10-21 15:12
Yeah, that one I definitely think could be improved. Could you file a separate RFE suggesting that we update sys.path[0] based on __main__.__spec__.origin after we look up __main__.__spec__?

That way it will only stay as the current directory if the module being executed is in a subdirectory of the current directory, and will otherwise be updated appropriately for wherever we actually found the main module.
msg305052 - (view) Author: Isaiah Peng (isaiah) Date: 2017-10-26 13:43
Not sure if this has been stated before, but this difference in behavior also has other effects, e.g.

$ python -m test.test_traceback

# Ran 61 tests in 0.449s
# FAILED (failures=5)

This is because the loader associated with the module gets confused: it loaded the original module under its proper name, then the module's name changed to __main__, but the loader is still associated with the old module name, so the call to `get_source` fails.

$ cat > test_m.py
print(__loader__.get_source(__name__))

$ python -m test_m

# ImportError: loader for test_m cannot handle __main__
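
[Editorial illustration] The failure above, together with one possible workaround, can be sketched in-process. The workaround is an assumption of this sketch rather than something proposed in the thread: ask the loader for `__spec__.name`, which keeps the original module name, instead of `__name__`. The module name `demo_m` is hypothetical, and `runpy.run_module` is used here as a stand-in for `python -m`.

```python
import os
import runpy
import sys
import tempfile
import textwrap

# Source of a throwaway module; it records what each lookup produced.
MODULE_SOURCE = textwrap.dedent('''
    try:
        # __name__ is "__main__" here, which the loader does not recognise.
        BY_NAME = __loader__.get_source(__name__)
    except ImportError as exc:
        BY_NAME = "ImportError: %s" % exc
    # __spec__.name still holds the name the module was found under.
    BY_SPEC = __loader__.get_source(__spec__.name)
''')


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_m.py"), "w") as f:
            f.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            return runpy.run_module("demo_m", run_name="__main__")
        finally:
            sys.path.remove(tmp)


result = run_demo()
print("via __name__:     ", result["BY_NAME"].splitlines()[0])
print("via __spec__.name:", "source retrieved" if result["BY_SPEC"] else "nothing")
```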
msg305055 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2017-10-26 14:01
I've filed a separate request here for the sys.path[0] aspect: https://bugs.python.org/issue31874
History
Date                 User              Action  Args
2020-01-29 00:36:22  brett.cannon      set     nosy: - brett.cannon
2017-10-26 14:01:54  jaraco            set     messages: + msg305055
2017-10-26 13:43:11  isaiah            set     nosy: + isaiah
                                               messages: + msg305052
2017-10-21 15:12:36  ncoghlan          set     messages: + msg304707
2017-10-21 13:08:04  jaraco            set     messages: + msg304703
2017-10-21 08:18:44  ncoghlan          set     messages: + msg304698
2017-10-20 16:07:59  jaraco            set     messages: + msg304662
2016-12-12 01:42:04  ncoghlan          set     messages: + msg282956
2016-12-11 19:23:50  brett.cannon      set     messages: + msg282939
2016-12-10 20:17:18  jaraco            set     nosy: + jaraco
                                               messages: + msg282872
2013-01-26 08:32:53  eric.snow         set     nosy: + eric.snow
2013-01-02 18:28:05  brett.cannon      set     messages: + msg178834
2012-12-23 00:22:09  brett.cannon      set     messages: + msg177960
2012-12-22 19:39:27  Arfrever          set     nosy: + Arfrever
2012-12-22 10:16:15  ncoghlan          set     messages: + msg177933
2012-12-22 10:12:00  ncoghlan          set     nosy: + brett.cannon
                                               messages: + msg177932
                                               versions: + Python 3.3
2012-12-21 18:30:42  vinay.sajip       set     messages: + msg177888
2012-12-21 17:16:13  eric.araujo       set     nosy: + eric.araujo
                                               messages: + msg177884
2012-12-20 14:27:02  serhiy.storchaka  set     messages: + msg177825
2012-12-20 14:18:01  vinay.sajip       set     messages: + msg177824
2012-12-20 13:53:36  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg177818
2012-12-20 13:37:02  vinay.sajip       create