Title: python sys.argv argument parsing not clear
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Jonathan Huot, docs@python, ncoghlan, terry.reedy
Priority: normal Keywords:

Created on 2018-03-22 09:21 by Jonathan Huot, last changed 2018-03-24 04:25 by ncoghlan. This issue is now closed.

Messages (3)
msg314239 - Author: Jonathan Huot (Jonathan Huot) Date: 2018-03-22 09:21
Executing python modules with -m can lead to weird sys.argv parsing.

"Argument parsing" section at mention :

- When -m module is used, sys.argv[0] is set to the full name of the located module.

The word "located" is used, but nothing is said about what sys.argv[0] contains while the module is not *yet* "located".

For instance, let's see what sys.argv is in each Python file:

$ cat mainmodule/
import sys; print("{}: {}".format(sys.argv, __file__))
$ cat mainmodule/submodule/
import sys; print("{}: {}".format(sys.argv, __file__))
$ cat mainmodule/submodule/
import sys; print("{}: {}".format(sys.argv, __file__))

Then we call "foobar" with -m:

$ python -m mainmodule.submodule.foobar -o -b
['-m', '-o', '-b']: (..)/mainmodule/
['-m', '-o', '-b']: (..)/mainmodule/submodule/
['(..)/mainmodule/submodule/', '-o', '-b']: (..)/mainmodule/submodule/

We notice that sys.argv[0] is only "-m" until "foobar" is found. This can cause a lot of trouble when we have meaningful processing in which relies on sys.argv to initialize stuff.

IMHO, sys.argv should either be left intact ['-m', 'mainmodule.submodule.foobar', '-o', '-b'], or have an empty first element ['', '-o', '-b'], or contain only the trailing arguments ['-o', '-b'], but it should not be ['-m', '-o', '-b'], which is very confusing.
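
The behavior described above can be reproduced without touching an existing project. The following is only a sketch under stated assumptions: the package name demo_pkg, the submodule demo_pkg.sub and the module tool.py are hypothetical stand-ins, not the files from this report, and PYTHONPATH is set explicitly so the throwaway package is importable regardless of how the interpreter initializes sys.path. Each file prints the sys.argv it sees as one JSON line.

```python
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Body shared by every file: report the module name and the sys.argv it sees.
REPORT = 'import json, sys; print(json.dumps({"name": __name__, "argv": sys.argv}))\n'


def run_demo():
    """Build a throwaway package and run its innermost module with -m."""
    with tempfile.TemporaryDirectory() as tmp:
        sub = Path(tmp, "demo_pkg", "sub")
        sub.mkdir(parents=True)
        # Hypothetical layout: two package __init__.py files plus tool.py.
        for path in (sub.parent / "__init__.py", sub / "__init__.py", sub / "tool.py"):
            path.write_text(REPORT)
        # Assumption: putting tmp on PYTHONPATH is enough to find demo_pkg.
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg.sub.tool", "-o", "-b"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    # One JSON record per file, in the order the files were executed.
    return [json.loads(line) for line in result.stdout.splitlines()]


if __name__ == "__main__":
    for record in run_demo():
        print(record["name"], record["argv"])
```

The two package-level records should show "-m" as the first element, while the record for the module run as __main__ should show the path to tool.py instead.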
msg314350 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-03-24 01:15
Two of your 3 suggested alternatives could lead to bugs. To use your example:
 python -m mainmodule.submodule.foobar -o -b
is a convenient alternative and abbreviation for
 python .../somedir/mainmodule/submodule/ -o -b
The two invocations should give equivalent results and to the extent possible the same result.

[What might be different is the form of argv[0].  In the first case, argv[0] will be the "preferred" form of the path to the python file while in the second, it will be whatever is given.  On Windows, the difference might look like 'F:\\Python\\a\\' versus 'f:/python/a/']

Unless does some evil monkeypatching, it cannot affect the main module unless imported directly or indirectly.  So its behavior should be the same whether imported before or after execution of the main module.  This means that argv must be the same either way (except for argv[0]).  So argv[0:2] must be condensed to one arg before executing __init__.  I don't see that '' is an improvement over '-m'.

Command line arguments are intended for the invoked command.  An file is never the command unless invoked by its full path: "python somepath/".  In such a case, sys.argv access should be within a "__name__ == '__main__':" clause or a function called therein.
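
A minimal sketch of that recommendation, assuming a module that accepts the -o and -b flags from the example; the helper names parse_flags and main are hypothetical. Nothing at import time reads sys.argv, so the module behaves the same whether it is imported or run as the command.

```python
import sys


def parse_flags(argv):
    """Return which of the example's -o and -b flags are present in argv."""
    return {"o": "-o" in argv, "b": "-b" in argv}


def main(argv):
    """Entry point: works only on the argument list it is handed."""
    flags = parse_flags(argv)
    print("o={o} b={b}".format(**flags))
    return flags


if __name__ == "__main__":
    # sys.argv is read only here, when this file is the invoked command.
    main(sys.argv[1:])
```

Passing the argument list explicitly also makes the parsing testable: parse_flags(["-o"]) can be called directly without modifying sys.argv.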
msg314355 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-24 04:25
This is deliberate, and is covered in the documentation at where it says 'If this option is given, the first element of sys.argv will be the full path to the module file (while the module file is being located, the first element will be set to "-m").'

The part in parentheses is the bit that's applicable here.

We're not going to change that, as the interpreter startup relies on checking sys.argv[0] for "-m" and "-c" in order to work out how it's expected to handle sys.path initialization.
Date                 User           Action  Args
2018-03-24 04:25:00  ncoghlan       set     status: open -> closed; resolution: not a bug; messages: + msg314355; stage: resolved
2018-03-24 01:15:22  terry.reedy    set     nosy: + terry.reedy; messages: + msg314350; versions: - Python 3.4, Python 3.5
2018-03-22 23:09:21  brett.cannon   set     nosy: + ncoghlan
2018-03-22 09:21:13  Jonathan Huot  create