classification
Title: argparse does not honor default argument for nargs=argparse.REMAINDER argument
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: bethard, mblahay, paul.j3, rgov
Priority: normal Keywords:

Created on 2018-12-14 15:59 by rgov, last changed 2019-07-18 21:03 by mblahay.

Messages (17)
msg331837 - (view) Author: Ryan Govostes (rgov) Date: 2018-12-14 15:59
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('things', nargs=argparse.REMAINDER, default=['nothing'])
parser.parse_args([])
>>> Namespace(things=[])

Since there were no unparsed arguments remaining, the `default` setting for `things` should have been honored. However it silently ignores this setting.

If there's a reason why this wouldn't be desirable, it should raise an exception that the options aren't compatible.
msg332567 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-12-26 22:49
argparse.REMAINDER matches an empty list of arguments, just like '?' and '*'.  So they are always 'filled', even by `parse_args([])`.

'?' and '*' have some special handling of defaults in this case, see in

    argparse.ArgumentParser._get_values

the two 

    value = action.default

REMAINDER has its own section in the function that does nothing with the default.

I think it should be left as is.
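
The difference described above can be checked side by side. This is only a small sketch: the helper name parse_empty is made up for illustration, and the REMAINDER result is the one reported in this issue.

```python
import argparse


def parse_empty(nargs, default):
    """Build a one-positional parser and parse an empty command line.

    parse_empty is a hypothetical helper, not part of argparse.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('things', nargs=nargs, default=default)
    return parser.parse_args([]).things


# '?' and '*' fall back to the default when nothing is supplied.
optional_value = parse_empty('?', 'nothing')
star_value = parse_empty('*', ['nothing'])

# REMAINDER is "filled" with the empty list, so the default is not used.
remainder_value = parse_empty(argparse.REMAINDER, ['nothing'])

print(optional_value, star_value, remainder_value)
```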
msg341826 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-07 21:10
Ryan, what are the exact steps to reproduce the problem? This is what I get when I run the code you included:

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('things', nargs=argparse.REMAINDER, default=['nothing'])
_StoreAction(option_strings=[], dest='things', nargs='...', const=None, default=['nothing'], type=None, choices=None, help=None, metavar=None)
>>> parser.parse_args([])
Namespace(things=[])
>>> Namespace(things=[])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'Namespace' is not defined
msg341844 - (view) Author: Ryan Govostes (rgov) Date: 2019-05-08 02:07
Just don’t run the last line which is just an echoing of the output of
parser.parse_args() repeated. The Namespace type would need to be imported
if you really wanted to but there’s no point.

msg341938 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-08 20:08
Okay, so the expected output after running parser.parse_args([]) is Namespace(things=['nothing'])
msg341939 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-08 20:15
Ryan, I have reviewed the documentation at https://docs.python.org/3/library/argparse.html#nargs and must admit that I cannot find a definitive answer on what the defined behavior is when no command-line arguments remain. One could certainly argue that the empty list expresses the fact that no remaining arguments were found. One could also argue that, because the remaining arguments form a list of zero or more elements, there cannot by definition be a default.

Can you cite any documentation that would support your claim?
msg342115 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-10 18:19
For the purpose of facilitating continuing conversation, here are two tests that contrast the use of * versus REMAINDER

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.add_argument('baz', nargs='*', default=['nada'])
parser.parse_args('a b c'.split())

Out[7]: Namespace(bar=['b', 'c'], baz=['nada'], foo=['a'])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
parser.add_argument('baz', nargs='*', default=['nada'])
parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.parse_args('a b c'.split())

Out[8]: Namespace(bar=[], baz=['b', 'c'], foo=['a'])

You can see that * and REMAINDER do differ in functionality when they are the last defined argument.
msg342122 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-10 19:21
Here is another take on the issue, this time illustrated through the lens of optional arguments.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs=1,default=['none'])
parser.add_argument('--baz', nargs='*', default=['nada'])
parser.add_argument('--bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.parse_args('--foo a --bar b --baz c'.split())

Out[9]: Namespace(bar=['b', '--baz', 'c'], baz=['nada'], foo=['a'])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs=1,default=['none'])
parser.add_argument('--baz', nargs='*', default=['nada'])
parser.add_argument('--bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.parse_args('--foo a --baz b --bar c'.split())

Out[10]: Namespace(bar=['c'], baz=['b'], foo=['a'])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs=1,default=['none'])
parser.add_argument('--baz', nargs='*', default=['nada'])
parser.add_argument('--bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.parse_args([])

Out[11]: Namespace(bar=['nothing'], baz=['nada'], foo=['none'])

It is important to note that when an optional argument is not present, the default is always used, including for one using nargs=argparse.REMAINDER (test 3). In all three tests, bar is the argument using REMAINDER. In the first test, one can see that when bar isn't the last argument, everything after it, including other arguments, is swept up as arguments of bar. This greedy behavior of REMAINDER is something that * does not share (test 2).
msg342124 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-10 20:12
With optional arguments, whether the default value is used depends on whether the flag is present. With positional arguments, the need for the default seems to be determined in part by whether the argument exists. The fact that * and REMAINDER are zero-to-many in nature adds some ambiguity to the situation. For *, it seems that the positional argument only exists if there is at least one actual argument value that it can consume.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
#parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.add_argument('baz', nargs='*', default=['nada'])
parser.parse_args('a b'.split())

Out[25]: Namespace(baz=['b'], foo=['a'])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
#parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
parser.add_argument('baz', nargs='*', default=['nada'])
parser.parse_args('a'.split())

Out[26]: Namespace(baz=['nada'], foo=['a'])

Meanwhile, the REMAINDER option makes the argument act as if it exists regardless of whether an actual argument value exists.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
#parser.add_argument('baz', nargs='*', default=['nada'])
parser.parse_args('a b'.split())

Out[27]: Namespace(bar=['b'], foo=['a'])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('foo', nargs=1,default=['none'])
parser.add_argument('bar', nargs=argparse.REMAINDER,default=['nothing'])
#parser.add_argument('baz', nargs='*', default=['nada'])
parser.parse_args('a'.split())

Out[28]: Namespace(bar=[], foo=['a'])

To conclude, * and REMAINDER perform similar, but different, roles when used with positional arguments. With edge cases like the ones laid out above, it can be hard to conceptualize what the exact behavior should be. I will recommend that the documentation be updated to convey the following message: "When used with positional arguments, REMAINDER will never use the designated default value list. It will instead return an empty list if there are no values for the argument to consume. If the use of default values is desired, then * must be used."
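
As a rough illustration of that recommendation, here is a sketch of a command-line tool that falls back to a default command by using * instead of REMAINDER. The program name, the argument names, and the '/bin/sh' default are assumptions chosen for the example.

```python
import argparse

# Hypothetical "run IMAGE [COMMAND ...]" style interface.
parser = argparse.ArgumentParser(prog='run')
parser.add_argument('image')
# '*' honors the default when no command words are given.
parser.add_argument('command', nargs='*', default=['/bin/sh'])

with_default = parser.parse_args(['ubuntu'])
explicit = parser.parse_args(['ubuntu', 'echo', 'hi'])

print(with_default.command)
print(explicit.command)
```

One caveat with this approach: unlike REMAINDER, * does not sweep up strings that look like options (for example -l) unless the user separates them with --.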
msg342129 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-10 20:17
Much detail has been provided regarding why the default is ignored when using the REMAINDER option. The request to raise an exception has not yet been addressed. Is there anyone that can provide guidance on whether the combination of:
1. Positional Argument 
2. nargs=REMAINDER
3. default=something  
should raise an exception upon execution of the add_argument method?
msg342150 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2019-05-11 02:24
At the start of parse_known_args, all defaults (except SUPPRESS ones) are placed in the namespace:

        # add any action defaults that aren't present
        for action in self._actions:
            if action.dest is not SUPPRESS:
                if not hasattr(namespace, action.dest):
                    if action.default is not SUPPRESS:
                        setattr(namespace, action.dest, action.default)

At the end of _parse_known_args there's a conditional expression that cleans up remaining defaults that are strings, by passing them through the 'type' callable:

    setattr(namespace, action.dest,
        self._get_value(action, action.default))

Read the comments to see why this default setting is done in two parts.
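
The visible effect of that second step can be sketched as follows. The option names are made up for the example; the point is only that a string default goes through the type callable, while a non-string default is stored as given.

```python
import argparse

parser = argparse.ArgumentParser()
# String default: converted with int() because the option is absent.
parser.add_argument('--count', type=int, default='5')
# Non-string default: stored unchanged, the type callable is not applied.
parser.add_argument('--ids', type=int, nargs='*', default=['1', '2'])

ns = parser.parse_args([])
print(ns.count, ns.ids)
```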

The pattern of defaults in msg342122 with optionals is consistent with that.  If the argument is not provided, the default appears.  

In the first example of that message, the REMAINDER is given all the remaining strings including the '--baz', so there is nothing left to trigger the '--baz' optional argument, and it retains the default.

The difference for positionals is due to how the '*' and '...' are handled in _get_values.  Both may be filled with an empty list of values.

        # when nargs='*' on a positional, if there were no command-line
        # args, use the default if it is anything other than None
        elif (not arg_strings and action.nargs == ZERO_OR_MORE and
              not action.option_strings):
            if action.default is not None:
                value = action.default
            else:
                value = arg_strings
            self._check_value(action, value)

        ....

        # REMAINDER arguments convert all values, checking none
        elif action.nargs == REMAINDER:
            value = [self._get_value(action, v) for v in arg_strings]

In the case of '*', the default is, effectively, placed back on the namespace.  REMAINDER does not - the empty list is put in the namespace.

In _get_positional_kwargs,

        # mark positional arguments as required if at least one is
        # always required
        if kwargs.get('nargs') not in [OPTIONAL, ZERO_OR_MORE]:
            kwargs['required'] = True
        if kwargs.get('nargs') == ZERO_OR_MORE and 'default' not in kwargs:
            kwargs['required'] = True

That last conditional is a little puzzling, but I suspect it has to do with mutually_exclusive_groups.  A '*' positional can be a member of a group if it has a default.

Anyways, we could add a test at this point like:

        if kwargs.get('nargs') == REMAINDER and 'default' in kwargs:
            msg = _("'default' is not allowed with a REMAINDER positional")
            raise TypeError(msg)

But reviewing the code I notice another difference.  'choices' are not honored for REMAINDER.  That makes a lot of sense.  REMAINDER is supposed to be a catch all, documented as something that might be passed on to another parser.  This parser shouldn't be doing anything with those values.

The documentation reads:

argparse.REMAINDER. All the remaining command-line arguments are gathered into a list. This is commonly useful for command line utilities that dispatch to other command line utilities:

I think REMAINDER has another quirk.  It doesn't work as the first (and only?) argument.   There should be a bug/issue to that effect.

https://bugs.python.org/issue17050, argparse.REMAINDER doesn't work as first argument

In https://bugs.python.org/issue17050#msg315716, I suggest removing REMAINDER from the docs.  We can leave the code as is, in case anyone is still using it.  But it is probably too much work to make the code and docs match, both for that issue, and for this.


'*' plus '--' gives almost the same behavior.  So does parse_known_args.
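
A minimal sketch of those two alternatives, with made-up argument names. In the first parser the '--' separator marks where the tail starts and the default is still honored; in the second, parse_known_args hands back the unrecognized strings as a separate list.

```python
import argparse

# Alternative 1: a '*' positional, with '--' marking the start of the tail.
star = argparse.ArgumentParser()
star.add_argument('--verbose', action='store_true')
star.add_argument('command', nargs='*', default=['nothing'])

empty = star.parse_args([])
tail = star.parse_args(['--verbose', '--', 'ls', '-l'])

# Alternative 2: no positional at all; collect whatever is not recognized.
known = argparse.ArgumentParser()
known.add_argument('--verbose', action='store_true')

ns, extras = known.parse_known_args(['--verbose', 'ls', '-l'])
# The caller applies its own fallback when nothing is left over.
command = extras or ['nothing']

print(empty.command, tail.command, command)
```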
msg342196 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2019-05-11 15:02
Let me back off on that last suggestion.  The problems described here and in https://bugs.python.org/issue17050 only apply to a positional.

We could just add a note to the documentation to that effect:

    This nargs is best used as an optional as illustrated, where the user 
    can clearly identify which strings are the remainder.
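
For reference, a short sketch of REMAINDER used as an optional, in line with the results shown in msg342122. The flag name --cmd is an assumption for the example.

```python
import argparse

parser = argparse.ArgumentParser()
# As an optional, the flag itself marks where the remainder begins.
parser.add_argument('--cmd', nargs=argparse.REMAINDER, default=['nothing'])

absent = parser.parse_args([])
present = parser.parse_args(['--cmd', 'ls', '-l'])

print(absent.cmd)   # flag not given, so the default is kept
print(present.cmd)  # everything after --cmd, including '-l'
```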
msg342370 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-13 17:58
Ryan, What say you? Will you be satisfied with the addition of a note in the documentation?
msg343603 - (view) Author: Michael Blahay (mblahay) * Date: 2019-05-27 03:07
Ryan, last chance, do you have any feedback?
msg343710 - (view) Author: Ryan Govostes (rgov) Date: 2019-05-27 23:45
Thanks Michael for all of the examples. After reading them all, I concur that "it can be hard to conceptualize what the exact behavior should be." A documentation change is warranted, at the least.

However the argparse documentation, while great, is dense and it would be easy to overlook a simple comment. And I think the point that is being raised isn't merely a suggestion on how to design a good CLI, but a pitfall that makes the behavior of code non-obvious---it's not something someone would necessarily consult the documentation for while reviewing code.

(By the way, I'm considering CLIs like `docker run` and `ssh` which take an optional command to execute, and when absent, fall back on default behavior.)

So I would prefer a code change that makes it harder to write code that hits this corner case. Potential solutions would be either

(a) making a positional REMAINDER arg with a default value an error, as in Paul's proposed change; or
(b) making a default value with a positional REMAINDER arg 'just work'

I think (a) is the most reasonable. The exception can recommend the use of nargs='*' instead, which makes it actionable. And it is unlikely that the exception would be buried down some untested code path that would end up in released code.
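
To make option (a) concrete without touching argparse itself, the check could be prototyped in a subclass. This is only a sketch: StrictArgumentParser is a hypothetical name, the message wording is an assumption, and the check does not cover arguments added through argument groups.

```python
import argparse


class StrictArgumentParser(argparse.ArgumentParser):
    """Hypothetical subclass; not part of argparse."""

    def add_argument(self, *args, **kwargs):
        # Roughly mirrors argparse's own test for a positional:
        # no name at all, or a single name without a prefix character.
        is_positional = not args or (
            len(args) == 1 and args[0][:1] not in self.prefix_chars
        )
        if (is_positional
                and kwargs.get('nargs') == argparse.REMAINDER
                and 'default' in kwargs):
            raise TypeError(
                "'default' is not allowed with a REMAINDER positional; "
                "use nargs='*' instead")
        return super().add_argument(*args, **kwargs)


parser = StrictArgumentParser()
try:
    parser.add_argument('things', nargs=argparse.REMAINDER,
                        default=['nothing'])
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```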

(Perhaps it's also worth raising an exception when adding a positional argument after a nargs>1 positional argument already exists, but that's another issue.)
msg345916 - (view) Author: Michael Blahay (mblahay) * Date: 2019-06-17 21:31
Need some help searching GitHub to determine the blast radius of the proposed changes. How does one look for instances of argparse.REMAINDER that are used with a default value?
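
One possible approach, offered only as a sketch, is to scan Python source with the ast module rather than with a text search. The function name find_remainder_defaults is hypothetical, and the heuristic only catches literal add_argument calls whose nargs is written as REMAINDER, argparse.REMAINDER, or the literal '...'.

```python
import ast


def find_remainder_defaults(source, filename='<string>'):
    """Return line numbers of add_argument calls that combine a positional
    name, nargs=REMAINDER, and an explicit default (hypothetical helper)."""
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == 'add_argument'):
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        nargs = kwargs.get('nargs')
        is_remainder = (
            (isinstance(nargs, ast.Attribute) and nargs.attr == 'REMAINDER')
            or (isinstance(nargs, ast.Name) and nargs.id == 'REMAINDER')
            or (isinstance(nargs, ast.Constant) and nargs.value == '...')
        )
        # Only string-literal names can be classified statically.
        names = [a.value for a in node.args
                 if isinstance(a, ast.Constant) and isinstance(a.value, str)]
        is_positional = bool(names) and not any(
            n.startswith('-') for n in names)
        if is_remainder and is_positional and 'default' in kwargs:
            hits.append(node.lineno)
    return sorted(hits)


sample = '''
import argparse
p = argparse.ArgumentParser()
p.add_argument('things', nargs=argparse.REMAINDER, default=['nothing'])
p.add_argument('--bar', nargs=argparse.REMAINDER, default=['x'])
p.add_argument('rest', nargs='*', default=['y'])
'''
print(find_remainder_defaults(sample))
```

Running this over a checkout of candidate projects would give a rough count; calls that build their keyword arguments dynamically would be missed.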
msg348129 - (view) Author: Michael Blahay (mblahay) * Date: 2019-07-18 21:03
Ryan, I like option A as well, but it is a breaking change. Unlike in a compiled language where we could output a warning, making the proposed change could bring some software to a grinding halt. For now I'm going to make the documentation change, and this issue can stay open in case a major version change comes along that would allow such a breaking change.
History
Date User Action Args
2019-07-18 21:03:41mblahaysetmessages: + msg348129
versions: + Python 3.9, - Python 2.7, Python 3.6, Python 3.7, Python 3.8
2019-06-17 21:31:09mblahaysetmessages: + msg345916
2019-05-27 23:45:33rgovsetmessages: + msg343710
2019-05-27 03:07:05mblahaysetmessages: + msg343603
2019-05-13 17:58:09mblahaysetmessages: + msg342370
2019-05-11 15:02:17paul.j3setmessages: + msg342196
2019-05-11 02:24:23paul.j3setmessages: + msg342150
2019-05-10 20:17:58mblahaysetmessages: + msg342129
2019-05-10 20:12:08mblahaysetmessages: + msg342124
2019-05-10 19:21:20mblahaysetmessages: + msg342122
2019-05-10 18:19:48mblahaysetmessages: + msg342115
2019-05-08 20:15:15mblahaysetmessages: + msg341939
2019-05-08 20:08:19mblahaysetmessages: + msg341938
2019-05-08 02:07:54rgovsetmessages: + msg341844
2019-05-07 21:10:42mblahaysetnosy: + mblahay
messages: + msg341826
2018-12-26 22:49:30paul.j3setnosy: + paul.j3
messages: + msg332567
2018-12-14 21:57:43terry.reedysetnosy: + bethard
stage: test needed

versions: + Python 3.8, - Python 3.4, Python 3.5
2018-12-14 16:00:12rgovsetversions: + Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7
2018-12-14 15:59:40rgovcreate