classification
Title: argparse.REMAINDER doesn't work as first argument
Type: behavior
Stage: patch review
Components: Documentation
Versions: Python 3.9, Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: dHannasch, docs@python, python-dev, rhettinger, shihai1991
Priority: normal
Keywords: patch

Created on 2013-01-27 08:51 by chris.jerdonek, last changed 2020-02-25 20:33 by python-dev.

Files
File name Uploaded Description
argparse_example.py dHannasch, 2020-02-13 19:25 Script demonstrating what works and what does not work.
Pull Requests
URL Status Linked
PR 18661 open python-dev, 2020-02-25 20:33
Messages (17)
msg180752 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-27 08:51
Works:

>>> import argparse
>>> from argparse import ArgumentParser
>>> p = ArgumentParser(prog='test.py')
>>> p.add_argument('pos')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['abc', '--def'])
Namespace(pos='abc', remainder=['--def'])

Doesn't work:

>>> p = ArgumentParser(prog='test.py')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['--def'])
usage: test.py [-h] ...
test.py: error: unrecognized arguments: --def

This use case comes up, for example, if you would like to extract all the arguments passed to a subparser in order to pass to a different program.
msg187178 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2013-04-17 16:34
The problem isn't with REMAINDER, but with the distinction between optionals and arguments.  If you change '--def' to 'def', the parse should work:

>>> p = ArgumentParser(prog='test.py')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['def'])
Namespace(remainder=['def'])

'--def' would give problems with almost all of the nargs options, especially '*' and '?'.

The issue is that '--def' looks like an optional.  Since it is not found in the defined arguments, it is classed as an unknown extra and skipped (try p.parse_known_args(['--def'])).  All of this takes place before 'REMAINDER' has a chance to look at the argument strings.
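
The classification described above can be observed with parse_known_args(), which hands back the unrecognized strings instead of exiting with an error. A minimal sketch, reusing the parser from the failing example; the variable names are only illustrative:

```python
import argparse

p = argparse.ArgumentParser(prog='test.py')
p.add_argument('remainder', nargs=argparse.REMAINDER)

# parse_known_args() returns (namespace, extras) rather than exiting on
# unrecognized strings, so we can see where '--def' ended up.
ns, extras = p.parse_known_args(['--def'])
print(ns)      # the namespace holding the REMAINDER positional
print(extras)  # strings the parser set aside as unrecognized
```

With the behavior reported in this issue, '--def' should show up in extras while remainder stays empty.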

In http://bugs.python.org/issue9334   I submitted a patch that defines a 'args_default_to_positional' parser option.  If this is True, that unrecognized '--def' would be classed as a 'positional' and would be captured by the REMAINDER.
msg187182 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2013-04-17 17:03
Here's a way of passing an optional-like argument to a subparser:

    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='cmd')
    sub1 = subparsers.add_parser('cmd')
    sub1.add_argument('foo',nargs='*')
    args = parser.parse_args('cmd -- --def 1 2 3'.split())

producing

    Namespace(cmd='cmd', foo=['--def', '1', '2', '3'])

The '--' forces the parser to treat '--def' as a positional.  With nargs=argparse.REMAINDER the '--' itself is kept, giving foo=['--', '--def', ...].
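
The effect of '--' does not depend on subparsers. A minimal sketch with a single top-level positional (the name 'rest' is just a placeholder) shows the same thing:

```python
import argparse

p = argparse.ArgumentParser(prog='test.py')
p.add_argument('rest', nargs='*')

# Everything after the bare '--' is treated as positional,
# even strings that start with '-'. With nargs='*' the '--'
# itself is dropped from the collected values.
ns = p.parse_args(['--', '--def', '1', '2', '3'])
print(ns.rest)
```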

But the following subparser definition would be even better:

   sub1.add_argument('--def', action='store_true')
   sub1.add_argument('rest',nargs='...')

Here the '--def' is handled explicitly, as opposed to being passed on.

You don't need the whole subparsers mechanism if you are just going to pass those arguments (unparsed) to another program.
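
One way to do that, sketched under the assumption that the command name appears exactly once and never as the value of an option: split the argument list at the command by hand and let argparse see only the part in front of it. The helper split_at_command is hypothetical, not part of argparse.

```python
import argparse


def split_at_command(argv, command):
    """Split argv into (strings before command, strings after it)."""
    idx = argv.index(command)  # raises ValueError if the command is absent
    return argv[:idx], argv[idx + 1:]


parser = argparse.ArgumentParser(prog='script.py')
parser.add_argument('--foo')

# Stand-in for sys.argv[1:].
argv = '--foo B cmdname --arg1 XX ZZ --foobar'.split()
own_args, forwarded = split_at_command(argv, 'cmdname')

ns = parser.parse_args(own_args)  # argparse only sees '--foo B'
print(ns.foo, forwarded)          # forwarded is passed on untouched
```

The forwarded list can then be handed to the other program, for example through subprocess.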
msg312173 - (view) Author: Devin Bayer (akvadrako) * Date: 2018-02-14 13:46
Can we at least document that argparse.REMAINDER cannot be used as the first argument?
msg312174 - (view) Author: Devin Bayer (akvadrako) * Date: 2018-02-14 13:48
I don't understand how this is about positionals vs optionals. REMAINDER is supposed to capture everything from that point forward, not just positionals.
msg312179 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-02-14 19:39
This is another expression of the bigger problem of handling arguments that look like flags.  In optparse, 'nargs' (or its equivalent; optparse doesn't handle positionals) controls how many arguments an Option takes, regardless of their form.  In argparse, the distinction between an option (flag) and an argument has priority.  So it is difficult to treat strings like '--def' as a plain argument.  The default behavior is to treat it as an optional flag.

https://bugs.python.org/issue9334
msg312180 - (view) Author: Devin Bayer (akvadrako) * Date: 2018-02-14 19:53
I still don't understand how that corresponds to the described behavior of REMAINDER and what it has to do with this bug.

How can REMAINDER possibly ever work if optionals take priority? However it does when it's not the first argument.
msg312184 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-02-14 23:19
Oops, I see I already mentioned 9334.  Here the parsing sequence is a bit different, and the fix I suggest there would not apply here.  But the underlying issue is still there - the parser has, in its first iteration, determined that the '--def' looks like an optional.  This first scan focuses on the form, not on possible 'nargs' sequences.

In _parse_known_args() it alternates between 'consume_positionals' and 'consume_optional'.

In the docs example:

'--foo B cmd --arg1 XX ZZ'

It finds the '--foo' and parses that as optional, consuming the 'B'

Next it finds 'cmd', and so starts to parse positionals.  Here it pays attention to the REMAINDER, and thus consumes 'cmd' and the following strings.  In other words, once it starts to parse positionals, it can parse as many as match their nargs.

The same applies to the 'abc --def' example given at the start of this question.

But in the 2nd example, with just the REMAINDER and a ['--def'], it doesn't parse any positionals.  The code has this comment:

   # consume any Positionals preceding the next option

There aren't any strings preceding the '--def', so it moves on to parsing this 'optional'.  Yes, I know you think it really is a positional, because you know about the REMAINDER nargs, but the code doesn't know this (or at least doesn't check for that possibility).
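
To make the first scan concrete, here is a toy stand-in for it. This is not argparse's code: pattern_for is a hypothetical helper that only looks at the form of each string, and it ignores details the real parser handles (negative numbers, custom prefix characters, strings containing spaces).

```python
def pattern_for(arg_strings):
    """Label each string by form alone: 'O' flag-like, '-' bare '--', 'A' other."""
    pattern = []
    seen_double_dash = False
    for s in arg_strings:
        if seen_double_dash:
            pattern.append('A')       # after '--' everything is an argument
        elif s == '--':
            pattern.append('-')
            seen_double_dash = True
        elif s.startswith('-') and len(s) > 1:
            pattern.append('O')       # looks like an optional
        else:
            pattern.append('A')
    return ''.join(pattern)


print(pattern_for(['abc', '--def']))                     # 'AO'
print(pattern_for(['--def']))                            # 'O'
print(pattern_for('--foo B cmd --arg1 XX ZZ'.split()))   # 'OAAOAA'
```

In the first and third cases an 'A' precedes the unknown flag, so positional parsing gets a chance to start there. In the second case the pattern begins with 'O', so there is nothing for the positionals to consume before the flag is handled.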

As stressed in 9334, when a dashed string is used in an argument-like slot, there's an inherent ambiguity.  Should the parser treat it as a (potential) error, or accept that the programmer and/or user is going against convention?

https://bugs.python.org/issue9334#msg111691
msg312185 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-02-14 23:35
REMAINDER is not widely used, and probably was not tested thoroughly during development. It works for the example given in the docs.  

A variant, argparse.PARSER ('A...') is widely used.  This is, effectively, REMAINDER ('...') that requires an initial non-optional string.  Sort of what '+' is to '*'.

I suspect REMAINDER is most reliable when used as nargs for an optional, e.g.

    add_argument('--rest', nargs=argparse.REMAINDER)

That way it's clear to everyone (developer, user, and the parser) that the following strings are to be taken as is.

When parsing the command line, clarity should have priority over convenience.
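
A short sketch of that optional form, with '--rest' as the explicit marker; the strings after it are arbitrary examples:

```python
import argparse

p = argparse.ArgumentParser(prog='test.py')
p.add_argument('--rest', nargs=argparse.REMAINDER)

# Once '--rest' is seen, every following string is collected,
# including ones that look like flags.
ns = p.parse_args(['--rest', '--def', 'x'])
print(ns.rest)
```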
msg312187 - (view) Author: Devin Bayer (akvadrako) * Date: 2018-02-15 00:21
This bug is 5 years old and you are arguing what? That it doesn't matter because it's rarely used compared to some undocumented useless alternative?

It's mildly interesting to hear about some implementation detail but I really don't care. I think you're just wasting your time. I want the docs to match the implementation.

But it doesn't matter - argparse is shit anyway - I'll just write my own parser.
msg312255 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-02-16 22:11
A REMAINDER that would work with a flag-like string would be too powerful, too greedy.

    In [64]: p = argparse.ArgumentParser();
    In [65]: p.add_argument('--foo');
    In [66]: p.add_argument('rest', nargs='...');

If the flag is first, its Action works:

    In [67]: p.parse_args('--foo x a b c'.split())
    Out[67]: Namespace(foo='x', rest=['a', 'b', 'c'])

If there's a non-flag string, REMAINDER grabs everything:

    In [68]: p.parse_args('d --foo x a b c'.split())
    Out[68]: Namespace(foo=None, rest=['d', '--foo', 'x', 'a', 'b', 'c'])

Imagine a REMAINDER could act with '--foo' as the first string.  In[67] would then parse as Out[68] but without the 'd'.  

In documented use 'cmd' acts as a gatekeeper, allowing the REMAINDER to grab the rest.  So does the '--rest' flag in:

     p.add_argument('--rest', nargs='...')

Double dash is another gatekeeper:

    In [69]: p.parse_args('-- --foo x a b c'.split())
    Out[69]: Namespace(foo=None, rest=['--', '--foo', 'x', 'a', 'b', 'c'])

If you don't want such a gatekeeper, why use argparse at all?  Why not use sys.argv[1:] directly?

So some sort of warning about the limitations of REMAINDER would be good.  But the trick is to come up with something that is clear but succinct. The argparse documentation is already daunting to beginners.  

A closed request to document the argparse.PARSER option:
https://bugs.python.org/issue16988

A closed request to document '...'
https://bugs.python.org/issue24647

There was also an issue asking to treat unrecognized flags as plain arguments.  I don't recall the status of that issue.  With that, REMAINDER could grab '--bar a b c', but 'fail' with '--foo a b c'.  It would be interesting to test such a variation, but that would be even harder to document.
msg315711 - (view) Author: Alden (aldencolerain) Date: 2018-04-24 18:13
Paul.  This is a bug, not a feature in argparse.  Devin is 100% correct.  According to the docs REMAINDER should be greedy and is used for passing arguments to sub commands.  In your example the expected behavior is that if you do put "d --foo x a b c", --foo is None and rest gets everything.  We shouldn't need to use a gatekeeper or resort to manually parsing the remainder arguments.  It also shouldn't take 5 years to acknowledge that it needs to be fixed.  I'm happy to make a patch if it's a bandwidth issue.  Am I misunderstanding and you feel like it's not possible to fix?  I guess if there is a backward compatibility issue then we need to write a new option that does literally return the remainder arguments as documented.
msg315716 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2018-04-24 23:30
Since this feature is buggy, and there isn't an easy fix, we should probably remove any mention of it from the docs.  We can still leave it as an undocumented legacy feature.

There is precedent for leaving `nargs` constants undocumented.  `argparse.PARSER` ('A...') is used by the subparser mechanism, but is not documented.  https://bugs.python.org/issue16988
msg361968 - (view) Author: (dHannasch) Date: 2020-02-13 19:25
I've attached a file that can be run, but it's a simple script that I can include here inline, too:


"""
Context:
I am trying to set up a cookiecutter so that newly-created packages will come with a Jupyter notebook users can play with.
That is, python -m package_name jupyter would open up a Jupyter quickstart notebook demonstrating the package's features.
argparse.REMAINDER as the first argument isn't important for a top-level parser, since we can work around it by not using argparse at all,
but using argparse.REMAINDER in a subparser seems like a pretty straightforward use case.
Any time we want to dispatch a subcommand to a separate tool --- forwarding all following arguments --- we're going to need it.
"""

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('command', default='cmdname')
parser.add_argument('cmdname_args', nargs=argparse.REMAINDER)
args = parser.parse_args('cmdname --arg1 XX ZZ --foobar'.split())
if args != argparse.Namespace(cmdname_args=['--arg1', 'XX', 'ZZ', '--foobar'], command='cmdname'):
    raise Exception(args)
print('This is how argparse.REMAINDER works when there is an argument in front.')

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
parser.add_argument('command', default='cmdname')
parser.add_argument('cmdname_args', nargs=argparse.REMAINDER)
args = parser.parse_args('--foo B cmdname --arg1 XX ZZ --foobar'.split())
if args != argparse.Namespace(cmdname_args=['--arg1', 'XX', 'ZZ', '--foobar'], command='cmdname', foo='B'):
    raise Exception(args)
print('This is how argparse.REMAINDER works when there is an option in front.')

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
subparsers = parser.add_subparsers(dest='command')
commandParser = subparsers.add_parser('cmdname')
commandParser.add_argument('--filler-boundary-marker', dest='cmdname_args', nargs=argparse.REMAINDER)
args = parser.parse_args('--foo B cmdname --filler-boundary-marker --arg1 XX ZZ --foobar'.split())
if args != argparse.Namespace(cmdname_args=['--arg1', 'XX', 'ZZ', '--foobar'], command='cmdname', foo='B'):
    raise Exception(args)
print('This is how argparse.REMAINDER works with a visible "filler" name for the list of arguments.')

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
subparsers = parser.add_subparsers(dest='command')
commandParser = subparsers.add_parser('cmdname')
commandParser.add_argument('--filler-boundary-marker', dest='cmdname_args', nargs=argparse.REMAINDER)
args = parser.parse_args('cmdname --filler-boundary-marker --arg1 XX ZZ --foobar --foo B'.split())
if args != argparse.Namespace(cmdname_args=['--arg1', 'XX', 'ZZ', '--foobar', '--foo', 'B'], command='cmdname', foo=None):
    raise Exception(args)
print("If an optional argument is provided after cmdname instead of before, it will get interpreted as part of the argparse.REMAINDER instead of normally. And that's great! We don't even need to be paranoid about other functions of our command-line tool sharing arguments with the tool we want to wrap. Everything will be forwarded.")

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
subparsers = parser.add_subparsers(dest='command')
commandParser = subparsers.add_parser('cmdname')
commandParser.add_argument('positional_arg')
commandParser.add_argument('cmdname_args', nargs=argparse.REMAINDER)
args = parser.parse_args('cmdname can_put_anything_here --arg1 XX ZZ --foobar --foo B'.split())
if args != argparse.Namespace(cmdname_args=['--arg1', 'XX', 'ZZ', '--foobar', '--foo', 'B'], command='cmdname', positional_arg='can_put_anything_here', foo=None):
    raise Exception(args)
print("If an optional argument is provided after cmdname instead of before, it will get interpreted as part of the argparse.REMAINDER instead of normally. And that's great! We don't even need to be paranoid about other functions of our command-line tool sharing arguments with the tool we want to wrap. Everything will be forwarded.")

"""
Note that this means we can fix the bug simply by,
whenever the cmdname subparser is invoked and the cmdname subparser uses argparse.REMAINDER,
automatically adding an imaginary first positional argument to the subparser
and inserting that imaginary first positional argument into the stream before parsing the arguments to cmdname.
https://github.com/python/cpython/blob/master/Lib/argparse.py#L1201
(Obviously it would be better to fix the underlying cause.)
"""

print('What we want to do is have a subparser that, in the case of one particular selection, forwards all following arguments to another tool.')
print('script.py --foo B cmdname --arg1 XX ZZ --foobar should dispatch to cmdname --arg1 XX ZZ --foobar.')
parser = argparse.ArgumentParser()
parser.add_argument('--foo')
subparsers = parser.add_subparsers(dest='command')
commandParser = subparsers.add_parser('cmdname')
commandParser.add_argument('cmdname_args', nargs=argparse.REMAINDER)
parser.parse_args('--foo B cmdname --arg1 XX ZZ --foobar'.split())
# error: unrecognized arguments: --arg1
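
A possible interim workaround for this last case, which is not the fix proposed above but a different technique: give the subparser no REMAINDER argument at all and call parse_known_args() on the top-level parser, so the strings the subparser does not recognize come back as a list. This is a sketch under the assumption that the subparser defines no options of its own (anything it does recognize, including -h, would be consumed rather than forwarded).

```python
import argparse

parser = argparse.ArgumentParser(prog='script.py')
parser.add_argument('--foo')
subparsers = parser.add_subparsers(dest='command')
subparsers.add_parser('cmdname')  # deliberately no arguments of its own

# Unrecognized strings seen by the subparser are returned as the
# second element instead of triggering an error.
ns, forwarded = parser.parse_known_args(
    '--foo B cmdname --arg1 XX ZZ --foobar'.split())
print(ns.command, ns.foo, forwarded)
```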
msg362006 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-02-15 06:12
I concur with Paul's suggestion.
msg362041 - (view) Author: (dHannasch) Date: 2020-02-16 00:52
Okay. Would it be all right if I submit a fix to get it working at least in the subparser case?
msg362249 - (view) Author: hai shi (shihai1991) * Date: 2020-02-19 05:12
> Okay. Would it be all right if I submit a fix to get it working at least in the subparser case?

Hi, dHannasch. According to Raymond's and Paul's opinions, you could try to create a PR to update argparse's doc.
History
Date User Action Args
2020-02-25 20:33:42 python-dev set keywords: + patch; nosy: + python-dev; pull_requests: + pull_request18022; stage: test needed -> patch review
2020-02-19 05:12:14 shihai1991 set nosy: + shihai1991; messages: + msg362249
2020-02-16 00:52:16 dHannasch set messages: + msg362041
2020-02-15 06:12:53 rhettinger set versions: + Python 3.8, Python 3.9, - Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7; nosy: + docs@python; messages: + msg362006; assignee: docs@python; components: + Documentation, - Library (Lib)
2020-02-13 19:25:10 dHannasch set files: + argparse_example.py; versions: + Python 3.5, Python 3.6, Python 3.7; nosy: + rhettinger, dHannasch, - bethard, chris.jerdonek, paul.j3, danielsh, aldencolerain; messages: + msg361968
2018-04-24 23:30:19 paul.j3 set messages: + msg315716
2018-04-24 18:13:31 aldencolerain set nosy: + aldencolerain; messages: + msg315711
2018-02-16 22:11:49 paul.j3 set messages: + msg312255
2018-02-15 00:22:10 akvadrako set nosy: - akvadrako
2018-02-15 00:21:14 akvadrako set messages: + msg312187
2018-02-14 23:35:27 paul.j3 set messages: + msg312185
2018-02-14 23:19:05 paul.j3 set messages: + msg312184
2018-02-14 19:53:19 akvadrako set messages: + msg312180
2018-02-14 19:39:50 paul.j3 set messages: + msg312179
2018-02-14 13:48:23 akvadrako set messages: + msg312174
2018-02-14 13:46:11 akvadrako set messages: + msg312173
2018-02-14 13:44:31 akvadrako set nosy: + akvadrako
2015-09-29 22:14:47 danielsh set nosy: + danielsh
2013-04-17 17:03:22 paul.j3 set messages: + msg187182
2013-04-17 16:34:46 paul.j3 set nosy: + paul.j3; messages: + msg187178
2013-01-27 08:51:01 chris.jerdonek create