Title: argparse.REMAINDER doesn't work as first argument
Type: behavior
Stage: test needed
Components: Library (Lib)
Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: aldencolerain, bethard, chris.jerdonek, danielsh, paul.j3
Priority: normal
Keywords:

Created on 2013-01-27 08:51 by chris.jerdonek, last changed 2018-04-24 23:30 by paul.j3.

Messages (13)
msg180752 - Author: Chris Jerdonek (chris.jerdonek) (Python committer) Date: 2013-01-27 08:51

>>> import argparse
>>> from argparse import ArgumentParser
>>> p = ArgumentParser(prog='')
>>> p.add_argument('pos')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['abc', '--def'])
Namespace(pos='abc', remainder=['--def'])

Doesn't work:

>>> p = ArgumentParser(prog='')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['--def'])
usage: [-h] ...
error: unrecognized arguments: --def

This use case comes up, for example, if you would like to extract all the arguments passed to a subparser in order to pass to a different program.
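A minimal, self-contained sketch of the two cases above, for anyone who wants to check them on their own interpreter. The helper names (build_parsers, try_parse) and prog='demo' are illustrative and not part of the report. On the versions listed in this issue the second call is reported to exit with an error; other versions may behave differently, so no output is asserted for it.

```python
import argparse


def build_parsers():
    # Parser 1: a plain positional comes before the REMAINDER positional.
    with_leading = argparse.ArgumentParser(prog='demo')
    with_leading.add_argument('pos')
    with_leading.add_argument('remainder', nargs=argparse.REMAINDER)

    # Parser 2: the REMAINDER positional is the first (and only) argument.
    remainder_first = argparse.ArgumentParser(prog='demo')
    remainder_first.add_argument('remainder', nargs=argparse.REMAINDER)
    return with_leading, remainder_first


def try_parse(parser, argv):
    """Return the parsed Namespace, or None if argparse exits with an error."""
    try:
        return parser.parse_args(argv)
    except SystemExit:
        return None


with_leading, remainder_first = build_parsers()
print(try_parse(with_leading, ['abc', '--def']))
print(try_parse(remainder_first, ['--def']))
```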
msg187178 - Author: paul j3 (paul.j3) (Python triager) Date: 2013-04-17 16:34
The problem isn't with REMAINDER, but with the distinction between optionals and arguments.  If you change '--def' to 'def', the parse should work:

>>> p = ArgumentParser(prog='')
>>> p.add_argument('remainder', nargs=argparse.REMAINDER)
>>> p.parse_args(['def'])

'--def' would give problems with almost all of the nargs options, especially '*' and '?'.

The issue is that '--def' looks like an optional.  Since it is not found in the defined arguments, it is classed as an unknown extra and skipped (try p.parse_known_args(['--def'])).  All of this takes place before 'REMAINDER' has a chance to look at the argument strings.

In   I submitted a patch that defines a 'args_default_to_positional' parser option.  If this is True, that unrecognized '--def' would be classed as a 'positional' and would be captured by the REMAINDER.
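The classification described above can be observed with parse_known_args(), which returns unrecognized strings instead of exiting. This is only a sketch; prog='demo' and the variable names are illustrative. Where '--def' ends up (in the namespace or in the extras list) may depend on the Python version, so the sketch prints both rather than asserting one outcome.

```python
import argparse

p = argparse.ArgumentParser(prog='demo')
p.add_argument('remainder', nargs=argparse.REMAINDER)

# A plain string is classed as a positional and captured by REMAINDER.
plain = p.parse_args(['def'])

# parse_known_args() hands back unrecognized strings as a second value,
# which shows what happened to the flag-like '--def'.
namespace, extras = p.parse_known_args(['--def'])

print(plain)
print(namespace, extras)
```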
msg187182 - Author: paul j3 (paul.j3) (Python triager) Date: 2013-04-17 17:03
Here's a way of passing an optional-like argument to a subparser:

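    import argparse

    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='cmd')
    sub1 = subparsers.add_parser('cmd')
    # 'foo' is not defined in the snippet as posted; nargs='*' is assumed here
    sub1.add_argument('foo', nargs='*')
    args = parser.parse_args('cmd -- --def 1 2 3'.split())

    Namespace(cmd='cmd', foo=['--def', '1', '2', '3'])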

The  '--' forces the parser to treat '--def' as a positional.  If nargs='REMAINDER', foo=['--', '--def', ...].

But the following subparser definition would be even better:

   sub1.add_argument('--def', action='store_true')

Here the '--def' is handled explicitly, as opposed to being passed on.

You don't need the whole subparsers mechanism if you are just going to pass those arguments (unparsed) to another program.
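As a rough illustration of that last point, the command name can be peeled off the argument list by hand and everything after it forwarded untouched. The function name split_command and the sample argument list are hypothetical; a real script would pass sys.argv[1:].

```python
def split_command(argv):
    """Split argv into (command, passthrough) without interpreting any flags."""
    if not argv:
        raise ValueError('expected a command name')
    # Everything after the command is kept verbatim, dashes included.
    return argv[0], list(argv[1:])


command, passthrough = split_command('cmd --def 1 2 3'.split())
print(command, passthrough)
```

The passthrough list can then be handed to another program, for example through subprocess, without argparse ever looking at it.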
msg312173 - Author: Devin Bayer (akvadrako) Date: 2018-02-14 13:46
Can we at least document that argparse.REMAINDER cannot be used as the first argument?
msg312174 - Author: Devin Bayer (akvadrako) Date: 2018-02-14 13:48
I don't understand how this is about positionals vs optionals. REMAINDER is supposed to capture everything from that point forward, not just positionals.
msg312179 - Author: paul j3 (paul.j3) (Python triager) Date: 2018-02-14 19:39
This is another expression of the bigger problem of handling arguments that look like flags.  In optparse the 'nargs' (or equivalent; it doesn't handle positionals) controls how many arguments an Option takes, regardless of their form.  In argparse, the distinction between an option (flag) and an argument has priority.  So it is difficult to treat strings like '--def' as a plain argument.  The default behavior is to treat it as an optional flag.
msg312180 - Author: Devin Bayer (akvadrako) Date: 2018-02-14 19:53
I still don't understand how that corresponds to the described behavior of REMAINDER and what it has to do with this bug.

How can REMAINDER possibly ever work if optionals take priority? However it does when it's not the first argument.
msg312184 - Author: paul j3 (paul.j3) (Python triager) Date: 2018-02-14 23:19
Oops, I see I already mentioned 9334.  Here the parsing sequence is a bit different, and the fix I suggest there would not apply here.  But the underlying issue is still there - the parser has, in its first iteration, determined that the '--def' looks like an optional.  This first scan focuses on the form, not on possible 'nargs' sequences.

In _parse_known_args() it alternates between 'consume_positionals' and 'consume_optional'

In the docs example:

'--foo B cmd --arg1 XX ZZ'

It finds the '--foo' and parses that as optional, consuming the 'B'

Next it finds 'cmd', and so starts to parse positionals.  Here it pays attention to the REMAINDER, and thus consumes 'cmd' and the following strings.  In other words, once it starts to parse positionals, it can parse as many as match their nargs.

The same applies to the 'abc --def' example given at the start of this question.

But in the 2nd example, with just the REMAINDER and a ['--def'], it doesn't parse any positionals.  The code has this comment:

   # consume any Positionals preceding the next option

There aren't any strings preceding the '--def', so it moves on to parsing this 'optional'.  Yes, I know you think it really is a positional, because you know about the REMAINDER nargs, but the code doesn't know this (or at least doesn't check for that possibility).

As stressed in 9334, when a dashed string is used in an argument-like slot, there's an inherent ambiguity.  Should the parser treat it as a (potential) error, or accept that the programmer and/or user is going against convention?
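The first scan described above can be approximated with a toy classifier. This is a deliberately simplified model, not argparse's actual code: the function name classify is made up, and the real parser also handles negative numbers, strings containing spaces, custom prefix characters and abbreviations. It marks flag-like strings as 'O' and everything else as 'A', based on form alone.

```python
def classify(arg_strings):
    """Toy first scan: 'O' for flag-like strings, 'A' for plain ones."""
    pattern = []
    after_double_dash = False
    for arg in arg_strings:
        if after_double_dash:
            # Everything after '--' is treated as a plain argument.
            pattern.append('A')
        elif arg == '--':
            after_double_dash = True
            pattern.append('-')
        elif arg.startswith('-') and len(arg) > 1:
            # Looks like a flag, whether or not the parser defines it.
            pattern.append('O')
        else:
            pattern.append('A')
    return ''.join(pattern)


print(classify('--foo B cmd --arg1 XX ZZ'.split()))
print(classify(['abc', '--def']))
print(classify(['--def']))
```

In this model ['abc', '--def'] starts with an 'A', so positional parsing gets a chance to begin, while ['--def'] is a lone 'O' with nothing in front of it.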
msg312185 - Author: paul j3 (paul.j3) (Python triager) Date: 2018-02-14 23:35
REMAINDER is not widely used, and probably was not tested thoroughly during development. It works for the example given in the docs.  

A variant, argparse.PARSER ('A...') is widely used.  This is, effectively, REMAINDER ('...') that requires an initial non-optional string.  Sort of what '+' is to '*'.

I suspect REMAINDER is most reliable when used as nargs for an optional, e.g.

    add_argument('--rest', nargs=argparse.REMAINDER)

That way it's clear to everyone (developer, user, and the parser) that the following strings are to be taken as is.

When parsing the command line, clarity should have priority over convenience.
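A short sketch of the '--rest' suggestion, assuming a second option '--foo' purely for illustration (prog='demo' and the sample command lines are also made up). Everything after '--rest' is expected to be collected as is, including strings that look like flags.

```python
import argparse

p = argparse.ArgumentParser(prog='demo')
p.add_argument('--foo')
p.add_argument('--rest', nargs=argparse.REMAINDER)

# '--foo' appears before '--rest', so it is handled by its own Action.
before = p.parse_args('--foo x --rest a --b c'.split())

# '--foo' appears after '--rest', so it is swallowed as a plain string.
after = p.parse_args('--rest --foo x'.split())

print(before)
print(after)
```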
msg312187 - Author: Devin Bayer (akvadrako) Date: 2018-02-15 00:21
This bug is 5 years old and you are arguing what? That it doesn't matter because it's rarely used compared to some undocumented useless alternative?

It's mildly interesting to hear about some implementation detail but I really don't care. I think you're just wasting your time. I want the docs to match the implementation.

But it doesn't matter - argparse is shit anyway - I'll just write my own parser.
msg312255 - Author: paul j3 (paul.j3) (Python triager) Date: 2018-02-16 22:11
A REMAINDER that would work with a flag-like string would be too powerful, too greedy.

    In [64]: p = argparse.ArgumentParser();
    In [65]: p.add_argument('--foo');
    In [66]: p.add_argument('rest', nargs='...');

If the flag is first, its Action works:

    In [67]: p.parse_args('--foo x a b c'.split())
    Out[67]: Namespace(foo='x', rest=['a', 'b', 'c'])

If there's a non-flag string, REMAINDER grabs everything:

    In [68]: p.parse_args('d --foo x a b c'.split())
    Out[68]: Namespace(foo=None, rest=['d', '--foo', 'x', 'a', 'b', 'c'])

Imagine a REMAINDER could act with '--foo' as the first string.  In[67] would then parse as Out[68] but without the 'd'.  

In documented use 'cmd' acts as a gatekeeper, allowing the REMAINDER to grab the rest.  So does the '--rest' flag in:

     p.add_argument('--rest', nargs='...')

Double dash is another gatekeeper:

    In [69]: p.parse_args('-- --foo x a b c'.split())
    Out[69]: Namespace(foo=None, rest=['--', '--foo', 'x', 'a', 'b', 'c'])

If you don't want such a gatekeeper, why use argparse at all?  Why not use sys.argv[1:] directly?

So some sort of warning about the limitations of REMAINDER would be good.  But the trick is to come up with something that is clear but succinct. The argparse documentation is already daunting to beginners.  

A closed request to document the argparse.PARSER option:

A closed request to document '...'

There was also an issue asking to treat unrecognized flags as plain arguments.  I don't recall the status of that issue.  With that, REMAINDER could grab '--bar a b c', but 'fail' with '--foo a b c'.  It would be interesting to test such a variation, but that would be even harder to document.
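One way to get a gatekeeper without relying on REMAINDER is to split the argument list at the first '--' before argparse sees it. The sketch below is an illustration under that assumption, not a proposed fix; split_at_double_dash is a hypothetical helper and the sample command line is made up.

```python
import argparse


def split_at_double_dash(argv):
    """Return (own_args, passthrough); passthrough follows the first '--'."""
    if '--' in argv:
        index = argv.index('--')
        return argv[:index], argv[index + 1:]
    return list(argv), []


own, passthrough = split_at_double_dash('--foo x -- --foo y a b'.split())

p = argparse.ArgumentParser(prog='demo')
p.add_argument('--foo')
ns = p.parse_args(own)  # only the strings before '--' are parsed

print(ns, passthrough)
```

The parser only ever sees the strings before the '--', so a flag-like string in the passthrough part cannot be mistaken for one of the parser's own options.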
msg315711 - Author: Alden (aldencolerain) Date: 2018-04-24 18:13
Paul.  This is a bug, not a feature in argparse.  Devin is 100% correct.  According to the docs REMAINDER should be greedy and is used for passing arguments to sub commands.  In your example the expected behavior is that if you do put "d --foo x a b c" that --foo is none and args gets everything.  We shouldn't need to use a gatekeeper or resort to manually parsing the remainder arguments.  It also shouldn't take 5 years to acknowledge that it needs to be fixed.  I'm happy to make a patch if it's a bandwidth issue.  Am I misunderstanding, and you feel like it's not possible to fix?  I guess if there is a backward compatibility issue then we need to write a new option that does literally return the remainder arguments as documented.
msg315716 - Author: paul j3 (paul.j3) (Python triager) Date: 2018-04-24 23:30
Since this feature is buggy, and there isn't an easy fix, we should probably remove any mention of it from the docs.  We can still leave it as an undocumented legacy feature.

There is precedent for leaving `nargs` constants undocumented.  `argparse.PARSER` ('A...') is used by the subparser mechanism, but is not documented.
Date                 User            Action  Args
2018-04-24 23:30:19  paul.j3         set     messages: + msg315716
2018-04-24 18:13:31  aldencolerain   set     nosy: + aldencolerain; messages: + msg315711
2018-02-16 22:11:49  paul.j3         set     messages: + msg312255
2018-02-15 00:22:10  akvadrako       set     nosy: - akvadrako
2018-02-15 00:21:14  akvadrako       set     messages: + msg312187
2018-02-14 23:35:27  paul.j3         set     messages: + msg312185
2018-02-14 23:19:05  paul.j3         set     messages: + msg312184
2018-02-14 19:53:19  akvadrako       set     messages: + msg312180
2018-02-14 19:39:50  paul.j3         set     messages: + msg312179
2018-02-14 13:48:23  akvadrako       set     messages: + msg312174
2018-02-14 13:46:11  akvadrako       set     messages: + msg312173
2018-02-14 13:44:31  akvadrako       set     nosy: + akvadrako
2015-09-29 22:14:47  danielsh        set     nosy: + danielsh
2013-04-17 17:03:22  paul.j3         set     messages: + msg187182
2013-04-17 16:34:46  paul.j3         set     nosy: + paul.j3; messages: + msg187178
2013-01-27 08:51:01  chris.jerdonek  create