classification
Title: argparse.REMAINDER fails to parse remainder correctly
Type: behavior Stage: needs patch
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Michael.Edwards, chris.jerdonek, eric.araujo, idank, jason.coombs, paul.j3, rr2do2
Priority: normal Keywords: patch

Created on 2012-03-02 12:08 by rr2do2, last changed 2014-07-02 06:40 by paul.j3.

Files
File name Uploaded Description Edit
bug_argparse.py rr2do2, 2012-03-02 12:09 Reproduction case
test.py Michael.Edwards, 2012-11-30 16:17
issue14174_1.patch paul.j3, 2014-07-02 06:40 review
Messages (10)
msg154761 - (view) Author: Arnout van Meer (rr2do2) Date: 2012-03-02 12:08
Reproduction case is attached and should speak for itself. In short, argparse.REMAINDER behaves inconsistently depending on which defined arguments (if any) come before it.

Tested this on Python 2.7 on OS X, but also grabbed the latest argparse.py from hg to verify against this.
msg154929 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-03-05 06:44
Thanks for the report.  Could you edit your script to add the expected results, for example Namespace(foo=..., command=[...])?
msg170921 - (view) Author: Idan Kamara (idank) Date: 2012-09-21 21:24
I just ran into this issue myself and worked around it by using parse_known_args*.

* http://docs.python.org/library/argparse.html#partial-parsing
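
A minimal sketch of that kind of workaround, assuming the REMAINDER positional is dropped and the leftover strings are taken from parse_known_args instead. The argument names and the sample command line are made up for illustration; they are not from the attached script.

```python
import argparse

# Hypothetical parser: 'app' and '--config' stand in for the real arguments.
parser = argparse.ArgumentParser()
parser.add_argument('app')
parser.add_argument('--config')

# parse_known_args returns (namespace, leftover strings) instead of
# raising an error for strings the parser does not recognise.
args, app_args = parser.parse_known_args(
    ['app', '--config', 'bar', '--verbose', 'x'])

print(args.app, args.config, app_args)
```

Here '--config bar' is parsed as the defined option, and the unrecognised strings end up in app_args rather than being swallowed by a REMAINDER positional.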
msg172232 - (view) Author: Jason R. Coombs (jason.coombs) * (Python committer) Date: 2012-10-06 18:29
I also ran into this problem. I put together this script to reproduce the issue:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('app')
parser.add_argument('--config')
parser.add_argument('app_args', nargs=argparse.REMAINDER)
args = parser.parse_args(['app', '--config', 'bar'])
print(vars(args))
# actual: {'app': 'app', 'app_args': ['--config', 'bar'], 'config': None}
# expected: {'app': 'app', 'app_args': [], 'config': 'bar'}

I'll try using parse_known_args instead.
msg172235 - (view) Author: Idan Kamara (idank) Date: 2012-10-06 18:32
Unfortunately parse_known_args is buggy too: http://bugs.python.org/issue16142
msg176691 - (view) Author: Michael Edwards (Michael.Edwards) Date: 2012-11-30 16:17
I'm attaching my own bug repro script for Eric. Is this sufficient? I can demonstrate the entire resulting Namespace, but the problem is that argparse doesn't even produce a Namespace. The cases I show simply fail.
msg180753 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-27 08:52
See also issue 17050 for a reduced/simple case where argparse.REMAINDER doesn't work (the case of the first argument being argparse.REMAINDER).
msg187204 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-17 20:24
An alternative to Jason's example:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('app')
parser.add_argument('--config')
parser.add_argument('app_args', nargs=argparse.REMAINDER)
# same arguments as Jason's example, but with the optional placed first
args = parser.parse_args(['--config', 'bar', 'app'])
print(vars(args))
# as expected: {'app': 'app', 'app_args': [], 'config': 'bar'}

When you have several positionals, one or more of which may have 0 arguments (*,?,...), it is best to put all of the optional arguments first.  

With 'app --config bar', parse_args identifies an 'AOA' pattern (argument, optional, argument).  It then checks which positional arguments match.  'app' claims 1, 'app_args' claims 2 (REMAINDER means match everything that follows).  That leaves nothing for '--config'.

What you expected was that 'app' would match with the 1st string, '--config' would match the next 2, leaving nothing for 'app_args'.  
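
The matching step described above can be sketched with plain regular expressions. This is a simplified illustration, not argparse itself: the three pattern strings are what I believe argparse builds internally for positionals (quoted from memory, so treat them as an assumption), and the helper `claimed` is hypothetical.

```python
import re

# Assumed per-nargs patterns for positionals.
ONE = '(-*A-*)'          # nargs=None: exactly one argument string
REMAINDER = '([-AO]*)'   # nargs=REMAINDER: anything, flags included
STAR = '(-*[A-]*)'       # nargs='*': argument strings only, no flags


def claimed(patterns, arg_pattern):
    """Return how many strings each positional claims (hypothetical helper)."""
    match = re.match(''.join(patterns), arg_pattern)
    return [len(group) for group in match.groups()]


# ['app', '--config', 'bar'] is classified as 'AOA'.
remainder_counts = claimed([ONE, REMAINDER], 'AOA')
star_counts = claimed([ONE, STAR], 'AOA')

print(remainder_counts)  # [1, 2]: 'app_args' swallows '--config' and 'bar'
print(star_counts)       # [1, 0]: '*' stops at the flag
```

The REMAINDER pattern accepts 'O' characters, which is why it consumes the flag before the optional ever gets a chance to be parsed.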

In http://bugs.python.org/issue14191 I wrote a patch that would give the results you want if 'app_args' uses '*'.  That is, it makes it possible to interleave positional and optional argument strings.  But it does not change the behavior of REMAINDER.

parser.add_argument('app_args', nargs='*')

--------------

Maybe the documentation example for REMAINDER needs to be modified to show just how 'greedy' REMAINDER is.  Adding a:

parser.add_argument('--arg1',action='store_true')

does not change the outcome.  REMAINDER still grabs '--arg1' even though it is a defined argument.


Namespace(arg1=False, args=['--arg1', 'XX', 'ZZ'], command='cmd', foo='B')
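
For reference, a self-contained version of that check. It assumes the documentation example for REMAINDER as I remember it (prog 'PROG', '--foo', 'command', 'args'), with the extra '--arg1' flag added; the variable name `ns` is only for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo')
parser.add_argument('--arg1', action='store_true')  # the added, defined flag
parser.add_argument('command')
parser.add_argument('args', nargs=argparse.REMAINDER)

# '--arg1' appears after the first positional, so REMAINDER takes it.
ns = parser.parse_args('--foo B cmd --arg1 XX ZZ'.split())
print(ns)
```

The flag stays at its default of False and the literal string '--arg1' ends up inside ns.args.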
msg187206 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-17 20:33
By the way, parser.parse_args() uses parse_known_args().  parse_known_args returns a Namespace and a list of unknown arguments.  If that list is empty, parse_args returns the Namespace.  If the list is not empty, parse_args raises an error.

So parse_known_args does not change how arguments are parsed.  It just changes how the unknowns are handled.
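
A small sketch of that relationship, using a made-up parser with a single '--config' option and one stray string on the command line:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--config')

argv = ['--config', 'bar', 'stray']

# parse_known_args hands back the unrecognised strings.
namespace, extras = parser.parse_known_args(argv)

# parse_args parses the same way, but a non-empty leftover list becomes
# an error (argparse prints usage to stderr and raises SystemExit).
try:
    parser.parse_args(argv)
    rejected = False
except SystemExit:
    rejected = True

print(namespace, extras, rejected)
```

Both calls assign '--config' identically; only the treatment of 'stray' differs.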
msg222074 - (view) Author: paul j3 (paul.j3) * Date: 2014-07-02 06:40
Here's a possible solution to the problem (assuming there really is one):

- redefine REMAINDER so it matches a '((?:A[AO]*)?)' pattern (where O is a string that looks like an optional flag, A an argument string).  I've added the condition that the first match (if any) must be an A.  It ends up being closer to the pattern for PARSER.

I included a patch from issue 15112, which delays the consumption of a positional that matches with 0 strings.

In the sample case for this issue, results with this patch are:

    args = parser.parse_args(['app', '--config', 'bar'])
    # Namespace(app='app', app_args=[], config='bar')

    args = parser.parse_args(['--config', 'bar', 'app'])
    # Namespace(app='app', app_args=[], config='bar')

    args = parser.parse_args(['app', 'args', '--config', 'bar'])
    # Namespace(app='app', app_args=['args', '--config', 'bar'], config=None)

In the last case, 'app_args' gets the rest of the strings because the first is a plain 'args'.  I believe this is consistent with the intuition expressed in this issue.
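
The effect of the proposed pattern can be checked in isolation with the re module. This only exercises the regular expression quoted above, not the patch: the '(A)' pattern for 'app' is a simplification, and the helper name is hypothetical.

```python
import re

ONE = '(A)'                           # simplified pattern for the 'app' positional
PROPOSED_REMAINDER = '((?:A[AO]*)?)'  # pattern quoted in the proposal


def remainder_length(arg_pattern):
    """Number of strings the proposed REMAINDER would claim (hypothetical helper)."""
    match = re.match(ONE + PROPOSED_REMAINDER, arg_pattern)
    return len(match.group(2))


flag_first = remainder_length('AOA')   # app --config bar
arg_first = remainder_length('AAOA')   # app args --config bar

print(flag_first)  # 0: the flag is left for the optional to handle
print(arg_first)   # 3: a plain argument comes first, so the rest is taken
```

Because the optional group must start with an A, a flag directly after 'app' is no longer swallowed, while anything following a plain argument still is.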

I've added one test case to test_argparse.TestNargsRemainder.  This is a TestCase that is similar to the above example.

    argument_signatures = [Sig('x'), Sig('y', nargs='...'), Sig('-z')]
    failures = ['', '-z', '-z Z']
    successes = [
        ('X', NS(x='X', y=[], z=None)),
        ('-z Z X', NS(x='X', y=[], z='Z')),
        ('X A B -z Z', NS(x='X', y=['A', 'B', '-z', 'Z'], z=None)),
        ('X Y --foo', NS(x='X', y=['Y', '--foo'], z=None)),
        ('X -z Z A B', NS(x='X', y=['A', 'B'], z='Z')), # new case
    ]

This patch runs test_argparse fine.  But there is a slight possibility that this patch will cause backward compatibility problems.  Some user might expect y=['-z','Z',...].  But that expectation has not been enshrined in test_argparse.

It may require a slight change to the documentation as well.
History
Date User Action Args
2014-07-02 06:40:46 paul.j3 set files: + issue14174_1.patch, keywords: + patch, messages: + msg222074
2013-04-17 20:33:57 paul.j3 set messages: + msg187206
2013-04-17 20:24:49 paul.j3 set nosy: + paul.j3, messages: + msg187204
2013-01-27 08:52:55 chris.jerdonek set nosy: + chris.jerdonek, messages: + msg180753
2012-11-30 16:17:21 Michael.Edwards set files: + test.py, nosy: + Michael.Edwards, messages: + msg176691
2012-10-06 18:32:29 idank set messages: + msg172235
2012-10-06 18:29:28 jason.coombs set nosy: + jason.coombs, messages: + msg172232
2012-09-21 21:24:01 idank set nosy: + idank, messages: + msg170921
2012-03-05 06:44:26 eric.araujo set versions: + Python 3.3, nosy: + eric.araujo, messages: + msg154929, stage: needs patch
2012-03-02 12:27:24 rr2do2 set files: - worker.py
2012-03-02 12:26:56 rr2do2 set files: + worker.py
2012-03-02 12:10:04 rr2do2 set files: - worker.py
2012-03-02 12:09:55 rr2do2 set files: + bug_argparse.py
2012-03-02 12:09:11 rr2do2 set files: + worker.py
2012-03-02 12:08:44 rr2do2 set files: - worker.py
2012-03-02 12:08:09 rr2do2 create