classification
Title: argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.4, Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: bethard, chris.jerdonek, gfxmonk, leonard.gerard, martin.panter, paul.j3, tshepang, waltermundt
Priority: normal Keywords: patch

Created on 2012-06-20 04:15 by waltermundt, last changed 2016-04-04 03:27 by martin.panter.

Files
File name Uploaded Description
argparse_fix_empty_nargs_star.patch waltermundt, 2012-06-22 21:54 Proposed fix
15112.patch bethard, 2012-07-22 21:53
mixed.patch paul.j3, 2013-08-25 02:11
Messages (13)
msg163249 - Author: Walter Mundt (waltermundt) Date: 2012-06-20 04:15
Test case:

    from argparse import *

    parser = ArgumentParser()
    parser.add_argument('-x', action='store_true')
    parser.add_argument('y')
    parser.add_argument('z', nargs='*')

    print(parser.parse_args('yy -x zz'.split()))

The result is that the "z" argument is left empty and "zz" is reported as unrecognized, which produces an error.  Changing 'nargs' to '+' works in this case, but then leaving 'zz' off produces an error.
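For comparison, here is a small sketch of the two workarounds mentioned above: putting the option first, and switching to nargs='+'. The helper name `build_parser` is hypothetical and exists only for this illustration. The straddled nargs='*' case itself is left out because its outcome depends on whether the running Python includes a fix for this issue.

```python
import argparse

def build_parser(z_nargs):
    """Build the parser from the report, with a configurable nargs for 'z'."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-x', action='store_true')
    parser.add_argument('y')
    parser.add_argument('z', nargs=z_nargs)
    return parser

# Option first: the positionals are contiguous, so '*' picks up 'zz'.
option_first = build_parser('*').parse_args('-x yy zz'.split())

# nargs='+' cannot match zero strings, so 'z' waits for 'zz' after '-x'.
plus_straddled = build_parser('+').parse_args('yy -x zz'.split())

# ...but '+' makes 'z' mandatory, so leaving 'zz' off is an error.
try:
    build_parser('+').parse_args('yy -x'.split())
    plus_without_z_failed = False
except SystemExit:
    plus_without_z_failed = True

print(option_first)
print(plus_straddled)
print(plus_without_z_failed)
```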
msg163497 - Author: Walter Mundt (waltermundt) Date: 2012-06-22 21:54
Attached is a patch to fix this bug by deferring the matching of a nargs='*' argument against a zero-length pattern until the end of the argument list.

I believe it is maximally conservative: it should not change the behavior of any existing parser, except that inputs like the test in the initial report, which previously left arguments unrecognized, are now accepted.
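To illustrate what "matching against a zero-length pattern" means, here is a simplified model, not argparse's actual code. It assumes the command line is reduced to a string of 'A' (argument) and 'O' (option) characters and that each positional contributes one regex group. The names `NARGS_PATTERNS` and `match_counts` are hypothetical, and the regex fragments are deliberately simpler than the real ones.

```python
import re

# Simplified stand-ins for argparse's per-argument regex fragments.
# 'A' marks a positional-looking string and 'O' an option string.
NARGS_PATTERNS = {None: '(A)', '*': '(A*)', '+': '(A+)'}

def match_counts(nargs_list, pattern):
    """Return how many strings each positional would take, dropping
    trailing positionals until the combined regex matches a prefix."""
    for end in range(len(nargs_list), 0, -1):
        regex = ''.join(NARGS_PATTERNS[n] for n in nargs_list[:end])
        match = re.match(regex, pattern)
        if match is not None:
            return [len(group) for group in match.groups()]
    return []

# 'yy -x zz' is seen as 'AOA'; matching stops at the 'O'.
star_counts = match_counts([None, '*'], 'AOA')  # '*' matches zero strings
plus_counts = match_counts([None, '+'], 'AOA')  # '+' is left for later

print(star_counts)
print(plus_counts)
```

In this model the '*' positional is consumed with an empty match before the option is seen, while the '+' positional stays available for the strings that follow the option.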
msg166173 - Author: Steven Bethard (bethard) * (Python committer) Date: 2012-07-22 21:53
Your patch is a good start, but it needs to handle all the related situations, e.g. nargs='?' and the possibility of having more than one zero-length argument at the end.

I believe the following patch achieves this. Please test it out.
msg178481 - Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-12-29 08:50
Was argparse ever supposed to support inputs of the form given in the example (i.e. different positional arguments straddling optional arguments): 'yy -x zz'?

The usage string shows up as: "usage: test.py [-h] [-x] y [z [z ...]]"  The original example seems to work with the current code if given as: '-x yy zz'.

Also, substituting argparse.REMAINDER for '*' in the original example gives the following both with and without the patch:

Namespace(x=False, y='yy', z=['-x', 'zz'])

That doesn't seem consistent with straddling being supported.

Lastly, passing just '-x' gives the following error with and without the patch (z should be optional):

error: the following arguments are required: y, z
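The REMAINDER substitution described above can be reproduced with a short script. This is a sketch of the reported behavior, using the same parser as the original test case with only the nargs of 'z' changed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-x', action='store_true')
parser.add_argument('y')
parser.add_argument('z', nargs=argparse.REMAINDER)

# REMAINDER collects everything after 'yy', including the '-x' string,
# so '-x' is never processed as an option.
remainder_result = parser.parse_args('yy -x zz'.split())
print(remainder_result)
```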
msg196066 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-08-24 07:05
I was surprised to discover that “option straddling” doesn’t work this way with nargs="*". It seems to work fine with most other kinds of positional arguments I have tried, and I imagine that this was by design rather than accident. Many GNU CLI programs support it as well (e.g. “cp file1 file2 --verbose dir/”).

I assumed nargs=argparse.REMAINDER was intended to override the “option straddling”. Otherwise, by just going off the documentation it sounds like nargs=argparse.REMAINDER is much the same as nargs="*".
msg196112 - Author: paul j3 (paul.j3) * Date: 2013-08-25 02:11
I originally posted this on http://bugs.python.org/issue14191, but I think it belongs here.  The patch I proposed is similar to bethard's, popping items off the end of 'arg_counts'.  I intend to check whether the logic is equivalent.

----------------------------
copy from http://bugs.python.org/msg187051
----------------------------
This patch permits the mixing of optionals with positionals, with the caveat that a particular positional cannot be split up.

If:

    parser = ArgumentParser()
    parser.add_argument('-f','--foo')
    parser.add_argument('cmd')
    parser.add_argument('rest', nargs='*')

    '-f1 cmd 1 2 3', 
    'cmd -f1 1 2 3', 
    'cmd 1 2 3 -f1' 

all give {cmd='cmd', rest=['1','2','3'], foo='1'}.  

But 'cmd 1 -f1 2 3' does not recognize ['2','3'].

Previously 'cmd -f1 1 2 3' would return rest=[], and not recognize ['1','2','3'].  With this change the nargs='*' behaves more like nargs='+', surviving to parse the 2nd group of positional strings.

The trick is to modify arg_counts in consume_positionals(), removing matches that don't do anything (don't consume argument strings). 

    if 'O' in arg_strings_pattern[start_index:]:
        # if there is an optional after this, remove
        # 'empty' positionals from the current match
        while len(arg_counts)>1 and arg_counts[-1]==0:
            arg_counts = arg_counts[:-1]

This change passes all of the existing test_argparse.py tests.  It also passes the optparse tests that I added in http://bugs.python.org/issue9334#msg184987
I added 4 cases to illustrate this change.
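The four command lines discussed above can be compared with `parse_known_args`, which returns unrecognized strings instead of raising an error. This is a sketch: the split between `rest` and the leftover list for 'cmd -f1 1 2 3' depends on whether the running Python includes a fix for this issue, so no output is shown for it. The `results` dictionary is a name introduced only for this illustration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', '--foo')
parser.add_argument('cmd')
parser.add_argument('rest', nargs='*')

command_lines = [
    '-f1 cmd 1 2 3',
    'cmd -f1 1 2 3',
    'cmd 1 2 3 -f1',
    'cmd 1 -f1 2 3',
]

results = {}
for line in command_lines:
    # extras holds the strings that no argument accepted
    namespace, extras = parser.parse_known_args(line.split())
    results[line] = (namespace.rest, extras)
    print(line, '->', 'rest:', namespace.rest, 'unrecognized:', extras)
```

In every case the strings '1', '2', '3' end up either in `rest` or in the unrecognized list, which makes it easy to see where a positional was closed too early.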
msg196434 - Author: paul j3 (paul.j3) * Date: 2013-08-28 23:14
These three changes end up testing for the same thing. The initial 'if' catches different test cases.  'subparsers' or 'remainder' might 'confuse' the 'O' test.  The zero-width test ends up weeding out everything but the test cases added for this issue.

    # if we haven't hit the end of the command line strings,
    if start_index + sum(arg_counts) != len(arg_strings_pattern):
        while arg_counts and arg_counts[-1] == 0: 
            arg_counts.pop()

    # same test using selected_pattern (= arg_strings_pattern[start_index:])
    if len(selected_pattern) != sum(arg_counts):
        while arg_counts and arg_counts[-1] == 0: 
            arg_counts.pop()

    # alt test: test for optional in the remaining pattern
    if 'O' in selected_pattern:
        while arg_counts and arg_counts[-1] == 0: 
            arg_counts.pop()
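The three variants can be tried outside argparse with a standalone sketch. The function name `trim_trailing_empty` and the `test` selector are hypothetical. The inputs mimic the values in `consume_positionals()`: a list of per-positional match lengths and the 'A'/'O' pattern string.

```python
def trim_trailing_empty(arg_counts, arg_strings_pattern, start_index, test):
    """Drop zero-length matches from the end of arg_counts when the
    chosen test says more of the command line is still to come."""
    selected_pattern = arg_strings_pattern[start_index:]
    conditions = {
        'index': start_index + sum(arg_counts) != len(arg_strings_pattern),
        'length': len(selected_pattern) != sum(arg_counts),
        'optional': 'O' in selected_pattern,
    }
    counts = list(arg_counts)  # leave the caller's list untouched
    if conditions[test]:
        while counts and counts[-1] == 0:
            counts.pop()
    return counts

tests = ('index', 'length', 'optional')

# 'yy -x zz': the empty match for 'z' is dropped, so 'z' stays available.
straddled = {t: trim_trailing_empty([1, 0], 'AOA', 0, t) for t in tests}

# 'yy' alone: nothing follows, so 'z' keeps its empty match.
at_end = {t: trim_trailing_empty([1, 0], 'A', 0, t) for t in tests}

print(straddled)
print(at_end)
```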
msg221999 - Author: paul j3 (paul.j3) * Date: 2014-07-01 00:14
I believe http://bugs.python.org/issue14174 with REMAINDER has its roots in the same issue - parse_args tries to process as many positionals as it can at a time, regardless of what's left in the argument strings.

The fix proposed here depends on the 2nd argument taking 0 strings.  REMAINDER, on the other hand, grabs everything that's left, leaving none for the optionals.
msg226315 - Author: leonard gerard (leonard.gerard) Date: 2014-09-03 19:01
In my opinion this is a bug, or else the limitation should be stated explicitly in the generated usage help string.
msg226316 - Author: leonard gerard (leonard.gerard) Date: 2014-09-03 19:02
It seems that delaying positional argument parsing until after all optional arguments are parsed would be clean and robust.

My understanding is that optional arguments aren't ambiguous and should be processed first and removed from the arguments. Then the current pattern matching done for positional arguments would work well (in one try).

If you think this would be a better patch I can give it a try.
msg226320 - Author: paul j3 (paul.j3) * Date: 2014-09-03 20:45
http://bugs.python.org/issue14191
'argparse doesn't allow optionals within positionals'

implements a 'parse_intermixed_args()' method, which parses all of the optionals with one pass, followed by a second pass that handles the positionals.  It does this by temporarily deactivating the positionals for the first pass.  It emulates the optparse behavior (with the added ability to parse positionals).

This is too big of a change to ever become the default behavior for argparse.
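A method of this name, `ArgumentParser.parse_intermixed_args()`, is available in the standard library from Python 3.7 onward. The sketch below assumes such a version and uses the parser from the earlier 'cmd'/'rest' example. It shows the one input that the patch in this issue does not handle: a positional split by an option.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', '--foo')
parser.add_argument('cmd')
parser.add_argument('rest', nargs='*')

# Optionals are collected first, then the positionals are matched
# against the remaining strings in a single pass.
intermixed = parser.parse_intermixed_args('cmd 1 -f1 2 3'.split())
print(intermixed)
```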
msg243453 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-18 03:32
Closed Issue 24223 as a duplicate. I understand the patch here also fixes the case of an --option before an optional positional argument using nargs="?"; is that right?
msg262842 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-04 03:27
Playing with Steven and Paul’s patches, they both seem to work well. Paul’s seems to have debug printing included, which should be removed. I confirmed both patches also seem to address the nargs="?" case (Issue 24223).

At the moment I have zero knowledge of how the argparse internals work. Are there any advantages of one patch over the other?
History
Date User Action Args
2016-04-04 03:27:11 martin.panter set messages: + msg262842; stage: patch review
2015-05-18 03:32:57 martin.panter set messages: + msg243453
2015-05-18 03:29:16 martin.panter link issue24223 superseder
2014-09-03 20:45:33 paul.j3 set messages: + msg226320
2014-09-03 19:02:35 leonard.gerard set messages: + msg226316
2014-09-03 19:01:48 leonard.gerard set nosy: + leonard.gerard; messages: + msg226315
2014-07-01 00:14:34 paul.j3 set messages: + msg221999
2013-08-28 23:14:00 paul.j3 set messages: + msg196434
2013-08-25 02:11:58 paul.j3 set files: + mixed.patch; messages: + msg196112
2013-08-24 21:37:57 paul.j3 set nosy: + paul.j3
2013-08-24 07:05:05 martin.panter set nosy: + martin.panter; messages: + msg196066
2012-12-29 08:50:41 chris.jerdonek set nosy: + chris.jerdonek; messages: + msg178481
2012-12-19 10:17:10 gfxmonk set nosy: + gfxmonk
2012-07-22 21:53:46 bethard set files: + 15112.patch; messages: + msg166173; versions: + Python 3.2, Python 3.3, Python 3.4
2012-06-23 03:58:39 waltermundt set type: behavior; components: + Library (Lib); versions: + Python 2.7
2012-06-22 21:54:43 waltermundt set files: + argparse_fix_empty_nargs_star.patch; keywords: + patch; messages: + msg163497
2012-06-22 18:19:07 tshepang set nosy: + tshepang
2012-06-20 12:30:54 r.david.murray set nosy: + bethard
2012-06-20 04:15:47 waltermundt create