Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional #59317

Open
waltermundt mannequin opened this issue Jun 20, 2012 · 14 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@waltermundt
Copy link
Mannequin

waltermundt mannequin commented Jun 20, 2012

BPO 15112
Nosy @cjerdonek, @vadmium
Files
  • argparse_fix_empty_nargs_star.patch: Proposed fix
  • 15112.patch
  • mixed.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2012-06-20.04:15:47.146>
    labels = ['type-bug', 'library']
    title = "argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional"
    updated_at = <Date 2018-06-30.21:19:15.871>
    user = 'https://bugs.python.org/waltermundt'

    bugs.python.org fields:

    activity = <Date 2018-06-30.21:19:15.871>
    actor = 'paul.j3'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2012-06-20.04:15:47.146>
    creator = 'waltermundt'
    dependencies = []
    files = ['26097', '26485', '31457']
    hgrepos = []
    issue_num = 15112
    keywords = ['patch']
    message_count = 14.0
    messages = ['163249', '163497', '166173', '178481', '196066', '196112', '196434', '221999', '226315', '226316', '226320', '243453', '262842', '320808']
    nosy_count = 9.0
    nosy_names = ['bethard', 'gfxmonk', 'devurandom', 'chris.jerdonek', 'tshepang', 'martin.panter', 'paul.j3', 'waltermundt', 'leonard.gerard']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue15112'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']

    @waltermundt
    Copy link
    Mannequin Author

    waltermundt mannequin commented Jun 20, 2012

    Test case:

        from argparse import *
    
        parser = ArgumentParser()
        parser.add_argument('-x', action='store_true')
        parser.add_argument('y')
        parser.add_argument('z', nargs='*')
    print parser.parse_args('yy -x zz'.split(' '))
    

    The result of this is that the "z" option is unfilled, and the "zz" argument is unrecognized, resulting in an error. Changing the 'nargs' to '+' works in this case, but results in errors if the 'zz' is left off.

    @waltermundt
    Copy link
    Mannequin Author

    waltermundt mannequin commented Jun 22, 2012

    Attached is a patch to fix this bug by deferring matching of nargs='*' argument against a zero-length pattern until the end of the arguments list.

    I believe that it ought to be maximally conservative in that it should not change the behavior of any existing parsers except in cases where it allows them to accept arguments they would previously have left unrecognized in cases like the test in the initial report.

    @waltermundt waltermundt mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 23, 2012
    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Jul 22, 2012

    Your patch is a good start, but it needs to handle all the related situations, e.g. nargs='?' and the possibility of having more than one zero-length argument at the end.

    I believe the following patch achieves this. Please test it out.

    @cjerdonek
    Copy link
    Member

    Was argparse ever supposed to support inputs of the form given in the example (i.e. different positional arguments straddling optional arguments): 'yy -x zz'?

    The usage string shows up as: "usage: test.py [-h] [-x] y [z [z ...]]" The original example seems to work with the current code if given as: '-x yy zz'.

    Also, substituting argparse.REMAINDER for '*' in the original example gives the following both with and without the patch:

    Namespace(x=False, y='yy', z=['-x', 'zz'])

    That doesn't seem consistent with straddling being supported.

    Lastly, passing just '-x' gives the following error with and without the patch (z should be optional):

    error: the following arguments are required: y, z

    @vadmium
    Copy link
    Member

    vadmium commented Aug 24, 2013

    I was surprised to discover that “option straddling” doesn’t work this way with nargs="*". It seems to work fine with most other kinds of positional arguments I have tried, and I imagine that this was by design rather than accident. Many Gnu CLI programs also tend to support it as well (e.g. “cp file1 file2 --verbose dir/”).

    I assumed nargs=argparse.REMAINDER was intended to override the “option straddling”. Otherwise, by just going off the documentation it sounds like nargs=argparse.REMAINDER is much the same as nargs="*".

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Aug 25, 2013

    I originally posted this on http://bugs.python.org/issue14191, but I think it belongs here. The patch I proposed is similar to berthard's, popping items off the end of 'args_counts'. I intend to check whether the logic is equivalent.

    ----------------------------
    copy from http://bugs.python.org/msg187051
    ----------------------------
    This patch permits the mixing of optionals with positionals, with the caveat that a particular positional cannot be split up.

    If:

        parser = ArgumentParser()
        parser.add_argument('-f','--foo')
        parser.add_argument('cmd')
        parser.add_argument('rest', nargs='*')
    '-f1 cmd 1 2 3', 
    'cmd -f1 1 2 3', 
    'cmd 1 2 3 -f1' 
    

    all give {cmd='cmd', rest=['1','2','3'], foo='1'}.

    But 'cmd 1 -f1 2 3', does not recognize ['2','3'].

    Previously 'cmd -f1 1 2 3' would return rest=[], and not recognize ['1','2','3']. With this change the nargs='*' behaves more like nargs='+', surviving to parse the 2nd group of positional strings.

    The trick is to modify arg_counts in consume_positionals(), removing matches that don't do anything (don't consume argument strings).

        if 'O' in arg_strings_pattern[start_index:]:
            # if there is an optional after this, remove
            # 'empty' positionals from the current match
            while len(arg_counts)>1 and arg_counts[-1]==0:
                arg_counts = arg_counts[:-1]

    This change passes all of the existing test_argparse.py tests. It also passes the optparse tests that I added in http://bugs.python.org/issue9334#msg184987
    I added 4 cases to illustrate this change.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Aug 28, 2013

    These three changes end up testing for the same thing. The initial 'if' catches different test cases. 'subparsers' or 'remainder' might 'confuse' the 'O' test. The zero-width test ends up weeding out everything but the test cases added for this issue.

        # if we haven't hit the end of the command line strings,
        if start_index + sum(arg_counts) != len(arg_strings_pattern):
            while arg_counts and arg_counts[-1] == 0: 
                arg_counts.pop()
    
        # same test using selected_pattern (= arg_strings_pattern[start_index:])
        if len(selected_pattern) != sum(arg_counts):
            while arg_counts and arg_counts[-1] == 0: 
                arg_counts.pop()
    
        # alt test: test for optional in the remaining pattern
        if 'O' in selected_pattern:
            while arg_counts and arg_counts[-1] == 0: 
                arg_counts.pop()

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Jul 1, 2014

    I believe http://bugs.python.org/issue14174 with REMAINDER has its roots in the same issue - parse_args tries to process as many positionals as it can at a time, regardless of what's left in the argument strings.

    The fix proposed here depends on the 2nd argument taking 0 strings. REMAINDER, on the other hand, grabs everything that's left, leaving none for the optionals.

    @leonardgerard
    Copy link
    Mannequin

    leonardgerard mannequin commented Sep 3, 2014

    In my opinion this is a bug or it should be explicitly stated in the generated usage help string.

    @leonardgerard
    Copy link
    Mannequin

    leonardgerard mannequin commented Sep 3, 2014

    It seems that delaying positional argument parsing after all optional arguments are parsed would be clean and robust.

    My understanding is that optional arguments aren't ambiguous and should be processed first and removed from the arguments. Then the current pattern matching done for positional arguments would work well (in one try).

    If you think this would be a better patch I can give it a try.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Sep 3, 2014

    http://bugs.python.org/issue14191
    'argparse doesn't allow optionals within positionals'

    implements a 'parse_intermixed_args()' method, which parses all of the optionals with one pass, followed by a second pass that handles the positionals. It does this by temporarily deactivating the positionals for the first pass. It emulates the optparse behavior (with the added ability to parse positionals).

    This is too big of a change to ever become the default behavior for argparse.

    @vadmium
    Copy link
    Member

    vadmium commented May 18, 2015

    Closed bpo-24223 as a duplicate. I understand the patch here also fixes the case of an --option before an optional positional argument using nargs="?"; is that right?

    @vadmium
    Copy link
    Member

    vadmium commented Apr 4, 2016

    Playing with Steven and Paul’s patches, they both seem to work well. Paul’s seems to have debug printing included, which should be removed. I confirmed both patches also seem to address the nargs="?" case (bpo-24223).

    At the moment I have zero knowledge of how the argparse internals work. Are there any advantages of one patch over the other?

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    Status: Bugs
    Development

    No branches or pull requests

    2 participants