classification
Title: argparse handling multiple "--" in args improperly
Type: behavior Stage: resolved
Components: Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Warren.Turkal, bethard, eric.smith, gcbirzan, jeffknupp, kalt, maker, paul.j3, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2012-02-01 21:42 by Warren.Turkal, last changed 2014-07-07 17:16 by paul.j3. This issue is now closed.

Files
File name Uploaded Description Edit
test.py Warren.Turkal, 2012-02-05 00:42 minimal test script
argparse.patch jeffknupp, 2012-03-19 04:52 Fix to argparse '--' handling review
dbldash.patch paul.j3, 2013-04-14 06:48 review
Messages (27)
msg152443 - (view) Author: Warren Turkal (Warren.Turkal) Date: 2012-02-01 21:42
I have a program that runs something like the following:
$ hack run -- :target --arg1 --arg2 arg3 arg4

This basically runs a program identified by :target with the args. However, I cannot pass "--" to the program. For example, if I type:
$ hack run -- :hack run -- :target clean --help

the second "--" is swallowed by the parser, and I get an the help for "hack run" instead of instead of "hack clean". The run subcommand just does the following:

all_args = [target.bin_path] + args.args
os.execv(target.bin_path, all_args)

However, the first hack run has the following list for args:
args = Namespace(args=['run', ':hack', 'clean', '--help'], func=<function do_run at 0x19c3e60>, target=':hack')

Where is the second "--"? I would have expected the args list to be:
args=['run', '--', ':hack', 'clean', '--help']

About the python version, I am using python 2.6. However, I am using the latest release of argparse from [1] and am assuming that it's very similar code.

[1]http://code.google.com/p/argparse/downloads/list
msg152644 - (view) Author: Warren Turkal (Warren.Turkal) Date: 2012-02-04 21:24
I wanted to include a minimal example script, so here it is. If you run this like so:
$ ./test.py -- blah -- blah2

you will get the the following output:
Namespace(args=['blah2'], target=['blah'])

I would have expected the following instead:
Namespace(args=['--', 'blah2'], target=['blah'])
msg152650 - (view) Author: Warren Turkal (Warren.Turkal) Date: 2012-02-05 00:42
It doesn't look like that file got included last time, so here's a second try.
msg153045 - (view) Author: Christophe Kalt (kalt) Date: 2012-02-10 13:52
Hah.. was just about to report this.  I'm in the midst of converting a bunch of scripts from optparse to argparse, and this is one of the problems I'm hitting.
msg153286 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-02-13 19:01
See nargs=argparse.REMAINDER for an approach that doesn't require that first '--':

http://docs.python.org/library/argparse.html#nargs

But yeah, removing more than one '--' is probably a bug. Fixing it would be a little backwards incompatible though. Is it possible that anyone actually depends on the behavior of consuming multiple '--' arguments?
msg153463 - (view) Author: Warren Turkal (Warren.Turkal) Date: 2012-02-16 07:27
Using argparse.REMAINDER will not help for my problem in my real program as the first arg needs to be handled by my program and mapped into the real binary name.

If I recall correctly from memory, the following is what happened when I tried using argparse.REMAINDER. If call my program like so:
$ hack run :target --help

The subcommand will be "run" in this case. Because :target is handled by argparse, the "--help" will not be seen as part of the remainder of the arguments, and I will get the help for the "hack run" subcommand instead of the target binary getting the --help argument. I have pushed most of the program to [1] if you want to take a look. Specifically, see cli/commands/run.py:do_run for that bit of code that handles the run subcommand.

[1]https://github.com/wt/repo-digg-dev-hackbuilder
msg156255 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-18 14:42
I think it is unlikely that anyone depends on argparse consuming multiple -- strings.  If you are worried about it we could restrict the change to 3.3.  But personally I think this would be OK for a bug fix.
msg156257 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-18 15:54
Ok, I agree - I'm fine with it as a bugfix. Depending on the removal of extra -- strings would be pretty crazy anyway. ;-)
msg156259 - (view) Author: Jeff Knupp (jeffknupp) * Date: 2012-03-18 17:11
I don't know that this is a bug. Rather, the string '--' means different things to argparse and optparse. In argparse, '--' is a psuedo-argument taken to mean "everything after this is a postional argument" and not "stop processing arguments", which is the optparse meaning. In that context it doesn't seem like removing additional '--' is a bug in argparse, since additional '--' would merely be restating the same thing.
msg156260 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-18 17:31
No, it is definitely a bug.  It prevents implementing parsers that pass strings on to another sub-parser or command.  Imagine, for example, implementing a script that takes some arguments, but takes the entire rest of the command string and passes it to some unix program that it calls...and that unix program will recognize '--' in that string as the end of processing *its* options, and the user of the python command may well want to do that.  Unless this bug is fixed, it would be impossible to use argparse to implement such a reasonable scenario.
msg156261 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-03-18 17:32
I agree with David. It's a bug. I have programs (not using argparse yet) that do exactly what he describes.
msg156262 - (view) Author: Jeff Knupp (jeffknupp) * Date: 2012-03-18 18:14
In that case, wouldn't you use 'parse_known_args' instead of 'parse_args'
and pass the remaining arguments to the next script? This case is
explicitly mentioned in the argparse documentation. Again it seems to me
that the meaning of '--' has changed slightly between optparse and
argparse. Whether or not that was correct or intended is perhaps another
isssue.

On Sun, Mar 18, 2012 at 1:32 PM, Eric V. Smith <report@bugs.python.org>wrote:

>
> Eric V. Smith <eric@trueblade.com> added the comment:
>
> I agree with David. It's a bug. I have programs (not using argparse yet)
> that do exactly what he describes.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue13922>
> _______________________________________
>
msg156263 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-03-18 18:30
No. parse_known_args assumes you have known and unknown args intermixed. 

Going back to the original example, what if "hack" and ":target" had overlapping parameter names (say, they both supported "--arg1")? I think parse_known_args would pick up all instances of "--arg1". I really want "--" to mean "treat everything else as non-optional arguments".

Whether that's the original intent, I don't know. But as David says, unless it is, it's impossible to implement such a scheme with argparse.
msg156276 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-18 21:41
> It prevents implementing parsers that pass strings on to another 
> sub-parser or command.
...
> wouldn't you use 'parse_known_args' instead of 'parse_args'
> and pass the remaining arguments to the next script

I'll just say again that the recommended way of passing arguments to another command is via nargs=argparse.REMAINDER. (Though you may still need "--" if the first argument in the remainder is a flag shared by your parser, as Warren Turka pointed out.)

> I really want "--" to mean "treat everything else as non-optional 
> arguments"

Yep, and that's what it's intended to mean, and what the documentation says:

"you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument"
http://docs.python.org/library/argparse.html#arguments-containing

It's therefore a bug if there's a '--' after the first '--', but it's not being treated as a positional argument.
msg156315 - (view) Author: Jeff Knupp (jeffknupp) * Date: 2012-03-19 04:52
Added patch so that only the first '--' is removed by an argparse.PARSE or argparse.REMAINDER argument. Note that, as Steven said, argparse.REMAINDER should be used in the OP's issue (and the added test makes sure all remaining arguments are preserved even if they appear in the parent parser).
msg166054 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-07-21 19:18
This patch looks like the right fix to me. The tests look good too.
msg166100 - (view) Author: Roundup Robot (python-dev) Date: 2012-07-22 02:58
New changeset 18b114be013e by R David Murray in branch '3.2':
#13922: argparse no longer incorrectly strips '--' after the first one.
http://hg.python.org/cpython/rev/18b114be013e

New changeset bd2c167dfabc by R David Murray in branch '2.7':
#13922: argparse no longer incorrectly strips '--' after the first one.
http://hg.python.org/cpython/rev/bd2c167dfabc

New changeset a636f365d815 by R David Murray in branch 'default':
Merge #13922: argparse no longer incorrectly strips '--' after the first one.
http://hg.python.org/cpython/rev/a636f365d815
msg166101 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-07-22 03:00
Committed.  Thanks, Jeff.  By the way, although this patch isn't big enough to require it, it would be great if you would submit a contributor agreement: http://www.python.org/psf/contrib.
msg166103 - (view) Author: Warren Turkal (Warren.Turkal) Date: 2012-07-22 04:30
Thanks for fixing this issue. You guys are great!

wt

On Sat, Jul 21, 2012 at 8:00 PM, R. David Murray <report@bugs.python.org>wrote:

>
> R. David Murray <rdmurray@bitdance.com> added the comment:
>
> Committed.  Thanks, Jeff.  By the way, although this patch isn't big
> enough to require it, it would be great if you would submit a contributor
> agreement: http://www.python.org/psf/contrib.
>
> ----------
> resolution:  -> fixed
> stage:  -> committed/rejected
> status: open -> closed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue13922>
> _______________________________________
>
msg168754 - (view) Author: George-Cristian Bîrzan (gcbirzan) Date: 2012-08-21 11:26
This patch introduced a regression. Before, parse_args would take a tuple as an argument, and in _get_values it was converted to a list via list comprehension, which meant it was working with tuples too. In the current version, that raises an AttributeError, since tuples do not have .remove().

The simplest solution would be to convert arg_strings to a list before calling .remove(). The downside is that it will break silently when you pass it a string, whereas now you get a confusing error message (but, this is the same behavior as before this fix)
msg168758 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-08-21 11:54
@gcbirzan: Could you please open up a new issue? The current issue is fixed - it's just that the fix caused a new issue.

I would say that the `args` parameter was never intended to be anything but a list, so currently there's a documentation bug since that isn't stated explicitly anywhere. (But certainly the tests only test for lists, not any other sequences.)

Adding support for stuff other than lists feels more like a feature request, but given that it apparently worked by accident before, I guess we need to treat it as a bug. Probably the fix for that bug needs to have a ton of tests that check whether tuples work in all the various ways that parse_args is used.
msg170054 - (view) Author: Roundup Robot (python-dev) Date: 2012-09-08 16:35
New changeset 5d8454fcc629 by R David Murray in branch '3.2':
#15847: allow args to be a tuple in parse_args
http://hg.python.org/cpython/rev/5d8454fcc629

New changeset 76655483c5c8 by R David Murray in branch 'default':
merge #15847: allow args to be a tuple in parse_args
http://hg.python.org/cpython/rev/76655483c5c8

New changeset a2147bbf7868 by R David Murray in branch '2.7':
#15847: allow args to be a tuple in parse_args
http://hg.python.org/cpython/rev/a2147bbf7868
msg186092 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-05 16:40
There are several problems with the patch provided in msg156315

This description:

"Added patch so that only the first '--' is removed by an argparse.PARSE or argparse.REMAINDER argument."

should read

"Added patch so that only the first '--' is removed by arguments that are not argparse.PARSER or argparse.REMAINDER ."

The test that adds a third subparser with a nargs='...' argument does not test this change.   It exercises both nargs types that are not affected by the change in argparse.py.  As such older versions of argparse pass this test.

I thinking argparse.py change is correct (delete only the 1st '--'), though I'm not sure it addresses the concern of the original poster.
msg186110 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-05 21:59
There's another 'feature' to the patch proposed here.  It only deletes the first '--' in the list of strings passed to '_get_values' for a particular action.

    parser = argparse.ArgumentParser()
    parser.add_argument('foo')
    parser.add_argument('bar', nargs='*')
    print(parser.parse_args('-- 1 -- 2 3 4'.split(' ')))
    # Namespace(bar=['2', '3', '4'], foo='1')

'_get_values' first gets ('foo',['--','1']), then ('bar',['--','2','3','4'])

    print(parser.parse_args('-- -- 1 -- 2 -- 3 4'.split(' ')))
    # with this '1st only' change
    # Namespace(bar=['1', '2', '--', '3', '4'], foo='--')
    # without it, deleting all '--'; note foo is empty
    # Namespace(bar=['1', '2', '3', '4'], foo=[])

And to confuse things a bit more:

    print(parser.parse_args('1 -- 2 3 4'.split(' ')))
    # Namespace(bar=['2', '3', '4'], foo='1')

passes ['1','--'] with 'foo'

If 'bar' nargs='...', bar gets all of the '--' (with or without this patch).

The handling of '--' is complex because it is used in one place to mean, 'everything else is an argument', effectively adding '-AA...' to the arg_strings_pattern. It also matches with the nargs_pattern (e.g. '(-*A-*)...').  And then it may or may not be removed in _get_values().
msg186119 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-06 06:44
I am working on an alternative solution that moves the '--' removal to the consume_positionals() method, and only does it if there is a corresponding '-' in the arg_strings_pattern.
msg186884 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-14 06:48
This patch removes only one '--', the one that put a '-' in the 'arg_strings_pattern'.  It does this in 'consume_positionals' right before calling 'take_action'.  As before it does not do this if nargs is PARSER or REMAINDER.

test_argparse.py has two DoubleDashRemoval cases, that attempt to highlight the changes from production (delete all --) and development (delete first -- in each positional group) versions.

I have not made any changes to the documentation.  All it says now is:

"If you have positional arguments that must begin with - and don’t look like negative numbers, you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument:"
msg222479 - (view) Author: paul j3 (paul.j3) * Date: 2014-07-07 17:16
http://bugs.python.org/issue9571 proposes a refinement to this patch - drop the '--' from PARSER and REMAINDER nargs if it is the 1st string.
History
Date User Action Args
2014-07-07 17:16:45paul.j3setmessages: + msg222479
2013-04-19 07:38:17makersetnosy: + maker
2013-04-14 06:48:18paul.j3setfiles: + dbldash.patch

messages: + msg186884
2013-04-06 06:44:37paul.j3setmessages: + msg186119
2013-04-05 21:59:03paul.j3setmessages: + msg186110
2013-04-05 16:40:01paul.j3setnosy: + paul.j3
messages: + msg186092
2012-09-08 16:35:06python-devsetmessages: + msg170054
2012-08-21 11:54:57bethardsetmessages: + msg168758
2012-08-21 11:26:08gcbirzansetnosy: + gcbirzan
messages: + msg168754
2012-07-22 04:30:30Warren.Turkalsetmessages: + msg166103
2012-07-22 03:00:33r.david.murraysetstatus: open -> closed
resolution: fixed
messages: + msg166101

stage: resolved
2012-07-22 02:58:03python-devsetnosy: + python-dev
messages: + msg166100
2012-07-21 19:18:38bethardsetmessages: + msg166054
2012-03-19 04:52:43jeffknuppsetfiles: + argparse.patch
keywords: + patch
messages: + msg156315
2012-03-18 21:41:39bethardsetmessages: + msg156276
2012-03-18 18:30:44eric.smithsetmessages: + msg156263
2012-03-18 18:14:56jeffknuppsetmessages: + msg156262
2012-03-18 17:32:52eric.smithsetmessages: + msg156261
2012-03-18 17:31:10r.david.murraysetmessages: + msg156260
2012-03-18 17:11:51jeffknuppsetnosy: + jeffknupp
messages: + msg156259
2012-03-18 15:54:06bethardsetmessages: + msg156257
2012-03-18 14:42:51r.david.murraysetnosy: + r.david.murray

messages: + msg156255
versions: - Python 3.4
2012-02-16 07:27:44Warren.Turkalsetmessages: + msg153463
2012-02-13 19:01:12bethardsetmessages: + msg153286
2012-02-11 07:54:18Warren.Turkalsetversions: + Python 3.3, Python 3.4
2012-02-10 13:52:11kaltsetnosy: + kalt
messages: + msg153045
2012-02-05 00:42:02Warren.Turkalsetfiles: + test.py

messages: + msg152650
2012-02-04 21:24:52Warren.Turkalsetmessages: + msg152644
2012-02-02 13:06:41eric.smithsetnosy: + bethard, eric.smith
2012-02-01 21:42:36Warren.Turkalcreate