classification
Title: argparse does not accept options taking arguments beginning with dash (regression from optparse)
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Christophe.Guillon, abacabadabacaba, amcnabb, andersk, bethard, cben, danielsh, davidben, drm, eric.araujo, eric.smith, gdb, gfxmonk, martin.panter, memeplex, nelhage, paul.j3, r.david.murray, skilletaudio, spaceone
Priority: normal Keywords: patch

Created on 2010-07-22 22:15 by andersk, last changed 2016-06-15 10:04 by spaceone.

Files
File name Uploaded Description
final.patch paul.j3, 2013-03-22 17:16 patch for issue 9334
Messages (46)
msg111221 - (view) Author: Anders Kaseorg (andersk) Date: 2010-07-22 22:15
Porting the a2x program to argparse from the now-deprecated optparse subtly breaks it when certain options are passed:

$ a2x --asciidoc-opts --safe gitcli.txt
$ ./a2x.argparse --asciidoc-opts --safe gitcli.txt
usage: a2x [-h] [--version] [-a ATTRIBUTE] [--asciidoc-opts ASCIIDOC_OPTS]
           [--copy] [--conf-file CONF_FILE] [-D PATH] [-d DOCTYPE]
           [--epubcheck] [-f FORMAT] [--icons] [--icons-dir PATH] [-k]
           [--lynx] [-L] [-n] [-r PATH] [-s] [--stylesheet STYLESHEET]
           [--safe] [--dblatex-opts DBLATEX_OPTS] [--fop]
           [--fop-opts FOP_OPTS] [--xsltproc-opts XSLTPROC_OPTS] [-v]
a2x: error: argument --asciidoc-opts: expected one argument

Apparently argparse uses a heuristic to try to guess whether an argument looks like an argument or an option, going so far as to check whether it looks like a negative number (!).  It should _never_ guess: the option was specified to take an argument, so the following argument should always be parsed as an argument.

Small test case:

>>> import optparse
>>> parser = optparse.OptionParser(prog='a2x')
>>> parser.add_option('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
(<Values at 0x7f585142ef80: {'asciidoc_opts': '--safe'}>, [])

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument
msg111224 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-07-22 23:06
It seems like a reasonable request to me to be able to allow such arguments, especially since optparse did and we want people to be able to use argparse as a replacement. Though in general I find argparse's default behavior more useful.  Since argparse has been released, I'm thinking this still has to be a feature request, since argparse is *not* a drop-in replacement for optparse.
msg111227 - (view) Author: Nelson Elhage (nelhage) Date: 2010-07-22 23:40
For what it's worth, I have trouble seeing this as anything but a bug. I understand the motivation of trying to catch user errors, but in doing so, you're breaking with the behavior of every other option parsing library that I'm aware of, in favor of an arbitrary heuristic that sometimes guesses wrong. That's not the kind of behavior I expect from my Python libraries; I want them to do what I ask them to, not try to guess what I probably meant.
msg111228 - (view) Author: Anders Kaseorg (andersk) Date: 2010-07-22 23:43
> Though in general I find argparse's default behavior more useful.

I’m not sure I understand.  Why is it useful for an option parsing library to heuristically decide, by default, that I didn’t actually want to pass in the valid option that I passed in?  Shouldn’t that be up to the caller (or up to the program, if it explicitly decides to reject such arguments)?

Keep in mind that the caller might be another script instead of a user.
msg111230 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-07-23 00:51
Well, even if you call it a bug, it would be an argparse design bug, and design bug fixes are feature requests from a procedural point of view.
msg111279 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2010-07-23 11:21
Note that the negative number heuristic you're complaining about doesn't actually affect your code below. The negative number heuristic is only used when you have some options that look like negative numbers. See the docs for more information:

http://docs.python.org/library/argparse.html#arguments-containing

Your problem is that you want "--safe" to be treated as a positional argument even though you've declared it as an option. Basically there are two reasonable interpretations of this situation. Consider something like "--conf-file --safe". Either the user wants a conf file named "--safe", or the user accidentally forgot to type the name of the conf file. Argparse assumes the latter, though either one is conceivable. Argparse assumes the latter because, while it occasionally throws an unnecessary exception, the other behavior would allow an error to pass silently.

I'm definitely opposed to changing the default behavior to swallow some errors silently. If you'd like to propose an API for enabling such behavior explicitly and supply a patch and tests implementing it, I'll be happy to review it though.
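
To make the negative-number rule from the linked docs concrete, here is a minimal sketch. The `try_parse` helper, the `demo` program name and the `-1` option are assumptions made up for the demonstration; they are not part of a2x or of the argparse API.

```python
import argparse
import contextlib
import io


def try_parse(parser, argv):
    """Return the parsed Namespace, or None if argparse rejects argv."""
    # argparse reports errors by writing usage to stderr and raising SystemExit.
    with contextlib.redirect_stderr(io.StringIO()):
        try:
            return parser.parse_args(argv)
        except SystemExit:
            return None


# No declared option looks like a negative number, so '-1' is read as a value.
plain = argparse.ArgumentParser(prog='demo')
plain.add_argument('-x')
accepted = try_parse(plain, ['-x', '-1'])

# Once an option such as '-1' is declared (hypothetical), '-1' is read as that
# option, which leaves '-x' without its argument.
numeric = argparse.ArgumentParser(prog='demo')
numeric.add_argument('-x')
numeric.add_argument('-1', dest='one')
rejected = try_parse(numeric, ['-x', '-1'])

print(accepted)
print(rejected)
```

The same string is accepted or rejected depending on what else the parser declares, which is the context-dependence under discussion here.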
msg111367 - (view) Author: Anders Kaseorg (andersk) Date: 2010-07-23 17:39
> Note that the negative number heuristic you're complaining about
> doesn't actually affect your code below.

Yes it does:

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '-1'])
Namespace(asciidoc_opts='-1')
>>> parser.parse_args(['--asciidoc-opts', '-one'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument

> Your problem is that you want "--safe" to be treated as a positional
> argument even though you've declared it as an option.

No, it doesn’t matter whether --safe was declared as an option: argparse rejected it on the basis of beginning with a dash (as I demonstrated in my small test case, which did not declare --safe as an option, and again in the example above with -one).

> Either the user wants a conf file named "--safe", or the user
> accidentally forgot to type the name of the conf file.

But it’s not argparse’s job to decide that the valid option I passed was actually a typo for something invalid.  This would be like Python rejecting the valid call
  shell = "bash"
  p = subprocess.Popen(shell)
just because shell happens to also be a valid keyword argument for the Popen constructor and I might have forgotten to specify its value.

Including these special heuristics by default, that (1) are different from the standard behavior of all other option parsing libraries and (2) interfere with the ability to pass certain valid options, only leads to strange inconsistencies between command line programs written in different languages, and ultimately makes the command line harder to use for everyone.  The default behavior should be the standard one.
msg111669 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2010-07-26 21:53
I still disagree. You're giving the parser ambiguous input. If a parser sees "--foo --bar", and "--foo" is a valid option, but "--bar" is not, this is a legitimately ambiguous situation. Either the user really wanted "--bar", and the parser doesn't support it, or the "--bar" was meant to be the argument to the "--foo" flag. At this point, the parser must make an arbitrary decision, and argparse chooses the interpretation that the user wanted the "--bar" flag.

I understand that you have a good use case for the other interpretation. That's why I suggest you come up with a patch that allows this other interpretation to be enabled when necessary. Changing the default behavior is really a non-starter unless you can propose a sensible transition strategy (as is always necessary for changing APIs in backwards incompatible ways).
msg111670 - (view) Author: Anders Kaseorg (andersk) Date: 2010-07-26 22:43
> I still disagree. You're giving the parser ambiguous input. If a
> parser sees "--foo --bar", and "--foo" is a valid option, but "--bar"
> is not, this is a legitimately ambiguous situation.

There is no ambiguity.  According to the way that every standard option parsing library has worked for decades, the parser knows that --foo takes an argument, so the string after --foo is in a different grammatical context than options are, and is automatically interpreted as an argument to --foo.  (It doesn’t matter whether that string begins with a dash, is a valid argument, might become a valid argument in some future version, looks like a negative number, or any other such condition.)

  arguments = *(positional-argument / option) [-- *(positional-argument)]
  positional-argument = string
  option = foo-option / bar-option
  foo-option = "--foo" string
  bar-option = "--bar"

This is just like how variable names in Python are in a different grammatical position than keyword argument names, so that Popen(shell) is not confused with Popen(shell=True).  This is not ambiguity; it simply follows from the standard definition of the grammar.

argparse’s alternative interpretation of that string as another option does not make sense because it violates the requirement that --foo has been defined to take an argument.

The only justification for considering that input ambiguous is if you start assuming that argparse knows better than the user (“the user accidentally forgot to type the name of the conf file”) and try to guess what they meant.  This violates the user’s expectations of how the command line should work.  It also creates subtle bugs in scripts that call argparse-based programs (think about call(["program", "--foo", foo_argument]) where foo_argument comes from some complex computation or even untrusted network input).

> Changing the default behavior is really a non-starter unless you can
> propose a sensible transition strategy (as is always necessary for
> changing APIs in backwards incompatible ways).

This would not be a backwards incompatible change, since every option that previously parsed successfully would also parse in the same way after the fix.
msg111673 - (view) Author: Anders Kaseorg (andersk) Date: 2010-07-26 23:17
>   arguments = *(positional-argument / option) [-- *(positional-argument)]
>   positional-argument = string
>   option = foo-option / bar-option
>   foo-option = "--foo" string
>   bar-option = "--bar"

Er, obviously positional arguments before the first ‘--’ can’t begin with a dash (I don’t think there’s any confusion over how those should work).
  arguments = *(non-dash-positional-argument / option) ["--" *(positional-argument)]
  non-dash-positional-argument = <string not beginning with "-">
  positional-argument = string

The point was just that the grammar unambiguously allows the argument of --foo to be any string.
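
The grammar above can be sketched as a small hand-written parser. This is only an illustration of the rule being argued for (an option declared to take a value always consumes the next string); the `parse` function and its parameters are hypothetical and are not how optparse or argparse are implemented.

```python
def parse(argv, value_options=("--foo",), flag_options=("--bar",)):
    """Parse argv following the grammar quoted above.

    An option listed in value_options always takes the next string as its
    argument, whatever that string looks like.
    """
    options = {}
    positionals = []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token == "--":
            # Everything after the first "--" is positional.
            positionals.extend(argv[i + 1:])
            break
        if token in value_options:
            if i + 1 >= len(argv):
                raise ValueError(f"{token}: expected one argument")
            options[token] = argv[i + 1]
            i += 2
        elif token in flag_options:
            options[token] = True
            i += 1
        elif token.startswith("-"):
            # Before "--", a positional argument may not begin with a dash.
            raise ValueError(f"unrecognized option: {token}")
        else:
            positionals.append(token)
            i += 1
    return options, positionals


print(parse(["--foo", "--bar"]))
print(parse(["--foo", "x", "--bar", "file"]))
```

With this rule, `--foo --bar` yields `--bar` as the value of `--foo`, and a trailing `--foo` with nothing after it is still reported as an error.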
msg111691 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2010-07-27 09:37
It *would* be a backwards incompatible change. Currently, if I have a parser with both a "--foo" and a "--bar" option, and my user types "--foo --bar", they get an error saying that they were missing the argument to "--foo". Under your proposal, the "--foo" option will now silently consume the "--bar" option without an error. I know this is good from your perspective, but it would definitely break some of my scripts, and I imagine it would break other people's scripts as well.

As I keep saying, I'm happy to add your alternative parsing as an option (assuming you provide a patch), but I really don't think it's the right thing to do by default. Most command line programs don't have options that take other option-like things as arguments (which is the source of your problem), so in most command line programs, people want an error when they get an option they don't recognize or an option that's missing its argument. Under your proposal, more such errors will pass silently and will have to be caught by additional code in the script.
msg128014 - (view) Author: Gerard van Helden (drm) Date: 2011-02-05 19:13
The reporter imho is 100% right. Simply because of the fact that in the current situation, there is no way to supply an argument starting with a dash (not even for instance a filename). That is, of course, total nonsense to be dictated by the parser library.
msg128025 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-05 20:54
While I also dislike the existing behavior, note that you can get what you want by using an equal sign.

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '-1'])
Namespace(asciidoc_opts='-1')
>>> parser.parse_args(['--asciidoc-opts=-one'])
Namespace(asciidoc_opts='-one')

I always use the equal sign, so I've never noticed this behavior before.

I wish that help would display the equal sign, but that's another issue.
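
For scripts that build a command line for an argparse-based program, the same workaround can be applied mechanically. A minimal sketch, assuming a long option that takes exactly one value; `option_with_value` is a hypothetical helper and the `source` positional is added only so the example has something after the option.

```python
import argparse


def option_with_value(flag, value):
    """Join a long option and its value so the value cannot be read as an option."""
    return f"{flag}={value}"


parser = argparse.ArgumentParser(prog='a2x')
parser.add_argument('--asciidoc-opts', dest='asciidoc_opts', default='')
parser.add_argument('source')

# The value '--safe' is attached with '=', so argparse does not classify it.
argv = [option_with_value('--asciidoc-opts', '--safe'), 'gitcli.txt']
args = parser.parse_args(argv)
print(args.asciidoc_opts)
print(args.source)
```

This only helps for options with a single value; it does not cover options declared with nargs of 2 or more.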
msg128047 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-02-06 09:47
Yeah, I agree it's not ideal, though note that basic unix commands have trouble with arguments starting with dashes:

$ cd -links-/
-bash: cd: -l: invalid option
cd: usage: cd [-L|-P] [dir]

If you're working with a file on a filesystem, the time honored workaround is to prefix with ./

$ cd ./-links-/
$

Anyway, it doesn't seem like anyone is offering to write up a patch to enable such an alternative parsing strategy. Perhaps Eric's "=" workaround should be documented prominently somewhere?
msg128055 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-06 12:18
Documenting “--extra-args=--foo” or “--extra-args -- --foo” (untested, but should work) seems good.
msg128062 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-06 15:56
"--" won't work. Traditionally, this has been used to separate optional arguments from positional arguments. Continuing the "cd" example, that's what would let you cd into a directory whose name starts with a hyphen:

$ cd -links-/
-bash: cd: -l: invalid option
cd: usage: cd [-L|-P] [dir]
$ cd -- -links-
$

This would also work with argparse:
import argparse
parser = argparse.ArgumentParser(prog='cd')
parser.add_argument('-L', help='follow symbolic links')
parser.add_argument('-P', help='do not follow symbolic links')
parser.add_argument('dir', help='directory name')
print(parser.parse_args(['--', '-Links-']))

prints:
Namespace(L=None, P=None, dir='-Links-')

Continuing the example from my earlier post shows it won't work for values for optional arguments:
>>> parser.parse_args(['--asciidoc-opts -- -one'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: unrecognized arguments: --asciidoc-opts -- -one

I believe it's only the '=' that will solve this problem. In fact, because of this issue, I suggest we document '=' as the preferred way to call argparse when optional arguments have values, and change all of the examples to use it. I also think it would cause less confusion (because of this issue) if the help output showed the equal sign. But I realize that's probably more controversial.
msg128071 - (view) Author: Anders Kaseorg (andersk) Date: 2011-02-06 19:12
There are some problems that ‘=’ can’t solve, such as options with nargs ≥ 2.  optparse has no trouble with this:

>>> parser = optparse.OptionParser()
>>> parser.add_option('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
(<Values at 0x7fc97a93a7e8: {'a': ('-first', '-second')}>, [])

But inputting those arguments is _not possible_ with argparse.

>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
usage: [-h] [-a A A]
: error: argument -a: expected 2 argument(s)
msg128072 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-06 19:33
Good point, I hadn't thought of that. Maybe ArgumentParser needs a "don't try to be so helpful, parse like optparse" option. Which is what Steven suggested earlier, I believe.

I'd take a crack at this if there's general consensus on that solution.

We can change the documentation to point out the issue now, but the feature request can only go in 3.3.
msg128076 - (view) Author: Anders Kaseorg (andersk) Date: 2011-02-06 19:53
That would be a good first step.

I continue to advocate making that mode the default, because it’s consistent with how every other command line program works[1], and backwards compatible with the current argparse behavior.

As far as documentation for older versions, would it be reasonable to un-deprecate optparse until argparse becomes a suitable replacement?  There are still lots of programmers working in Python 2.7.

[1] bethard’s msg128047 is confusing positional arguments with option arguments.  All UNIX commands that accept option arguments have no trouble accepting option arguments that begin with -.  For example, ‘grep -e -pattern file’ is commonly used to search for patterns beginning with -.
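
The traditional rule is easy to check with the standard library's own getopt module, which follows the C getopt conventions. A short sketch; the argument lists are just the examples from this thread.

```python
import getopt

# 'e:' declares -e as a short option that takes one argument.
short_opts, short_rest = getopt.getopt(['-e', '-pattern', 'file'], 'e:')
print(short_opts)
print(short_rest)

# 'asciidoc-opts=' declares a long option that takes one argument.
long_opts, long_rest = getopt.getopt(
    ['--asciidoc-opts', '--safe', 'gitcli.txt'], '', ['asciidoc-opts='])
print(long_opts)
print(long_rest)
```

In both calls the string after the option is taken as its argument even though it begins with a dash.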
msg128078 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-06 20:02
I'd also like to see this as the default. After all, presumably we'd like Python scripts to work like all other command line programs, and I too am unaware of any other option parsing library that works the way argparse does.

But changing released behavior in the stdlib is problematic, as everyone knows.

I'll look into producing a patch to add this as optional behavior, then we can separately think about changing the default.
msg128090 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-02-06 22:01
I don't think there's any sense in "un-deprecating" optparse because:

(1) It's only deprecated in the documentation - there is absolutely nothing in the code to keep you from continuing to use it, and there are no plans to remove it from Python.

(2) One (mis?)feature doesn't make the rest of the module useless.

And yes Eric, it would be awesome if you could develop a patch that allows the alternate parsing to be enabled when someone wants it. We should think about deprecation strategy though. Maybe something like:

== Python 3.3 ==
# Python 3.2 behavior
parser = ArgumentParser(error_on_unknown_options=True)
# proposed behavior
parser = ArgumentParser(error_on_unknown_options=False)
# deprecation warning when not specified
parser = ArgumentParser()

== Python 2.4 ==
# error warning when not specified
parser = ArgumentParser()

== Python 2.5 ==
# defaults to error_on_unknown_options=False
parser = ArgumentParser()

I'm not sure that's the right way to do it, but if the plan is to change the default at some point, we should make sure that we have a deprecation plan before we add the feature.
msg128091 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-06 22:34
s/2.4/3.4/
s/2.5/3.5/
obviously :)
msg128094 - (view) Author: Anders Kaseorg (andersk) Date: 2011-02-07 02:08
> (1) It's only deprecated in the documentation

Which is why I suggested un-deprecating it in the documentation.  (I want to avoid encouraging programmers to switch away from optparse until this bug is fixed.)

> # proposed behavior
> parser = ArgumentParser(error_on_unknown_options=False)

Perhaps you weren’t literally proposing “error_on_unknown_options=False” as the name of the new flag, but note that neither the current nor the proposed behavior has anything to do with whether arguments look like known or unknown options.  Under the proposed behavior, anything in argument position (--asciidoc-opts ___) is parsed as an argument, no matter what it looks like.

So a more accurate name might be “refuse_dashed_args=False”, or more generally (in case prefix_chars != '-'), “refuse_prefixed_args=False”?
msg128104 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-02-07 07:58
@Éric: yes, thanks!

@Anders: The reason the current implementation gives you the behavior you don't want is that the first thing it does is scan the args list for things that look like flags (based on prefix_chars). It assumes that everything that looks like a flag is intended to be one, before it ever looks at how many arguments the flag before it takes or anything like that. This is the source of your problem - argparse assumes "--safe" is a flag, and as a result, there is no argument for "--asciidoc-opts". So perhaps a better name would be something like dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one. ;-)
msg128134 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-07 16:34
Steven: Yes, the current structure of the first pass scan makes any patch problematic. It really would be an implementation of a different algorithm.

I'm still interested in looking at it, though.
msg128179 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-08 14:01
Without guessing which args are options, I don't see how it's possible to implement parse_known_args().

I'd propose raising an exception if it's called and dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one (or whatever it ends up being called) is True.
msg128266 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-02-10 07:11
Maybe dont_assume_everything_that_looks_like_a_flag_is_intended_to_be_one should actually be a new class, e.g.

    parser = AllowFlagsAsPositionalArgumentsArgumentParser()

Then you just wouldn't provide parse_known_args on that parser.
msg128728 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2011-02-17 15:52
[I doubt my terminology is exactly correct in this post, but I've tried my best to make it so.]

The more I think about this the more I realize we can't implement a parser that doesn't make guesses about '-' prefixed args and that works with argparse's existing behavior with respect to optional arguments.

For example:
parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs='?')
parser.add_argument('--bar', nargs='?')
print(parser.parse_args(['--foo', '--bar', 'a']))
print(parser.parse_args(['--foo', 'x', '--bar', 'a']))

Unless the parser tries to guess that --bar is an optional argument by itself, it can't know that --foo has an argument or not.

I guess it could look and say that if you called this with '--foo --baz', then '--baz' must be an argument for '--foo', but then you could never have an argument to '--foo' named '--bar', plus it all seems fragile.

Maybe this new parser (as Steven described it) wouldn't allow a variable number of arguments to optional arguments? That is, nargs couldn't be '?', '*', or '+', only a number.
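
For reference, this is how the current parser resolves the two calls in the example above. A minimal sketch that only restates the example with the results captured in variables (`first` and `second` are names introduced here).

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--foo', nargs='?')
parser.add_argument('--bar', nargs='?')

# '--bar' is recognized as an option, so '--foo' is left without a value.
first = parser.parse_args(['--foo', '--bar', 'a'])
# 'x' does not look like an option, so it becomes the value of '--foo'.
second = parser.parse_args(['--foo', 'x', '--bar', 'a'])

print(first)
print(second)
```

A parser that never guessed would have to treat '--bar' in the first call as the value of '--foo', which is why optional values and "never guess" do not combine well.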
msg132220 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-03-26 09:53
Thanks for the analysis Eric. Yeah, it does seem like it's not possible to implement this feature request while still supporting optionals with variable number arguments.

@andersk: Would the restriction to only having flags with a fixed number of arguments be acceptable for your use case?
msg132260 - (view) Author: Anders Kaseorg (andersk) Date: 2011-03-26 18:11
> @andersk: Would the restriction to only having flags with a fixed
> number of arguments be acceptable for your use case?

I think that’s fine.  Anyone coming from optparse won’t need options with optional arguments.

However, FWIW, GNU getopt_long() supports options with an optional argument under the restrictions that:
 • the option must be a long option,
 • the optional argument must be the only argument for the option, and
 • the argument, if present, must be supplied using the
   ‘--option=argument’ form, not the ‘--option argument’ form.
This avoids all parsing ambiguity.  It would be useful to have feature parity with getopt_long(), to facilitate writing Python wrapper scripts for C programs.
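
Something close to the getopt_long() convention can already be approximated in argparse with nargs='?' and const, as long as callers stick to the '=' form. A rough sketch; the `--color` option and its values are hypothetical. Unlike getopt_long(), argparse will also accept the separated '--color value' form, so the restriction here is a calling convention rather than something the parser enforces.

```python
import argparse

parser = argparse.ArgumentParser(prog='demo')
# const is used when the option appears without a value,
# default when the option is absent altogether.
parser.add_argument('--color', nargs='?', const='always', default='auto')

absent = parser.parse_args([])
bare = parser.parse_args(['--color'])
# With '=', the value may begin with a dash.
attached = parser.parse_args(['--color=-never'])

print(absent.color, bare.color, attached.color)
```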
msg150310 - (view) Author: James B (skilletaudio) Date: 2011-12-28 18:15
I have encountered this issue (Python 2.7) with respect to positional arguments that begin with a dash (Linux/bash).

In the following example, the parser requires three positional arguments. I attempted to encase the arguments in single-quotes as that is expected in general to result in strings to be correctly handled (these args are API keys, so they could contain shell-unfriendly chars like - and &).

./tool.py  arg1 'arg2' '-arg3&otherstuff' 

You'll note there are no optional arguments in this example, it just boils down to a positional argument being broken up on parse.

Needless to say it was quite confusing to see the script complain after passing in what would typically be perfectly valid strings in most other apps / scripts.

Is it possible to get argparse to correctly notice and handle shell-appropriate single-quoting methods (don't break down a string that has been implied as a complete token via ' ')?

As it stands, it appears I have two workaround options: 1) adopt the ./tool.py -- <positional args> convention mentioned in this thread, or 2) escape leading dashes in positional argument strings to avoid this issue.
msg150320 - (view) Author: Anders Kaseorg (andersk) Date: 2011-12-28 22:20
James: That’s not related to this issue.  This issue is about options taking arguments beginning with dash (such as a2x --asciidoc-opts --safe, where --safe is the argument to --asciidoc-opts), not positional arguments beginning with dash.

Your observation isn’t a bug.  In all getopt-like parsers, -- is the only way to pass positional arguments beginning with -.  (Whether you shell-quoted the argument is irrelevant; the - is interpreted by the program, not the shell, after the shell has already stripped off the shell quoting.)

If your program doesn’t take any options and you’d like to parse positional arguments without requiring --, don’t use a getopt-like parser; use sys.argv directly.

If you still think your example is a bug, please file a separate report.
msg169712 - (view) Author: Christophe Guillon (Christophe.Guillon) Date: 2012-09-02 17:58
As a workaround for this missing feature,
the negative number matching regexp can be used for allowing arguments starting with '-' in arguments of option flags.

We basically do:
parser = argparse.ArgumentParser(...)
parser._negative_number_matcher = re.compile(r'^-.+$')

This allows cases such as @andersk's:
$ a2x --asciidoc-opts --safe gitcli.txt
where '--safe' is an argument to '--asciidoc-opts'

As this behavioral change is quite simple, couldn't the requested feature be implemented like this with an optional setting to the ArgumentParser constructor?
msg169978 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-09-07 07:24
Interesting idea! The regex would need a little extra care to interoperate properly with prefix_chars, but the approach doesn't seem crazy. I'd probably call the constructor option something like "args_default_to_positional" (the current behavior is essentially that args default to optional arguments if they look like optionals).

I'd be happy to review a patch along these lines. It would probably be good if Anders Kaseorg could also review it to make sure it fully solves his problem.
msg184174 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-14 17:09
If nargs=2, type=float, an argv like '1e4 -.002' works, but '1e4 -2e-3' produces the same error as discussed here.  The problem is that _negative_number_matcher does not handle scientific notation.  The proposed generalized matcher, r'^-.+$', would solve this, but may be overkill.

I'm not as familiar with optparse and other argument processes, but I suspect argparse is different in that it processes the argument strings twice.  On one loop it parses them, producing an arg_strings_pattern that looks like 'OAA' (or 'OAO' in these problem cases).  On the second loop it consumes the strings (optionals and positionals).  This gives it more power, but produces problems like this if the parsing does not match expectations.
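
The first of those two loops can be modelled roughly as follows. This is a simplified sketch, not argparse's actual code: `classify` is a hypothetical function, it ignores which options the parser has declared, and the regular expression is the negative-number pattern as I recall it from the argparse source of that period, so treat it as an assumption.

```python
import re

# Simplified stand-in for argparse's negative-number test (assumed pattern).
NEGATIVE_NUMBER = re.compile(r'^-\d+$|^-\d*\.\d+$')


def classify(argv):
    """Return a pattern string: 'O' for option-like strings, 'A' for arguments."""
    pattern = []
    only_positionals = False
    for token in argv:
        if only_positionals:
            # After '--' every string is an argument.
            pattern.append('A')
        elif token == '--':
            pattern.append('-')
            only_positionals = True
        elif (token.startswith('-') and len(token) > 1
              and not NEGATIVE_NUMBER.match(token)):
            pattern.append('O')
        else:
            pattern.append('A')
    return ''.join(pattern)


print(classify(['--asciidoc-opts', '--safe', 'gitcli.txt']))
print(classify(['-a', '1e4', '-.002']))
print(classify(['-a', '1e4', '-2e-3']))
```

Because the classification happens before any option gets a chance to claim its values, a string like '-2e-3' is already marked 'O' by the time the second loop looks for the arguments of '-a'.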
msg184177 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2013-03-14 17:46
The way argparse currently parses option arguments is broken. If a long option requires an argument and its value isn't specified together with the option (using --option=value syntax), then the following argument should be interpreted as that value, no matter what it looks like. There should be no guesses or heuristics here. That the behavior depends on whether some argument "looks like" a negative number is the most horrible part. Argument parsing should follow simple, deterministic rules, preferably the same ones used by standard getopt(3).
msg184178 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2013-03-14 17:59
Evgeny: I completely agree. It's unfortunate that argparse doesn't work that way.

However, I think it's too late to change this behavior without adding a new parser. I don't think existing argparse can be changed to not operate the way it does, due to backward compatibility concerns. The discussion in this issue describes those compatibility concerns.
msg184180 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-14 19:38
We need to be careful about when or where _negative_number_matcher is changed.

> We basically do:
> parser = argparse.ArgumentParser(...)
> parser._negative_number_matcher = re.compile(r'^-.+$')

This changes the value for the parser itself, but not for the groups (_optionals, _positionals) or any subparsers. The code takes special care to make sure that the related property: _has_negative_number_optionals is properly shared among all these ActionContainers.
msg184211 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-15 04:43
While

parser._negative_number_matcher

is used during parser.parse_args() to check whether an argument string is a 'negative number' (and hence whether to classify it as A or O),

parser._optionals._negative_number_matcher

is used during parser.add_argument() to determine whether an option_string is a 'negative number', and hence whether to modify the _has_negative_number_optionals flag.  If this matcher is the general r'^-.+$', adding the default '-h' will set this flag.  We don't want that.

Using a different matcher for these two containers might work, but is awfully kludgy.
msg184425 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-18 05:45
I think the `re.compile(r'^-.+$')` behavior could be better achieved by inserting a simple test in `_parse_optional` before the `_negative_number_matcher` test.

    # behave more like optparse even if the argument looks like an option
    if self.args_default_to_positional:
        return None

In effect, if the string does not match an action string, say it is a positional.

Making this patch to argparse.py is simple. How much to test it, and how to document it, requires more thought.
msg184987 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-22 17:16
This patch makes two changes to argparse.py ArgumentParser._parse_optional()

- accept negative scientific and complex numbers

- add the args_default_to_positional parser option

_negative_number_matcher only matches integers and simple floats.  This
is fine for detecting number-like options like '-1'.  But as used in
_parse_optional() it prevents strings like '-1e4' and '-1-4j' from being
classed as positionals (msg184174).  In this patch it is replaced with

    try:
        complex(arg_string)
        return None
    except ValueError:
        pass

Immediately before this number test I added

    if self.args_default_to_positional:
        return None

to implement the idea suggested in msg169978.
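The number test above can be tried in isolation to see which strings it would classify as positionals. This is a standalone sketch, not the patch itself; looks_like_number is a hypothetical helper name, and only the built-in complex() constructor is assumed.

```python
def looks_like_number(arg_string):
    """Return True if complex() accepts the string, mirroring the test above."""
    try:
        complex(arg_string)
        return True
    except ValueError:
        return False


samples = ['-1', '-1e4', '-1-4j', '-inf', '-j', '-x', '--safe']
results = {s: looks_like_number(s) for s in samples}
```

Note that complex() is fairly permissive: besides '-1e4' and '-1-4j' it also accepts strings such as '-inf' and '-j', so an unregistered '-j' would be treated as a number under this test. Whether that matters depends on the options a given program defines.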

I added the args_default_to_positional parser option to the documentation, along with some notes on its implications in the `Arguments containing -` section.  A few of the examples that I added use scientific or complex numbers.

I tested test_argparse.py with args_default_to_positional=True as the default.  A number of the 'failures' no longer failed.
class TestDefaultToPositionalWithOptionLike illustrates this in the
Option-Like situation.

The only 'successes' to fail were in the TestAddSubparsers case.  There
an argument string '0.5 -p 1 b -w 7' produced a 'wrong choice' error,
since the '-p' was assumed to be a command choice, rather than an unknown optional.

I translated the TestStandard cases from the optparse test file.  argparse ran most of these without problem.  The value of args_default_to_positional makes no difference.  There are a few optparse tests that use '--' or a valid optional as a positional, which argparse does not handle.
msg239435 - (view) Author: paul j3 (paul.j3) * Date: 2015-03-27 20:51
http://bugs.python.org/issue22672
float arguments in scientific notation not supported by argparse

is a newer complaint about the same issue.  I've closed it with a link to here.
msg251815 - (view) Author: Memeplex (memeplex) Date: 2015-09-29 03:28
What's missing for this patch to be applied? Can I help somehow?
msg251862 - (view) Author: Memeplex (memeplex) Date: 2015-09-29 14:41
Here is another manifestation of this problem: http://bugs.python.org/issue17050
msg263211 - (view) Author: Cherniavsky Beni (cben) * Date: 2016-04-11 22:03
+1, is there anything missing to apply Paul's patch?

Can I additionally suggest a change to the error message, e.g.:

  $ prog --foo -bar
  prog: error: argument --foo: expected one argument
  (tip: use --foo=-bar to force interpretation as argument of --foo)

This can be safely added in the current mode with no opt-in required, and will relieve the immediate "but what can I do?" confusions of users.  The workaround is hard to discover otherwise, as `--foo=x` is typically equivalent to `--foo x`.
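A minimal sketch of the workaround mentioned in the proposed tip, using only the public argparse API. The parser and the parse_or_none helper are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog='prog')
parser.add_argument('--foo')


def parse_or_none(argv):
    """Return the parsed namespace, or None if argparse exits with an error."""
    try:
        return parser.parse_args(argv)
    except SystemExit:
        return None


# The '=' form attaches the value to the option explicitly.
joined = parse_or_none(['--foo=-bar'])

# The two-word form is rejected: '-bar' is classified as an option.
separate = parse_or_none(['--foo', '-bar'])
```

Here joined.foo is '-bar', while separate is None because argparse reports "argument --foo: expected one argument".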

--- more discussion, though I suspect it's not productive ---

I've tried to find what the GNU Standards or POSIX say about this and was surprised to see neither explains how exactly `--opt_with_mandatory_argument -quux` behaves.

man getopt says:

     If such a character is followed by a colon, the option requires an argument, so getopt() places a pointer to the following text in the same argv-element, or the text of the following argv-element, in optarg. Two colons mean an option takes an optional arg; if there is text in the current argv-element (i.e., in the same word as the option name itself, for example, "-oarg"), then it is returned in optarg, otherwise optarg is set to zero. This is a GNU extension.

POSIX similarly does explain that an optional arg after an option must follow within the same argument:

    (2)(b) If the SYNOPSIS shows an optional option-argument (as with [ -f[ option_argument]] in the example), a conforming application shall place any option-argument for that option directly adjacent to the option in the same argument string, without intervening <blank> characters. If the utility receives an argument containing only the option, it shall behave as specified in its description for an omitted option-argument; it shall not treat the next argument (if any) as the option-argument for that option.

    -- http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html

Anyway, every argument parsing library I've ever seen parses options in a left-to-right pass, consuming non-optional arguments after an option whatever they look like.  I've never seen a difference between `--foo bar` and `--foo=bar` when bar is *non-optional*.

Both behaviors (--opt_with_mandatory_argument bar, --opt_with_optional_argument[=bar]) were clearly designed to avoid ambiguity.
argparse, by contrast, innovated some constructs, e.g. '--opt', nargs='*', that are inherently ambiguous.  But for the simple constructs, most notably nargs=1, there should be a way to get the traditional unix meaning.
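For comparison, the left-to-right behaviour described above can be seen in two other standard library parsers. This sketch uses Python's getopt and optparse modules; the option name and the argument list are made up for illustration.

```python
import getopt
import optparse

argv = ['--foo', '-bar', 'file.txt']

# getopt: 'foo=' declares a long option with a mandatory argument, so the
# next word is consumed as its value whatever it looks like.
getopt_opts, getopt_rest = getopt.getopt(argv, '', ['foo='])

# optparse: the default action is 'store' with one string argument, and the
# next word is likewise consumed without inspection.
optparse_parser = optparse.OptionParser(prog='prog')
optparse_parser.add_option('--foo')
optparse_values, optparse_rest = optparse_parser.parse_args(argv)
```

Both parsers end up with '-bar' as the value of --foo and 'file.txt' as the only remaining argument.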
msg263216 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-12 01:24
My main concern with the patch is that it only half fixes the problem. It sounds like it will allow parsing “--opt -x” (if “-x” is not registered as an option), but will still refuse “--opt -h”, assuming “-h” is registered by default. What is the barrier to parsing an argument to the option syntax independently of what option names are registered?

Also the name “args_default_to_positional=True” is both unwieldy and vague to me. The purpose seems to be to disable option-lookalike-strings from being reserved. Maybe call it something like “reserve_all_options=False” or “reserve_unregistered_options=False”?

I left some thoughts in the code review for the documentation too.
History
Date User Action Args
2016-06-15 10:04:03 spaceone set nosy: + spaceone
2016-04-12 01:25:00 martin.panter set nosy: + martin.panter
messages: + msg263216
stage: needs patch -> patch review
2016-04-11 22:03:47 cben set nosy: + cben
messages: + msg263211
2016-01-25 17:14:25 eric.smith link issue26196 superseder
2015-09-29 14:41:48 memeplex set messages: + msg251862
2015-09-29 03:28:52 memeplex set nosy: + memeplex
messages: + msg251815
2015-03-27 21:10:13 terry.reedy link issue22672 superseder
2015-03-27 20:51:18 paul.j3 set messages: + msg239435
2014-04-14 20:21:01 eric.smith set versions: + Python 3.5, - Python 2.7, Python 3.2, Python 3.3
2013-03-22 17:16:18 paul.j3 set files: + final.patch
keywords: + patch
messages: + msg184987
2013-03-18 05:45:12 paul.j3 set messages: + msg184425
2013-03-15 04:43:25 paul.j3 set messages: + msg184211
2013-03-14 19:38:46 paul.j3 set messages: + msg184180
2013-03-14 17:59:52 eric.smith set messages: + msg184178
2013-03-14 17:46:25 abacabadabacaba set messages: + msg184177
2013-03-14 17:09:24 paul.j3 set nosy: + paul.j3
messages: + msg184174
2013-03-02 17:22:15 abacabadabacaba set nosy: + abacabadabacaba
2012-12-31 23:41:26 danielsh set nosy: + danielsh
2012-12-19 10:17:22 gfxmonk set nosy: + gfxmonk
2012-09-07 07:24:48 bethard set messages: + msg169978
2012-09-02 17:58:45 Christophe.Guillon set nosy: + Christophe.Guillon
messages: + msg169712
2012-08-22 15:40:37 amcnabb set nosy: + amcnabb
2011-12-28 22:20:10 andersk set messages: + msg150320
2011-12-28 18:15:42 skilletaudio set nosy: + skilletaudio
messages: + msg150310
2011-03-26 18:11:01 andersk set messages: + msg132260
2011-03-26 09:53:55 bethard set messages: + msg132220
versions: - Python 3.1
2011-02-17 15:52:04 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128728
2011-02-10 07:11:32 bethard set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128266
2011-02-08 14:01:21 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128179
2011-02-07 16:34:45 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128134
2011-02-07 07:58:28 bethard set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128104
2011-02-07 02:08:36 andersk set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm, davidben
messages: + msg128094
2011-02-07 01:08:07 davidben set nosy: + davidben
2011-02-06 22:34:53 eric.araujo set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128091
2011-02-06 22:01:26 bethard set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128090
2011-02-06 20:02:17 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128078
2011-02-06 19:53:44 andersk set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128076
2011-02-06 19:33:14 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128072
2011-02-06 19:12:18 andersk set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128071
2011-02-06 15:56:11 eric.smith set nosy: bethard, eric.smith, eric.araujo, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128062
2011-02-06 12:18:05 eric.araujo set versions: + Python 3.1, Python 2.7, Python 3.2
nosy: + eric.araujo
messages: + msg128055
stage: test needed -> needs patch
2011-02-06 09:47:37 bethard set nosy: bethard, eric.smith, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128047
2011-02-05 20:54:37 eric.smith set nosy: bethard, eric.smith, r.david.murray, andersk, gdb, nelhage, drm
messages: + msg128025
versions: + Python 3.3, - Python 3.2
2011-02-05 19:13:29 drm set nosy: + drm
messages: + msg128014
2010-07-27 09:37:39 bethard set messages: + msg111691
2010-07-26 23:17:59 andersk set messages: + msg111673
2010-07-26 22:43:47 andersk set messages: + msg111670
2010-07-26 21:53:48 bethard set messages: + msg111669
2010-07-23 17:39:20 andersk set messages: + msg111367
2010-07-23 11:26:16 bethard unlink issue9338 superseder
2010-07-23 11:21:12 bethard set messages: + msg111279
2010-07-23 10:47:34 eric.araujo link issue9338 superseder
2010-07-23 00:51:56 r.david.murray set messages: + msg111230
2010-07-23 00:11:39 gdb set nosy: + gdb
2010-07-22 23:43:53 andersk set messages: + msg111228
2010-07-22 23:40:25 nelhage set nosy: + nelhage
messages: + msg111227
2010-07-22 23:06:20 r.david.murray set versions: - Python 2.7, Python 3.3
nosy: + r.david.murray, bethard
messages: + msg111224
type: enhancement
stage: test needed
2010-07-22 22:44:48 eric.smith set nosy: + eric.smith
2010-07-22 22:15:36 andersk create