classification
Title: argparse doesn't allow optionals within positionals
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: amcnabb, bethard, docs@python, guilherme-pg, martin.panter, paul.j3, r.david.murray, v+python
Priority: normal Keywords: patch

Created on 2012-03-04 06:36 by v+python, last changed 2014-05-03 01:23 by paul.j3.

Files
File name Uploaded Description
t12.py v+python, 2012-03-04 06:36 test case demonstrating bug
14191.diff guilherme-pg, 2012-03-06 15:11 review
t13.py v+python, 2012-03-06 22:04 Demonstration that optparse accepts scattered positional parameters.
t14.py v+python, 2012-03-07 17:44
t16.py v+python, 2012-03-09 01:35 Demonstrate solution based on Steven's suggestion
t18.py v+python, 2012-03-10 00:04 multiclass wrapper for argparse to allow intermixed parameters
t18a.py v+python, 2012-07-06 21:00
test_intermixed.py paul.j3, 2013-04-22 22:29
test_intermixed.py paul.j3, 2013-05-06 22:45
intermixed.patch paul.j3, 2013-05-06 22:51 review
test_intermixed.py paul.j3, 2013-05-10 19:38
intermixed.patch paul.j3, 2013-05-30 05:40 review
Messages (42)
msg154880 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-04 06:36
To me, "all positional parameters" means all of them, whether they are in the front, back or middle, as long as they are not directly preceded by an option that can accept an unlimited number of parameters.

from argparse import ArgumentParser, SUPPRESS, REMAINDER
import sys
print( sys.version )
parser = ArgumentParser()
parser.add_argument('--foo', dest='foo')
parser.add_argument('--bar', dest='bar')
parser.add_argument('baz', nargs='*')
print( parser.parse_args('a b --foo x --bar 1 c d'.split()))
# expected:  Namespace(bar='1', baz=['a', 'b', 'c', 'd'], foo='x')
# actual: error: unrecognized arguments: c d

Above also supplied as a test file, t12.py
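
To see where the trailing strings end up, the same parser can be run through parse_known_args(), which returns the unparsed strings instead of exiting. This is a minimal sketch reusing the test case above; the variable names ns and extras are illustrative only.

```python
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--foo', dest='foo')
parser.add_argument('--bar', dest='bar')
parser.add_argument('baz', nargs='*')

# parse_known_args() returns (namespace, leftover strings) rather than
# reporting "unrecognized arguments" and exiting.
ns, extras = parser.parse_known_args('a b --foo x --bar 1 c d'.split())
print(ns.baz)   # positionals consumed before the first option
print(extras)   # positionals that appeared after the options
```

The leftover list is what a caller would have to process by hand to get all four positionals into baz.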
msg155008 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-06 09:34
This behavior is intentional - positional arguments must be sequential, not broken up with optional (flag) arguments between. So this is a documentation bug.

Allowing positional arguments to be broken up with optional (flag) arguments between them would be a new feature. It would also break many current parsers, so it couldn't be turned on by default. A new constructor parameter or method or something would have to be added to ArgumentParser. Patches welcome.
msg155019 - (view) Author: Guilherme Gonçalves (guilherme-pg) Date: 2012-03-06 15:11
I uploaded an incomplete patch that might address the issue so it can be discussed.

This patch introduces 'greedy_star', a new constructor parameter to ArgumentParser that makes "*" positional arguments behave as expected in the test case.

The patch doesn't yet update the documentation and doesn't include test cases, but I'll be glad to provide those changes in a next version.

It is admittedly hackish, but I haven't found a better solution so far. Looking forward to your comments.
msg155039 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-06 22:04
Interesting that the behavior is intentional, yet it accepts positional parameters either before, or after, or between optional (flag) parameters.

This seems to me to be a case where proper documentation of the intention would have led to the realization that it is easier to fix the code than the documentation.

The only definition of positional parameters I could find in the present documentation is:

When parse_args() is called, optional arguments will be identified by the - prefix, and the remaining arguments will be assumed to be positional:

This is simple and succinct, but leads to my interpretation that they can be anywhere, intermixed with optional arguments.

Further, optparse, which argparse attempts to replace, permitted positional arguments to be intermixed with optional arguments, see new file t13.py which demonstrates that.

To document that positional parameters must be grouped together, yet can appear anywhere, the documentation would have to get much more verbose...  something like

All positional parameters must be grouped together in a single sequence.  However, that group of parameters may have optional parameters either before it or after it, or there may be optional parameter both before it and after it.
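
As an illustration of that wording, the sketch below feeds one contiguous group of positionals to the t12.py parser in three placements: before, between, and after the optionals. It is only a demonstration of the current behavior, not a proposed test.

```python
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--foo')
parser.add_argument('--bar')
parser.add_argument('baz', nargs='*')

# One contiguous group of positionals, placed before, between, and after
# the optionals.
command_lines = [
    'a b --foo x --bar 1',
    '--foo x a b --bar 1',
    '--foo x --bar 1 a b',
]
results = [parser.parse_args(line.split()) for line in command_lines]
for line, ns in zip(command_lines, results):
    print(line, '->', ns.baz)
```

All three placements are accepted; it is only splitting the group into two parts that fails.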
msg155062 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-07 09:06
> optparse, which argparse attempts to replace, permitted positional 
> arguments to be intermixed with optional arguments

Sure, but optparse didn't actually parse positional arguments - it just threw them into a bag, and then you had to group them and convert them however you wanted afterwards. Argparse, instead, was designed to let you specify the groups of positional arguments. Your situation is a little different because you just want to throw all the positional arguments into a bag again. Not that there's anything wrong with that - it's just not the primary use case argparse had in mind.

> The only definition of positional parameters I could find...

Yeah, it looks like there's no good documentation on positional vs. optional parameters. Somewhere obvious, perhaps right at the beginning of the add_argument() documentation, there should probably be something like:

Argparse groups the command line argument strings into two types of groups: optional arguments, which are a sequence of command line strings that begin with a flag like "-v" or "--verbose", and positional arguments, which are a sequence of command line strings that do not begin with a flag. The add_argument() method allows you to specify how many command line strings each of your optional or positional arguments should consume, how those strings should be converted into Python objects, etc.
msg155103 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-07 17:44
Improved documentation would certainly help the situation.

And yes, I understand that optparse simply returned the set of positional parameters without giving them names, types, or groups.  So does getopt, and pretty much all previous art in the arena of command line parsing that I am familiar with.

To successfully replace optparse and other prior art, though, there should be an equivalent, although perhaps improved, functionality in argparse.

This lack of documentation for the idea that the ordered set of all positional parameters is not treated as a sequence certainly slipped under the covers of optparse functionality when I first read about it when optparse was being added to the stdlib.  I had no clue that the specification of positional parameters would do anything other than process the positional parameters sequentially, without being disrupted by intervening optional parameters.  The capabilities for naming, and typing those parameters are nice enhancements, but were not seen as redefining what positional parameters are, from its historical definition.  Is there wording in the PEP that describes such?

Naming and typing and even grouping positional parameters are all nice features... but there should be no undocumented boundaries between positional parameters (or groups of them), and presently there are no documented boundaries, and with prior art there were no boundaries.

Having no boundaries among positional parameters is a capability and expectation that has a long history, and tools ported from prior art to argparse need the capability to preserve command line compatibility.  Hence, I conclude that, unless this was spelled out in the PEP and I missed it, that having such boundaries is a bug, even if your intentions were otherwise, and that the test case I provided should work.

My test was only meant to demonstrate the issue, not to be a particular use case, but there are use cases that would be affected in the same manner as the demonstration.

Regarding your suggested documentation, it is more complete than my suggestion, but "sequence" should probably be replaced by "sequence of adjacent" if that is what is meant, because with positional parameters, the historical perspective is that the sequence of positional parameters may be interrupted by optional parameters, but that makes it no less a sequence.

I believe that the present syntax for parsing positional parameters should be fixed to handle all positional parameters, because of the history of prior art, and that if there is a need, benefit, or demand for treating positional parameters in groups, then that should be documented and created as additional features.

I further cannot figure out how to even parse the additional positional parameters as a separate group, using the current capabilities.  My attempt to do so in t14.py failed.
msg155157 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-08 10:16
> Hence, I conclude that, unless this was spelled out in the PEP and I 
> missed it, that having such boundaries is a bug

Practically speaking, we just can't change this because it will break existing argparse scripts. Argparse has had this behavior since 2006 when it was first released, and I guarantee you that many scripts expect and rely on this behavior.

As I said earlier, the only reasonable solution is to document the current behavior more explicitly, and then add a new constructor parameter or method or something to enable the behavior you want.

I looked a bit at guilherme's patch, and I think it's not really the right direction. We definitely shouldn't be modifying the action classes like _StoreAction. All changes should be within _parse_known_args. The parsing algorithm is so different for the proposed behavior, that I wonder if it wouldn't be easier to just create a new ArgumentParser subclass, ArgumentParserAllowingOptionalsInsidePositionals (or whatever), that just overrides _parse_known_args and rewrites it entirely.
msg155172 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-08 18:48
Ah yes, argparse had a life outside the stdlib, so now I understand your compatibility concerns.

Mind you, I think the overall technology of argparse is superior to optparse, which is why, together with the optparse deprecation, I am trying to port to use it... so consider me a fan, not an enemy.  But...

However, it seems that while the esoteric extensions required in optparse were a known incompatibility at the time the PEP was written, the incompatibility with intermixed positional and optional parameters slipped under the radar... but is actually a more serious compatibility problem for general use.

I see three possible ways forward, maybe there are others.

1) un-deprecate optparse, explaining and documenting this functional difference between optparse and argparse.  The deprecation of optparse is what makes this missing capability a bug, rather than a feature enhancement.

2) add features to argparse to make it capable of parsing all the same command lines as unextended optparse. (I'm of the opinion that folks that extended optparse can learn to extend argparse in similar or more capable manners; not having such extensions, I'm not qualified to state whether there are optparse extensions that cannot be ported to use standard or extended argparse capabilities.)  The documentation for such features should clearly state that former users of optparse may be interested in using this feature, and should state why; further, the deprecation notice in optparse should be updated to point out that porting to argparse may need to use this particular argparse capability to achieve command line compatibility with optparse, and that the capability is not available until (specified release).

3) If there is an already existing way (my t14.py is a half-hearted attempt to find it) to parse scattered positional parameters, it could be explicitly documented rather than writing new capabilities per #2.  However, since you as author jumped to the new capabilities option straightway, I suspect this is not really a possibility.


The rest of this concerns option #2, which seems the best way forward to me, with my current knowledge.

You mention ArgumentParserAllowingOptionalsInsidePositionals and that is extremely lengthy, might I suggest something like  ArgumentParserIntermixed ?

What would be the scope of the effort, and what release(s) might be a possible target? (since it is a bug, it can be backported, but since the cure will be implemented as a new capability, that might be problematical for point releases, somehow? I'm not the expert in that area.)
msg155185 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-08 23:29
Thinking about it a bit more, it strikes me that maybe you could get the behavior you want by declaring two parsers, one with just optionals, and one with just positionals. Then:

optional_args, remaining_args = optionals.parse_known_args()
args = positionals.parse_args(remaining_args)
vars(args).update(vars(optional_args))

Basically, you first parse out all the optional arguments, then you parse out the positional arguments from what's left after the optional arguments are stripped out. This approach seems to work for your t14.py.
msg155188 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-03-08 23:48
Actually, that could be even simpler:

args, remaining_args = optionals.parse_known_args()
args = positionals.parse_args(remaining_args, args)
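
Spelled out in full, the two-parser idea might look like the sketch below. The parser names follow the snippet above; the argument definitions are borrowed from t12.py and the argv list is just the test command line, so treat this as an illustration rather than a finished recipe.

```python
from argparse import ArgumentParser

# Parser that only knows the optional (flag) arguments.
optionals = ArgumentParser()
optionals.add_argument('--foo')
optionals.add_argument('--bar')

# Parser that only knows the positional arguments.
positionals = ArgumentParser()
positionals.add_argument('baz', nargs='*')

argv = 'a b --foo x --bar 1 c d'.split()

# Step 1: strip out the optionals; everything else comes back in order.
args, remaining_args = optionals.parse_known_args(argv)
# Step 2: parse what is left as positionals into the same namespace.
args = positionals.parse_args(remaining_args, args)
print(args)
```

Passing the first namespace as the second argument of parse_args() is what merges the two results.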
msg155202 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-09 01:35
*Very* interesting, Steven.

Looking again at section 15.4.6, and recognizing that positional arguments were never parsed by optparse, then we discover that with two minor tweaks, removing "and add additional ArgumentParser.add_argument() calls for the positional arguments." from step two, and calling parse_known_args instead of parse_args, we actually achieve greater compatibility with optparse, for those that need it.

However, the above would allow undefined arguments to slip through into the positional arguments.

So to prevent that, the second parser you suggest, defining only the positional arguments, will tighten up the error checks.

So this issue could be resolved simply by updating section 15.4.6 of the documentation to explain all this.  On the other hand, the explanation gets long and perhaps confusing.


Looking at the 15.4.4, I see:

ArgumentParser.parse_args(args=None, namespace=None)

    Convert argument strings to objects and assign them as attributes of the namespace. Return the populated namespace.

    Previous calls to add_argument() determine exactly what objects are created and how they are assigned. See the documentation for add_argument() for details.

    By default, the argument strings are taken from sys.argv, and a new empty Namespace object is created for the attributes.


However, nowhere is the args= parameter explained.  One example is given at the end of 15.4.4.6 showing the use of args= which apparently accepts a list of parameters, similar to the positional list of parameters that are used in all the other examples.  It might be nice to clarify that.


This leads into a suggestion: a new keyword parameter for  parse_args: intermixed=False.

When False, the default, new_parse_args would act exactly as it does today, possibly by calling old_parse_args.  When True, parse_args would implement your suggestion probably in the following way: filter out the positional parameters, call parse_known_args, then filter out the optional parameters, call old_parse_args, and return the combination.

t16.py implements this, external to the class, and using two parsers as you suggested.

One thing I notice in playing with my optparse function, is that error messages contain an improper usage message.  This would seem to be most easily fixed if the logic were built in to argparse, rather than attempting to do it externally.

Adding this parameter would also make it much easier to write section 15.4.7 accurately, and would reduce the porting burden on each implementer, as well.  The technique has merit in achieving compatibility with optparse, to allow it to remain deprecated, and looks straightforward to implement.
msg155212 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-09 08:33
To answer Glenn's procedural question: no, this is not a bug whose fix can be backported.  API changes are not allowed in maintenance releases.  Doc improvements can be backported, though, so I'm leaving versions alone (alternatively someone could split this into two bugs, one for the doc fixes for older releases, and one for adding the new capability).
msg155272 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-10 00:04
Sad.  That means all the documentation of workarounds needs to be written, even figured out in the first place.  Steven's code, while being a nice implementation when proper arguments are provided, produces inappropriate errors, because only the positional, or only the optional, parameters are printed when errors occur.

So it would probably take a third parser, with all the parameters defined, to exist, to allow easiest generation of the usage message, but I'm not quite sure how to catch the error printing, and redirect it to the third parser.

So, I tried the classes in t17.py; they are not complete; CompatibleArgumentParser should pass through all the other APIs, and I'm not sure if all the extension semantics can be appropriately passed through when there are three classes and two objects involved.  But this is sort of a proof-of-concept wrapper for achieving intermixed optional and positional arguments, and still get proper error messages.
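
One possible way to redirect errors to a third parser is sketched below. It is not the code from the attached files: RaisingParser, the prog name 'demo', and the argument definitions are all hypothetical, and the only argparse behavior it assumes is that parse failures go through the overridable ArgumentParser.error() method. Handling of -h/--help is left out.

```python
from argparse import ArgumentParser

class RaisingParser(ArgumentParser):
    """Hypothetical helper: raise instead of printing usage and exiting."""

    def error(self, message):
        raise ValueError(message)

# Third parser: knows every argument, used only to report errors.
full = ArgumentParser(prog='demo')
optionals = RaisingParser(prog='demo', add_help=False)
positionals = RaisingParser(prog='demo', add_help=False)

for p in (full, optionals):
    p.add_argument('--foo')
for p in (full, positionals):
    p.add_argument('cmd')
    p.add_argument('rest', nargs='*')

def parse_intermixed(argv):
    """Two-step parse; errors are re-reported through the full parser."""
    try:
        args, remaining = optionals.parse_known_args(argv)
        return positionals.parse_args(remaining, args)
    except ValueError as exc:
        # full.error() prints the complete usage line, then exits.
        full.error(str(exc))

print(parse_intermixed('run 1 --foo x 2'.split()))
```

The drawback is the one noted above: every argument has to be declared on two parsers, which a real wrapper would need to automate.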
msg155277 - (view) Author: Glenn Linderman (v+python) * Date: 2012-03-10 01:16
Of course, if a "real" solution can only be shipped in 3.3, it may want to use a different API than parse_args to avoid the parameter, parse_intermixed_args, perhaps. But my t18.py uses the name parse_args, but just always does the intermixed parsing, so that is something to be aware of if building on that code at all, for the real solution or for sample code for older versions.
msg164756 - (view) Author: Glenn Linderman (v+python) * Date: 2012-07-06 21:00
See also issue 15258 which points out issues with the converse case.

Further testing and development also discovered that in certain error cases, the help message produced by t18-equivalent code was incorrect.

t18a.py is an update/rewrite (same concepts, though) that produces a correct help message when errors occur.
msg166175 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-07-22 22:16
I created Issue 15427 for the parse_args documentation bug. So let's make this issue just about parsing intermixed arguments.

Yes, if someone would like to provide a patch for this, please create a method "parse_intermixed_args" rather than adding a boolean flag parameter. It should be basically equivalent to this code:

args, remaining_args = optionals.parse_known_args()
args = positionals.parse_args(remaining_args, args)

Except that it should give proper error messages. There should be some tests to make sure both that it parses things as expected and that it gives error messages as expected.
msg166193 - (view) Author: Glenn Linderman (v+python) * Date: 2012-07-23 01:49
So my t18a.py wraps Argparse, because the externals are documented and I could understand that. Given enough time, I might be able to understand the internals too... it is just Python...

Seems like the internals separate positionals and optionals into two subparsers, the logic needed is to save the positionals, temporarily replace them with an empty group, and call parse_known_args, then restore the positionals, and call parse_known_args again.

What I haven't figured out, is how that would affect the help message building, and how to make the empty group of positionals without affecting the help message.

It's also not clear that it is possible, with current internals, to substitute an empty group... the only features for creating a group seem to always add it to a list of _action_groups.  So that muddies the water, just like the help message... everything seems intertwined internally.

A little guidance here, would let me code... I could probably whack and slash, and keep extra lists of arguments and sort of make it work like my wrapper does, but I doubt that would be an acceptable patch. So I would be happy, with guidance, to try to create the code, and I can do a "diff", but beyond that, I'm sort of at a loss.

So I'd be happy to work with someone to create the patch, but I've also never created a patch, nor test cases, for Python (or any other open source project). Theoretically, it is straightforward, and documented; as a practical matter, it isn't likely that I'll find time to figure out all that methodology and actually create a patch, in the near future (although it is on my list of things to learn, in the fullness of time, and after the first one, I'm sure subsequent ones would be easier).

Or if someone else wants to code it, I'd be happy to look it over, test it with my environment and applications that I'm currently using with my wrapper.

Given this guidance, I've tweaked my wrapper to have parse_intermixed_args rather than changing the behavior of parse_args as I have been, and will tweak the apps correspondingly, so I'll be in a position to test any code created for this issue.
msg185517 - (view) Author: paul j3 (paul.j3) * Date: 2013-03-29 20:56
Glenn
I looked at your t18a.py test case

    parser = ArgumentParser()
    parser.add_argument('--foo', dest='foo')
    parser.add_argument('--bar', dest='bar')
    parser.add_argument('foz')
    parser.add_argument('baz', nargs='*')

and parse variations on 'a b c d --foo x --bar 1'

I think your main problem is with the 'baz', nargs='*'.  If nargs was changed to '+', 'a --foo x b c d --bar 1' would work, returning {foz='a', baz=['b','c','d']}.

argparse alternates between consuming positional and optionals until it runs out of arguments or argument strings.  But with '*', both 'foz' and 'baz' are consumed with the first set of positional strings {foz='a', baz=[]}.  When it gets to 'b c d' there are no more positional arguments to consume, so they get put into 'extras'.

With nargs='+', 'a b --foo x c d --bar 1' would assign {foz='a', baz=['b']}, and extras=['c','d'].

So while optionals can be interspersed with positionals, they can't be placed within the set of strings intended for one positional. That seems to me to be very reasonable (why break up 'b c d'?).  And as your file demonstrates, you can fall back on parse_known_args to handle the extras.
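
The comparison can be tried with a small sketch like the following; build() is a hypothetical helper that only varies nargs for 'baz'. The '+' result is the one described above. The '*' result is printed rather than assumed, since it may differ between Python versions.

```python
from argparse import ArgumentParser

def build(nargs):
    """Hypothetical helper: the t18a.py parser with a chosen nargs for baz."""
    parser = ArgumentParser()
    parser.add_argument('--foo')
    parser.add_argument('--bar')
    parser.add_argument('foz')
    parser.add_argument('baz', nargs=nargs)
    return parser

argv = 'a --foo x b c d --bar 1'.split()

# With '+', baz cannot match zero strings, so it is still unfilled when
# 'b c d' is reached.
plus_ns, plus_extras = build('+').parse_known_args(argv)
# With '*', baz may be filled with an empty list next to 'a'.
star_ns, star_extras = build('*').parse_known_args(argv)

print(plus_ns, plus_extras)
print(star_ns, star_extras)
```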
msg187051 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-16 07:09
This patch permits the mixing of optionals with positionals, with the caveat that a particular positional cannot be split up.

If:

    parser = ArgumentParser()
    parser.add_argument('-f','--foo')
    parser.add_argument('cmd')
    parser.add_argument('rest', nargs='*')

    '-f1 cmd 1 2 3', 
    'cmd -f1 1 2 3', 
    'cmd 1 2 3 -f1' 

all give {cmd='cmd', rest=['1','2','3'], foo='1'}.  

But 'cmd 1 -f1 2 3', does not recognize ['2','3'].

Previously 'cmd -f1 1 2 3' would return rest=[], and not recognize ['1','2','3'].  With this change the nargs='*' behaves more like nargs='+', surviving to parse the 2nd group of positional strings.

The trick is to modify arg_counts in consume_positionals(), removing matches that don't do anything (don't consume argument strings). 

    if 'O' in arg_strings_pattern[start_index:]:
        # if there is an optional after this, remove
        # 'empty' positionals from the current match
        while len(arg_counts)>1 and arg_counts[-1]==0:
            arg_counts = arg_counts[:-1]

This change passes all of the existing test_argparse.py tests.  It also passes the optparse tests that I added in http://bugs.python.org/issue9334#msg184987
I added 4 cases to illustrate this change.
msg187269 - (view) Author: Glenn Linderman (v+python) * Date: 2013-04-18 17:14
Paul, your comments are interesting, but your proposed patch doesn't actually solve the problem.

So here I am typing away at my command prompt, and I type in a couple optional parameters I know I'll need and start on the sequence of positional ones, and half way through I remember "oh, I need another optional parameter" so I type it in next before I forget, and then go on with the positional ones.

No prior Unix-style argument parsing mechanism that I have ever seen or heard of would be confused by that, but argparse is.

This bug is about providing a facility in argparse that supports intermixing optional parameters into strings of positional parameters, just like all prior Unix-style argument parsing mechanisms, so that an application can be ported to use argparse without breaking command lines that their users have stored in command files. Otherwise argparse is not an upgrade path for an application, yet optparse has been deprecated.
msg187270 - (view) Author: Glenn Linderman (v+python) * Date: 2013-04-18 17:33
I should clarify, before someone jumps in: some particular applications do implement restrictions on order of optional and positional arguments; I'm aware of that. getopt easily supported application defined order restrictions, because it processed arguments sequentially, and the processing loop was user code. optparse, as has been pointed out, parses the optionals, and leaves a single list of positionals, combined from between all the optionals, for the user code to process in any manner, but would actually make it harder for user code to implement order restrictions. argparse goes the other way, taking over all the user parsing (which is a good thing), but not providing sufficient features to implement flexible mixing of optional and positional arguments.
msg187329 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-19 01:36
Glenn
Take a look at http://bugs.python.org/issue15427

I took a stab at changing the documentation, including recommending parse_known_args when porting optparse uses.
msg187341 - (view) Author: Glenn Linderman (v+python) * Date: 2013-04-19 03:51
Docs look good as mentioned there, for the current behavior, although it would be good to improve the behavior.  

Note that I have supplied a wrapper (t18a.py) (if it hasn't bit-rotted for 3.4, I'm still using 3.3) that provides the needed functionality. The problem is, that I have no clue how to modify the internals of argparse to allow it to simply be a method of the current argparse class.  One could achieve the goal by renaming the current argparse class to _argparse, and renaming my wrapper class to be the "real" argparse, and that would work, but would seem to be inefficient.

It would be nice if someone could move the needed functionality, a new API called parse_intermixed_args, already approved by msg166175, that does the same thing as my wrapper does, but without the wrapper class. This would be a cure to the problem, and it could be tested against my wrapper class by comparison to ensure the needed functionality is provided. I'd be glad to help with testing and understanding the requirements, but don't have time to figure out the internals of argparse at present.
msg187596 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-22 22:29
The attached file has a 'parse_intermixed_args()' that has the same API as 'parse_known_args()'.

It follows the two parse step model

    args, remaining_args = optionals.parse_known_args()
    args, extras = positionals.parse_known_args(remaining_args, args)

except that the 'optionals parser' is self with the positional arguments 'deactivated' by setting their nargs to 0.  Similarly the 'positionals parser' is self with the optional arguments set to 'required=False'.

Here it is in a standalone test module illustrating its functionality and limitations.  I could provide a patch, but this form might be easier to test in your own code.

When used to run test_argparse.py, it had problems in the cases where the distinction between positionals and optionals is blurred.  

For example, PARSER and REMAINDER are supposed to grab everything that follows regardless of what it looks like.  I choose to fall back on a single 'parse_known_args' call.  Raising an error would be the alternative.

Similarly, a mutually exclusive group that includes a positional is difficult to handle.  Again I fall back on the single step.

So the two issues to be discussed are:

- does it provide the desired freedom to mix optionals and positionals?

- in the difficult cases, should it punt, or raise an error?
msg187618 - (view) Author: Glenn Linderman (v+python) * Date: 2013-04-23 08:14
Very nice, Paul.

I tested that with some of my applications, and some of my test cases. All of them initially failed, because you have parse_intermixed_args returning parameters like parse_known_args instead of like parse_args. Now I can understand that might be a little confusing in msg166175, but note that the implementation is "like" a call to parse_known_args followed by a call to parse_args... strongly implying that the return should be like parse_args.

After tweaking your implementation in that regard, then I was able to get all the same applications and test cases to pass, although I haven't tried all my applications and all my test cases, as yet.

Your techniques for disabling particular parameters are pretty clever.

I think the difficult cases should raise an error. 

Firstly, parse_intermixed_args is intended to be for functional compatibility with optparse functionality, which doesn't support the difficult cases, therefore use of the difficult cases would require additional restrictions on the allowed order of options on the command line, beyond what optparse supports... this would be an application interface change, and as part of that interface change, should such happen, the flexibility of intermixing optionals and positionals can be restricted.

Secondly, if an understanding of how to define the use of parse_intermixed_args with one or more of the difficult cases is reached, replacing an error case with a functional case is possible, but replacing one silent functionality with a different one is a backwards compatibility problem. Throwing an error avoids limiting a future definition of these cases.

The freedom of mixing optionals and positionals that was available in the now deprecated optparse does seem to be restored by this patch.

I look forward to seeing a revised patch, this is a very promising solution to this bug.
msg187669 - (view) Author: paul j3 (paul.j3) * Date: 2013-04-23 21:22
Yes, http://bugs.python.org/msg166175 does use 'parse_args' in the second call.  But I think things would be more flexible if we had a second function:

    def parse_???(self, args=None, namespace=None):
        args, argv = self.parse_intermixed_args(args, namespace)
        if argv:
            msg = _('unrecognized arguments: %s')
            self.error(msg % ' '.join(argv))
        return args

But then what would be a good pair of names?

parse??? and parse_intermixed_args
versus
parse_intermixed_args and parse_known_intermixed_args
or
something else?
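
Whatever the names end up being, the relationship between the two functions can be tried outside argparse. The sketch below uses two separate parsers instead of the patched class, and the function names parse_known_intermixed and parse_intermixed are placeholders, not a proposal.

```python
from argparse import ArgumentParser

def parse_known_intermixed(optionals, positionals, argv):
    """Placeholder name: return (namespace, extras), like parse_known_args."""
    namespace, remaining = optionals.parse_known_args(argv)
    return positionals.parse_known_args(remaining, namespace)

def parse_intermixed(optionals, positionals, argv):
    """Placeholder name: like parse_args, reject any leftover strings."""
    namespace, extras = parse_known_intermixed(optionals, positionals, argv)
    if extras:
        positionals.error('unrecognized arguments: %s' % ' '.join(extras))
    return namespace

optionals = ArgumentParser(add_help=False)
optionals.add_argument('-f', '--foo')
positionals = ArgumentParser(add_help=False)
positionals.add_argument('cmd')
positionals.add_argument('rest', nargs='*')

# The lenient variant hands back the unknown option instead of exiting.
ns, extras = parse_known_intermixed(
    optionals, positionals, 'cmd 1 -f1 2 3 --bogus'.split())
print(ns, extras)
```

The strict variant is a thin layer over the lenient one, mirroring how parse_args relates to parse_known_args.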
msg187680 - (view) Author: Glenn Linderman (v+python) * Date: 2013-04-24 01:11
Yes, a second function would give more flexibility.

Due to the "approval" in msg166175 to use the name parse_intermixed_args for the functionality described there, it would probably be best to use that name for that functionality.

So then we are left naming your current function something else. parse_known_intermixed_args certainly is descriptive, and fits the naming conventions of the other methods in the class. Quite long, unfortunately... but then I doubt it will get used much. I am using parse_intermixed_args regularly (via my wrapper class), and it is quite long enough.
msg188608 - (view) Author: paul j3 (paul.j3) * Date: 2013-05-06 22:45
This is a revision of the test_intermixed.py that I submitted earlier.  Now `parse_intermixed_args` acts like `parse_args`, and calls `parse_known_intermixed_args`.  Again it is in a form that can exercise the idea without modifying `argparse.py`.

If the parser has incompatible features (REMAINDER, PARSER, or certain exclusive groups), it raises an error.  However, to facilitate testing I included a `_fallback` backdoor.  If it is not the default None, it will be called instead of raising the error.

While making documentation changes, I got to wondering whether 'interspersed' would be a better term than 'intermixed'.  optparse has an 'interspersed' option and API.  However the getopt documentation does use 'intermixed'.
msg188609 - (view) Author: paul j3 (paul.j3) * Date: 2013-05-06 22:51
This is the formal patch corresponding to the `test_intermixed.py`.  It includes changes to `argparse.rst`, plus tests in `test_argparse.py`.  These tests are near the end, after those for `parse_known_args`.  They are roughly equivalent to the examples in `test_intermixed.py`.
 -----------------
The new documentation section is:

Some users expect to freely intermix optional and positional argument strings. For example, optparse, by default, allows interspersed argument strings. GNU getopt() permutes the argument strings so non-options are at the end. The parse_intermixed_args() method emulates this behavior by first calling parse_known_args() with just the optional arguments being active. It is then called a second time to parse the list of remaining argument strings using the positional arguments.

parse_intermixed_args() raises an error if the parser uses features that are incompatible with this two step parsing. These include subparsers, argparse.REMAINDER, and mutually exclusive groups that include both optionals and positionals.

In this example, parse_known_args() returns an unparsed list of arguments ['2', '3'], while parse_intermixed_args() returns rest=[1, 2, 3].

    >>> parser = argparse.ArgumentParser()
    >>> parser.add_argument('--foo')
    >>> parser.add_argument('cmd')
    >>> parser.add_argument('rest', nargs='*', type=int)
    >>> parser.parse_known_args('cmd1 1 --foo bar 2 3'.split())
    (Namespace(cmd='cmd1', foo='bar', rest=[1]), ['2', '3']) 
    >>> parser.parse_intermixed_args('cmd1 1 --foo bar 2 3'.split())
    Namespace(cmd='cmd1', foo='bar', rest=[1, 2, 3])

The parse_known_intermixed_args() method returns a two-item tuple containing the populated namespace and the list of remaining argument strings. parse_intermixed_args() raises an error if there are any remaining unparsed argument strings.
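The two-step idea described in this documentation section can be approximated without touching argparse internals by using two separate parsers, one holding only the optionals and one holding only the positionals. This is only a sketch of the concept, not the patch's implementation (which deactivates the positionals on a single parser); the names opt_parser, pos_parser and parse_two_pass are hypothetical:

```python
import argparse


def parse_two_pass(argv):
    # Pass 1: a parser that knows only the optionals. Every string it
    # does not recognise is returned, in order, in the extras list.
    opt_parser = argparse.ArgumentParser(add_help=False)
    opt_parser.add_argument('--foo')
    namespace, remaining = opt_parser.parse_known_args(argv)

    # Pass 2: a parser that knows only the positionals consumes what
    # is left, filling in the same namespace object.
    pos_parser = argparse.ArgumentParser(add_help=False)
    pos_parser.add_argument('cmd')
    pos_parser.add_argument('rest', nargs='*', type=int)
    return pos_parser.parse_args(remaining, namespace=namespace)


result = parse_two_pass('cmd1 1 --foo bar 2 3'.split())
```

With the same command line as the documentation example, the second pass sees 'cmd1 1 2 3' as one contiguous run, so 'rest' can collect all three integers.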
msg188680 - (view) Author: Glenn Linderman (v+python) * Date: 2013-05-07 18:50
Paul, thanks for your continued work.

I had reworked your prior patch into a subclass of ArgumentParser, and tweaked the code to get parse_intermixed_args to adjust the behaviors I had reported.

Now substituting exactly your more flexible new code into my subclass from your latest test_intermixed.py (you should delete your old patches), I can quickly confirm that it works with my applications that used to use my wrapper class, and expect and use intermixed functionality.

I also read through all your code and comments and it looks good to me.

Regarding parse_fallback_args, I do not see documentation for it. If that is intentional, you might want to add comments in the code regarding its use for testing only... and might want to rename it to _parse_fallback_args. I personally don't see a lot of value to the function, or the new parameter; tests for parse_intermixed_args and parse_known_intermixed_args should be (and have been, thanks) added to the tests for argparse, and should suffice for testing. In non-test code, I see no benefit: either the user uses features that are incompatible with parse_intermixed_args, and thus uses the other features of argparse, or the user, for compatibility reasons, needs to use parse_intermixed_args, and thus is prevented from successfully using the incompatible features. If I'm missing some benefit of parse_fallback_args, it should be explained in either the documentation or the comments.

Regarding the terminology: both intermixed and interspersed would be correct English words to describe the use case. So would intermingled :)

Because Steven "blessed" intermixed, and because it is used by getopt documentation (getopt has not been deprecated, optparse has), it seems to be the best term to use. Should optparse someday be removed, along with its documentation, the use of the term interspersed would also disappear, leaving more consistency in terminology.

Alternative:

Because optparse uses "interspersed" in an API, we cannot fix it to use "intermixed". However, we could fix the uses of "intermixed" to be "interspersed" prior to or at the time of accepting your patch to argparse... afterwards, it would be too late.  Personally, I see no problem with the use of both terms in the documentation, and "intermixed" is the shortest, so I have a slight preference for that.
msg188869 - (view) Author: paul j3 (paul.j3) * Date: 2013-05-10 19:38
The 'parse_fallback_args()' function is only in the 'test_intermixed.py' file, not the patch.  It should be in the 'if __name__' section of that file, along with the modified 'exit()' method, since it is part of the testing suite, not meant to be imported.  'test_intermixed.py' is more of an example and discussion tool, not a formal test.

I added the '_fallback' optional argument because the normal exit from 'parse_args' using SystemExit is remarkably uninformative.  It's hard to distinguish between the 'fallback' errors, and the ones generated by 'parse_known_args' (that have more to do with the argument strings).  One is a programming error, the other a user generated error.

It is possible to redefine ArgumentParser.error() so it gives more information, for example by raising an Exception(message).  I have added to test_intermixed.py an alternative 'parse_fallback_args' that uses such a modified error rather than the _fallback option.  But that ends up modifying the normal parsing error generation as well.
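As an illustration of the kind of redefinition meant here, a subclass can override error() so that the message travels in an exception instead of being lost behind SystemExit. This is a minimal sketch, not the code in test_intermixed.py; ParserError and RaisingParser are hypothetical names:

```python
import argparse


class ParserError(Exception):
    """Carries the message that ArgumentParser.error() would print."""


class RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        # The stock method prints usage and exits with status 2;
        # raising keeps the message available to the caller.
        raise ParserError(message)


parser = RaisingParser(prog='demo')
parser.add_argument('--foo', type=int)

try:
    parser.parse_args(['--foo', 'x'])
    caught = None
except ParserError as exc:
    caught = str(exc)
```

As noted above, the override applies to every parsing error raised through error(), not only to the ones of interest.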

I used the 'fallback' idea to test 'parse_intermixed_args' against the whole set of 'test_argparse.py' tests.  It would be nice to have a way of doing that automatically anytime other features are added to 'parse_args'.  But I can't think of a clean way of doing that.

Regarding earlier versions of these files - I do not see a way of deleting them.  

I have attached a modified test_intermixed.py that has these changes.  I also modified how 'parse_known_intermixed_args' restores the original value of self.usage, using an outer 'try:finally:' block.  I need to make a note to myself to put that in the patch.
msg188871 - (view) Author: Glenn Linderman (v+python) * Date: 2013-05-10 19:50
paul j3: Regarding earlier versions of these files - I do not see a way of deleting them. 

Click on edit, then there is an option to unlink. I don't know if they ever actually get deleted, but it clears out the clutter when looking for the latest version.

Will check out the newest code shortly.
msg188887 - (view) Author: paul j3 (paul.j3) * Date: 2013-05-10 22:39
I should note one caveat:

As a consequence of setting nargs to 0 for the first 'parse_known_args' step, all positional entries in the namespace get an empty list value ([]).  This is produced by 'ArgumentParser._get_values'.  With the builtin action classes this does not cause any problems.

However a custom action class might have problems with this [] value.
For example, in TestActionUserDefined in 'test_argparse.py', the PositionalAction class does check the values and throws an error with this [] value.

The positional arguments are removed from the namespace before it is passed on to the 2nd 'parse_known_args', so these [] in the first don't affect the final namespace.

I don't think anything about this should be added to main documentation, since it could confuse most readers.  I might add a note of warning to the code itself.
msg188899 - (view) Author: Glenn Linderman (v+python) * Date: 2013-05-11 07:23
OK, I've been running with the new code most of the day, and it seems functional in my testing.

I only "sort of" follow your discussion about the "custom action class" caveat, probably because I haven't used "custom action classes"... I tried once, but failed to achieve my goal, as it was more ambitious than they presently support. If the [] value is significantly problematic in some manner, could positional nargs be set to a sentinel value that would avoid the assignment of the [] value? I realize that would require code changes in some other function or functions, in addition to the added new functions, so that would make the patch a bit more intrusive.

If _fallback helps some folks with understanding errors clearly, I won't object to it. I guess it would only be programmers that would be confused, because they would be the ones interpreting the errors... and with adequate testing, should fix the programming errors before the users get a chance to be confused.

So maybe your next .patch will be ready to ship!
msg190339 - (view) Author: paul j3 (paul.j3) * Date: 2013-05-30 05:40
This is a refinement of the patch from msg188609.

In parse_known_intermixed_args, the temporary capture of formatted usage has been put in a try/finally structure.
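The try/finally idea can be sketched as follows. This is not the patch code, only an illustration of restoring parser.usage whatever happens during parsing; the helper name parse_known_with_frozen_usage is hypothetical:

```python
import argparse


def parse_known_with_frozen_usage(parser, argv):
    saved_usage = parser.usage
    try:
        if parser.usage is None:
            # Capture the full usage text now, dropping the leading
            # 'usage: ' prefix, so later error messages still show it
            # even if the parser's actions are altered while parsing.
            parser.usage = parser.format_usage()[7:]
        return parser.parse_known_args(argv)
    finally:
        # Runs on normal return and on SystemExit alike.
        parser.usage = saved_usage


parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('--foo')
namespace, extras = parse_known_with_frozen_usage(
    parser, ['--foo', 'x', 'extra'])
```

Because the restore sits in the finally clause, the parser is left unchanged even when a parsing error exits through ArgumentParser.error().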

Positionals are now 'deactivated' with

    action.nargs = SUPPRESS
    action.default = SUPPRESS

To use this, a 'nargs==SUPPRESS' case has been added to the relevant methods.  In '_get_nargs_pattern()' it acts just like 'nargs=0'.  In '_get_values()' it returns 'value=SUPPRESS'.  The net effect is that, in take_action(), 'action' is not invoked, and that positional is not added to the namespace.

Previously I used nargs=0, which put a [] value in the namespace, which then had to be deleted.

I have not added anything about this SUPPRESS option to the documentation (PARSER isn't there either).

When the parser uses incompatible features (e.g. REMAINDER), this now raises a TypeError.  The effect is similar to giving add_argument incompatible definitions.  The 'fallback' that I used earlier can be implemented with a simple 'try/except TypeError'.  Other parsing errors go through the usual ArgumentParser.error() method.

test_argparse.py has been changed to handle this TypeError.
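The external fallback mentioned above might look like the sketch below. It assumes a Python version in which parse_intermixed_args() is actually available on ArgumentParser (3.7 and later) and raises TypeError for a REMAINDER positional as described; the helper name parse_with_fallback and both example parsers are hypothetical:

```python
import argparse


def parse_with_fallback(parser, argv):
    try:
        return parser.parse_intermixed_args(argv)
    except TypeError:
        # The parser definition is incompatible with intermixed
        # parsing (e.g. a REMAINDER positional); use normal parsing.
        return parser.parse_args(argv)


# Compatible parser: optionals may appear between positionals.
plain = argparse.ArgumentParser(prog='demo')
plain.add_argument('--foo')
plain.add_argument('rest', nargs='*')
mixed = parse_with_fallback(plain, ['a', '--foo', 'x', 'b'])

# Incompatible parser: the REMAINDER positional triggers the fallback.
greedy = argparse.ArgumentParser(prog='demo')
greedy.add_argument('--foo')
greedy.add_argument('rest', nargs=argparse.REMAINDER)
fallback = parse_with_fallback(greedy, ['--foo', 'x', 'a', 'b'])
```

Only the TypeError (a programming error in the parser definition) is intercepted; user errors in the argument strings still go through ArgumentParser.error() in either branch.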
msg190340 - (view) Author: Glenn Linderman (v+python) * Date: 2013-05-30 05:49
These sound like good refinements. You've been thinking.  By making the fallback happen externally, it simplifies the implementation of parse_intermixed_args, and forces the application to accept responsibility for calling it with a consistent set of arguments, or calling something else. I like that. I don't really see the fallback as a particularly useful feature, so pushing it outside the stdlib, yet still making it simple to implement for any that do find it to be useful, seems like a good tradeoff.
msg195312 - (view) Author: Glenn Linderman (v+python) * Date: 2013-08-16 06:34
Paul, is this ready to merge, or are you thinking of more refinements?
msg195728 - (view) Author: paul j3 (paul.j3) * Date: 2013-08-20 23:28
It's everything I intend to add.  Now I'm just waiting for a committer to act, either with suggested changes, or a merge.  I'm watching more than a dozen argparse patches.
msg195740 - (view) Author: Glenn Linderman (v+python) * Date: 2013-08-21 01:29
So I read over your code again, and even read the documentation this time, and it all looks good, and I know it works well because I've been using the code. I tried to send a notice through Rietveld, and maybe did, but I don't know where it went, so I'll say this much here, too.
msg196067 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-08-24 07:16
It sounds like this bug might cover Issue 15112, which is only concerned with options between different positional parameters.
msg196110 - (view) Author: paul j3 (paul.j3) * Date: 2013-08-25 02:01
Above in
http://bugs.python.org/issue14191#msg187051
I proposed a patch that is quite close to bethard's patch in http://bugs.python.org/issue15112#msg166173

Both modify the same place, doing the same (pop items off arg_counts).  The logic is a little different.  I'd have to look at them more carefully to see whether one is more robust.  Thanks for linking that issue.

The main issue here is different, allowing for complete intermixing of optionals and positionals.  Over at 15112 the issue is intermixing of optionals and 'whole' positionals.  My 187051 patch belongs over there.
msg217812 - (view) Author: paul j3 (paul.j3) * Date: 2014-05-03 01:23
I encountered a conflict when merging this patch with http://bugs.python.org/issue15112.  In my first testcase, 'cmd' and 'rest' failed the 'required' test in phase one of 'intermixed'.  That's because 15112 postponed parsing them (with nargs=0/suppressed).  

I got around that by temporarily setting the 'required' attribute to False.
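A rough sketch of that workaround, written as a context manager so the original values are always restored. It is not the merged patch code; it relies on the private parser._actions list and on the 'required' attribute of action objects, and the name positionals_not_required is hypothetical:

```python
import argparse
from contextlib import contextmanager


@contextmanager
def positionals_not_required(parser):
    # parser._actions is private; positionals are the actions that
    # have no option strings.
    saved = [(action, action.required)
             for action in parser._actions
             if not action.option_strings]
    try:
        for action, _ in saved:
            action.required = False
        yield parser
    finally:
        for action, required in saved:
            action.required = required


parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('--foo')
parser.add_argument('cmd')

# With 'required' switched off, a first pass that supplies no
# positional strings does not fail the 'required' test.
with positionals_not_required(parser):
    namespace, extras = parser.parse_known_args(['--foo', 'x'])

restored = [a.required for a in parser._actions if a.dest == 'cmd']
```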

The whole issue of when (or even whether) a positional that is satisfied with 0 arguments is consumed is a bit messy.
History
Date  User  Action  Args
2014-05-03 01:23:26  paul.j3  set  messages: + msg217812
2013-08-25 02:13:03  paul.j3  set  files: - mixed.patch
2013-08-25 02:02:22  paul.j3  set  files: + mixed.patch
2013-08-25 02:01:08  paul.j3  set  messages: + msg196110
2013-08-24 07:16:06  martin.panter  set  nosy: + martin.panter
    messages: + msg196067
2013-08-21 01:29:23  v+python  set  messages: + msg195740
2013-08-20 23:28:01  paul.j3  set  messages: + msg195728
2013-08-16 06:34:37  v+python  set  messages: + msg195312
2013-05-30 05:49:34  v+python  set  messages: + msg190340
2013-05-30 05:40:38  paul.j3  set  files: + intermixed.patch
    messages: + msg190339
2013-05-13 06:47:27  paul.j3  set  files: - mixed.patch
2013-05-11 07:23:44  v+python  set  messages: + msg188899
2013-05-10 22:39:01  paul.j3  set  messages: + msg188887
2013-05-10 19:50:19  v+python  set  messages: + msg188871
2013-05-10 19:38:01  paul.j3  set  files: + test_intermixed.py
    messages: + msg188869
2013-05-07 18:50:13  v+python  set  messages: + msg188680
2013-05-06 22:51:32  paul.j3  set  files: + intermixed.patch
    messages: + msg188609
2013-05-06 22:45:45  paul.j3  set  files: + test_intermixed.py
    messages: + msg188608
2013-04-24 01:11:55  v+python  set  messages: + msg187680
2013-04-23 21:22:13  paul.j3  set  messages: + msg187669
2013-04-23 08:14:53  v+python  set  messages: + msg187618
2013-04-22 22:29:53  paul.j3  set  files: + test_intermixed.py
    messages: + msg187596
2013-04-19 03:51:55  v+python  set  messages: + msg187341
2013-04-19 01:36:29  paul.j3  set  messages: + msg187329
2013-04-18 17:33:56  v+python  set  messages: + msg187270
2013-04-18 17:14:41  v+python  set  messages: + msg187269
2013-04-16 07:09:45  paul.j3  set  files: + mixed.patch
    messages: + msg187051
2013-03-29 20:56:14  paul.j3  set  nosy: + paul.j3
    messages: + msg185517
2013-02-13 18:06:02  amcnabb  set  nosy: + amcnabb
2012-07-23 01:49:24  v+python  set  messages: + msg166193
2012-07-22 22:16:18  bethard  set  versions: + Python 3.4, - Python 2.7, Python 3.2, Python 3.3
    title: argparse: nargs='*' doesn't get out-of-order positional parameters -> argparse doesn't allow optionals within positionals
    messages: + msg166175
    assignee: docs@python ->
    components: - Documentation
2012-07-06 21:00:48  v+python  set  files: + t18a.py
    messages: + msg164756
2012-03-10 01:16:31  v+python  set  messages: + msg155277
2012-03-10 00:04:11  v+python  set  files: + t18.py
    messages: + msg155272
2012-03-09 08:33:31  r.david.murray  set  nosy: + docs@python, r.david.murray
    messages: + msg155212
    assignee: docs@python
    components: + Documentation
2012-03-09 01:35:49  v+python  set  files: + t16.py
    messages: + msg155202
2012-03-08 23:48:53  bethard  set  messages: + msg155188
2012-03-08 23:29:11  bethard  set  messages: + msg155185
2012-03-08 18:48:54  v+python  set  messages: + msg155172
2012-03-08 10:16:19  bethard  set  messages: + msg155157
2012-03-07 17:44:02  v+python  set  files: + t14.py
    messages: + msg155103
2012-03-07 09:06:47  bethard  set  messages: + msg155062
2012-03-06 22:04:44  v+python  set  files: + t13.py
    messages: + msg155039
2012-03-06 15:11:21  guilherme-pg  set  files: + 14191.diff
    nosy: + guilherme-pg
    messages: + msg155019
    keywords: + patch
2012-03-06 09:34:25  bethard  set  messages: + msg155008
2012-03-05 06:46:37  eric.araujo  set  nosy: + bethard
    title: argparse: nargs='*' doesn't parse all positional parameters -> argparse: nargs='*' doesn't get out-of-order positional parameters
    versions: + Python 2.7, Python 3.3
2012-03-04 06:36:30  v+python  create