Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

argparse doesn't allow optionals within positionals #58399

Closed
vpython mannequin opened this issue Mar 4, 2012 · 47 comments
Closed

argparse doesn't allow optionals within positionals #58399

vpython mannequin opened this issue Mar 4, 2012 · 47 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@vpython
Copy link
Mannequin

vpython mannequin commented Mar 4, 2012

BPO 14191
Nosy @bitdancer, @vadmium, @monkeyman79
PRs
  • bpo-14191 Add parse_intermixed_args. #3319
  • bpo-42973: argparse: mixing optional and positional #24259
  • bpo-42973: argparse: mixing optional and positional #24367
  • Files
  • t12.py: test case demonstrating bug
  • 14191.diff
  • t13.py: Demonstration that optparse accepts scattered positional parameters.
  • t14.py
  • t16.py: Demonstrate solution based on Steven's suggestion
  • t18.py: multiclass wrapper for argparse to allow intermixed parameters
  • t18a.py
  • test_intermixed.py
  • test_intermixed.py
  • intermixed.patch
  • test_intermixed.py
  • intermixed.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-09-07.00:27:33.235>
    created_at = <Date 2012-03-04.06:36:30.660>
    labels = ['3.7', 'type-feature', 'library']
    title = "argparse doesn't allow optionals within positionals"
    updated_at = <Date 2021-01-29.13:29:51.106>
    user = 'https://bugs.python.org/vpython'

    bugs.python.org fields:

    activity = <Date 2021-01-29.13:29:51.106>
    actor = 'monkeyman79'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-09-07.00:27:33.235>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2012-03-04.06:36:30.660>
    creator = 'v+python'
    dependencies = []
    files = ['24726', '24745', '24749', '24753', '24764', '24771', '26273', '29982', '30159', '30160', '30204', '30422']
    hgrepos = []
    issue_num = 14191
    keywords = ['patch']
    message_count = 47.0
    messages = ['154880', '155008', '155019', '155039', '155062', '155103', '155157', '155172', '155185', '155188', '155202', '155212', '155272', '155277', '164756', '166175', '166193', '185517', '187051', '187269', '187270', '187329', '187341', '187596', '187618', '187669', '187680', '188608', '188609', '188680', '188869', '188871', '188887', '188899', '190339', '190340', '195312', '195728', '195740', '196067', '196110', '217812', '301275', '301292', '301415', '301549', '301550']
    nosy_count = 10.0
    nosy_names = ['amcnabb', 'bethard', 'v+python', 'r.david.murray', 'docs@python', 'martin.panter', 'paul.j3', 'guilherme-pg', 'dg1727', 'monkeyman79']
    pr_nums = ['3319', '24259', '24367']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue14191'
    versions = ['Python 3.7']

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 4, 2012

    To me, "all positional parameters" mean whether they are in the front, back or middle, as long as they are not diriectly preceded by an option that can accept an unlimited number of parameters.

    from argparse import ArgumentParser, SUPPRESS, REMAINDER
    import sys
    print( sys.version )
    parser = ArgumentParser()
    parser.add_argument('--foo', dest='foo')
    parser.add_argument('--bar', dest='bar')
    parser.add_argument('baz', nargs='*')
    print( parser.parse_args('a b --foo x --bar 1 c d'.split()))
    # expected:  Namespace(bar='1', baz=['a', 'b', 'c', 'd'], foo='x')
    # actual: error: unrecognized arguments: c d

    Above also supplied as a test file, t12.py

    @vpython vpython mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Mar 4, 2012
    @merwok merwok changed the title argparse: nargs='*' doesn't parse all positional parameters argparse: nargs='*' doesn't get out-of-order positional parameters Mar 5, 2012
    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Mar 6, 2012

    This behavior is intentional - positional arguments must be sequential, not broken up with optional (flag) arguments between. So this is a documentation bug.

    Allowing positional arguments to be broken up with optional (flag) arguments between them would be a new feature. It would also break many current parsers, so it couldn't be turned on by default. A new constructor parameter or method or something would have to be added to ArgumentParser. Patches welcome.

    @guilherme-pg
    Copy link
    Mannequin

    guilherme-pg mannequin commented Mar 6, 2012

    I uploaded an incomplete patch that might address the issue so it can be discussed.

    This patch introduces 'greedy_star', a new constructor parameter to ArgumentParser that makes "*" positional arguments behave as expected in the test case.

    The patch doesn't yet update the documentation and doesn't include test cases, but I'll be glad to provide those changes in a next version.

    It is admittedly hackish, but I haven't found a better solution so far. Looking forward for your comments.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 6, 2012

    Interesting that the behavior is intentional, yet it accepts positional parameters either before, or after, or between optional (flag) parameters.

    This seems to me to be a case where proper documentation of the intention would have led to the realization that it is easier to fix the code than the documentation.

    The only definition of positional parameters I could find in the present documentation is:

    When parse_args() is called, optional arguments will be identified by the - prefix, and the remaining arguments will be assumed to be positional:

    This is simple and succinct, but leads to my interpretation that they can be anywhere, intermixed with optional arguments.

    Further, optparse, which argparse attempts to replace, permitted positional arguments to be intermixed with optional arguments, see new file t13.py which demonstrates that.

    To document that positional parameters must be grouped together, yet can appear anywhere, the documentation would have to get much more verbose... something like

    All positional parameters must be grouped together in a single sequence. However, that group of parameters may have optional parameters either before it or after it, or there may be optional parameter both before it and after it.

    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Mar 7, 2012

    optparse, which argparse attempts to replace, permitted positional
    arguments to be intermixed with optional arguments

    Sure, but optparse didn't actually parse positional arguments - it just threw them into a bag, and then you had to group them and convert them however you wanted afterwards. Argparse, instead, was designed to let you specify the groups of positional arguments. Your situation is a little different because you just want to throw all the positional arguments into a bag again. Not that there's anything wrong with that - it's just not the primary use case argparse had in mind.

    The only definition of positional parameters I could find...

    Yeah, it looks like there's no good documentation on positional vs. optional parameters. Somewhere obvious, perhaps right at the beginning of the add_argument() documentation, there should probably be something like:

    Argparse groups the command line argument strings into two types of groups: optional arguments, which are a sequence of command line strings that begin with a flag like "-v" or "--verbose", and positional arguments, which are a sequence of command line strings that do not begin with a flag. The add_argument() method allows you to specify how many command line strings each of your optional or positional arguments should consume, how those strings should be converted into Python objects, etc.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 7, 2012

    Improved documentation would certainly help the situation.

    And yes, I understand that optparse simply returned the set of positional parameters without giving them names, types, or groups. So does getopt, and pretty much all previous art in the arena of command line parsing that I am familiar with.

    To successfully replace optparse and other prior art, though, there should be an equivalent, although perhaps improved, functionality in argparse.

    This lack of documentation for the idea that the ordered set of all positional parameters is not treated as a sequence certainly slipped under the covers of optparse functionality when I first read about it when optparse was being added to the stdlib. I had no clue that the specification of positional parameters would do anything other than process the positional parameters sequentially, without being disrupted by intervening optional parameters. The capabilities for naming, and typing those parameters are nice enhancements, but were not seen as redefining what positional parameters are, from its historical definition. Is there wording in the PEP that describes such?

    Naming and typing and even grouping positional parameters are all nice features... but there should be no undocumented boundaries between positional parameters (or groups of them), and presently there are no documented boundaries, and with prior art there were no boundaries.

    Having no boundaries among positional parameters is a capability and expectation that has a long history, and tools ported from prior art to argparse need the capability to preserve command line compatibility. Hence, I conclude that, unless this was spelled out in the PEP and I missed it, that having such boundaries is a bug, even if your intentions were otherwise, and that the test case I provided should work.

    My test was only meant to demonstrate the issue, not to be a particular use case, but there are use cases that would be affected in the same manner as the demonstration.

    Regarding your suggested documentation, it is more complete than my suggestion, but "sequence" should probably be replaced by "sequence of adjacent" if that is what is meant, because with positional parameters, the historical perspective is that the sequence of positional parameters may be interrupted by optional parameters, but that makes it no less a sequence.

    I believe that the present syntax for parsing positional parameters should be fixed to handle all positional parameters, because of the history of prior art, and that if there is a need, benefit, or demand for treating positional parameters in groups, then that should be documented and created as additional features.

    I further cannot figure out how to even parse the additional positional parameters as a separate group, using the current capabilities. My attempt to do so in t14.py failed.

    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Mar 8, 2012

    Hence, I conclude that, unless this was spelled out in the PEP and I
    missed it, that having such boundaries is a bug

    Practically speaking, we just can't change this because it will break existing argparse scripts. Argparse has had this behavior since 2006 when it was first released, and I guarantee you that many scripts expect and rely on this behavior.

    As I said earlier, the only reasonable solution is to document the current behavior more explicitly, and then add a new constructor parameter or method or something to enable the behavior you want.

    I looked a bit a guilherme's patch, and I think it's not really the right direction. We definitely shouldn't be modifying the action classes like _StoreAction. All changes should be within _parse_known_args. The parsing algorithm is so different for the proposed behavior, that I wonder if it wouldn't be easier to just create a new ArgumentParser subclass, ArgumentParserAllowingOptionalsInsidePositionals (or whatever), that just overrides _parse_known_args and rewrites it entirely.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 8, 2012

    Ah yes, argparse had a life outside the stdlib, so now I understand your compatibility concerns.

    Mind you, I think the overall technology of argparse is superior to optparse, which is why, together with the optparse deprecation, I am trying to port to use it... so consider me a fan, not an enemy. But...

    However, it seems that while the esoteric extensions required in optparse were a known incompatibility at the time the PEP was written, the incompatibility with intermixed positional and optional parameters slipped under the radar... but is actually a more serious compatibility problem for general use.

    I see three possible ways forward, maybe there are others.

    1. un-deprecate optparse, explaining and documenting this functional difference between optparse and argparse. The deprecation of optparse is what makes this missing capability a bug, rather than a feature enhancement.

    2. add features to argparse to make it capable of parsing all the same command lines as unextended optparse. (I'm of the opinion that folks that extended optparse can learn to extend argparse in similar or more capable manners; not having such extensions, I'm not qualified to state whether there are optparse extensions that cannot be ported to use standard or extended argparse capabilities.) The documentation for such features should clearly state that former users of argparse may be interested in using this feature, and should state why; further, the deprecation notice in optparse should be updated to point out that porting to argparse may need to use this particular argparse capability to achieve command line compatibility with optparse, and that the capability is not available until (specified release).

    3. If there is an already existing way (my t14.py is a half-hearted attempt to find it) to parse scattered positional parameters, it could be explicitly documented rather than writing new capabilities per Rename README to README.rst and enhance formatting #2. However, since you as author jumped to the new capabilities option straightway, I suspect this is not really a possibility.

    The rest of this is concerned option #2, which seems the best way forward to me, with my current knowledge.

    You mention ArgumentParserAllowingOptionalsInsidePositionals and that is extremely lengthy, might I suggest something like ArgumentParserIntermixed ?

    What would be the scope of the effort, and what release(s) might be a possible target? (since it is a bug, it can be backported, but since the cure will be implemented as a new capability, that might be problematical for point releases, somehow? I'm not the expert in that area.)

    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Mar 8, 2012

    Thinking about it a bit more, it strikes me that maybe you could get the behavior you want by declaring two parsers, one with just optionals, and one with just positionals. Then:

    optional_args, remaining_args = optionals.parse_known_args()
    args = positionals.parse_args(remaining_args)
    vars(args).update(vars(optional_args))

    Basically, you first parse out all the optional arguments, then you parse out the positional arguments from what's left after the optional arguments are stripped out. This approach seems to work for your t14.py.

    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Mar 8, 2012

    Actually, that could be even simpler:

    args, remaining_args = optionals.parse_known_args()
    args = positionals.parse_args(remaining_args, args)

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 9, 2012

    *Very* interesting, Steven.

    Looking again at section 15.4.6, and recognizing that positional arguments were never parsed by optparse, then we discover that with two minor tweaks, removing "and add additional ArgumentParser.add_argument() calls for the positional arguments." from step two, and calling parse_known_args instead of parse_args, we actually achieve greater compatibility with optparse, for those that need it.

    However, the above would allow undefined arguments to slip through into the positional arguments.

    So to prevent that, the second parser you suggest, defining only the positional arguments, will tighten up the error checks.

    So this issue could be resolved simply by updating section 15.4.6 of the documentation to explain all this. On the other hand, the explanation gets long and perhaps confusing.

    Looking at the 15.4.4, I see:

    ArgumentParser.parse_args(args=None, namespace=None)
    Convert argument strings to objects and assign them as attributes of the namespace. Return the populated namespace.
    
    Previous calls to add_argument() determine exactly what objects are created and how they are assigned. See the documentation for add_argument() for details.
    
    By default, the argument strings are taken from sys.argv, and a new empty Namespace object is created for the attributes.
    

    However, nowhere is the args= parameter explained. One example is given at the end of 15.4.4.6 showing the use of args= which apparently accepts a list of parameters, similar to the positional list of parameters that are used in all the other examples. It might be nice to clarify that.

    This leads into a suggestion: a new keyword parameter for parse_args: intermixed=False.

    When False, the default, new_parse_args would act exactly as it does today, possibly by calling old_parse_args. When True, parse_args would implement your suggestion probably in the following way: filter out the positional parameters, call parse_known_args, then filter out the optional parameters, call old_parse_args, and return the combination.

    t16.py implements this, external to the class, and using two parsers as you suggested.

    One thing I notice in playing with my optparse function, is that error messages contain an improper usage message. This would seem to be most easily fixed if the logic were built in to argparse, rather than attempting to do it externally.

    Adding this parameter would also make it is much easier to write section 15.4.7 accurately, and would reduce the porting burden on each implementer, as well. The technique has merit in achieving compatibility with optparse, to allow it to remain deprecated, and looks straightforward to implement.

    @bitdancer
    Copy link
    Member

    To answer Glenn's procedural question: no this is not a bug whose fix can be backported. API changes are not allowed in maintenance releases. Doc improvements can be backported, though, so I'm leaving versios alone (alternatively someone could split this into two bugs, one for the doc fixes for older releases, and one for adding the new capability).

    @bitdancer bitdancer added the docs Documentation in the Doc dir label Mar 9, 2012
    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 10, 2012

    Sad. That means all the documentation of workarounds needs to be written, even figured out in the first place. Steven's code, while being a nice implementation when proper arguments are provided, produces inappropriate errors, because only the positional, or only the optional, parameters are printed when errors occur.

    So it would probably take a third parser, with all the parameters defined, to exist, to allow easiest generation of the usage message, but I'm not quite sure how to catch the error printing, and redirect it to the third parser.

    So, I tried the classes in t17.py; they are not complete; CompatibleArgumentParser should pass through all the other APIs, and I'm not sure if all the extension semantics can be appropriately passed through when there are three classes and two objects involved. But this is sort of a proof-of-concept wrapper for achieving intermixed optional and positional arguments, and still get proper error messages.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Mar 10, 2012

    Of course, if a "real" solution can only be shipped in 3.3, it may want to use a different API than parse_args to avoid the parameter, parse_intermixed_args, perhaps. But my t18.py uses the name parse_args, but just always does the intermixed parsing, so that is something to be aware of if building on that code at all, for the real solution or for sample code for older versions.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Jul 6, 2012

    See also bpo-15258 which points out issues with the converse case.

    Further testing and development also discovered that in certain error cases, the help message produced by t18-equivalent code was incorrect.

    t18a.py is an update/rewrite (same concepts, though) that produces a correct help message when errors occur.

    @bethard
    Copy link
    Mannequin

    bethard mannequin commented Jul 22, 2012

    I created bpo-15427 for the parse_args documentation bug. So let's make this issue just about parsing intermixed arguments.

    Yes, if someone would like to provide a patch for this, please create a method "parse_intermixed_args" rather than adding a boolean flag parameter. It should be basically equivalent to this code:

    args, remaining_args = optionals.parse_known_args()
    args = positionals.parse_args(remaining_args, args)

    Except that it should give proper error messages. There should be some tests to make sure both that it parses things as expected and that it gives error messages as expected.

    @bethard bethard mannequin removed the docs Documentation in the Doc dir label Jul 22, 2012
    @bethard bethard mannequin changed the title argparse: nargs='*' doesn't get out-of-order positional parameters argparse doesn't allow optionals within positionals Jul 22, 2012
    @bethard bethard mannequin unassigned docspython Jul 22, 2012
    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Jul 23, 2012

    So my t18a.py wraps Argparse, because the externals are documented and I could understand that. Given enough time, I might be able to understand the internals too... it is just Python...

    Seems like the internals separate positionals and optionals into two subparsers, the logic needed is to save the positionals, temporarily replace them with an empty group, and call parse_known_args, then restore the positionals, and call parse_known_args again.

    What I haven't figured out, is how that would affect the help message building, and how to make the empty group of positionals without affecting the help message.

    It's also not clear that it is possible, with current internals, to substitute an empty group... the only features for creating a group seem to always add it to a list of _action_groups. So that muddies the water, just like the help message... everything seems intertwined internally.

    A little guidance here, would let me code... I could probably whack and slash, and keep extra lists of arguments and sort of make it work like my wrapper does, but I doubt that would be an acceptable patch. So I would be happy, with guidance, to try to create the code, and I can do a "diff", but beyond that, I'm sort of at a loss.

    So I'd be happy to work with someone to create the patch, but I've also never created a patch, nor test cases, for Python (or any other open source project). Theoretically, it is straightforward, and documented; as a practical matter, it isn't likely that I'll find time to figure out all that methodology and actually create a patch, in the near future (although it is on my list of things to learn, in the fullness of time, and after the first one, I'm sure subsequent ones would be easier).

    Or if someone else wants to code it, I'd be happy to look it over, test it with my environment and applications that I'm currently using with my wrapper.

    Given this guidance, I've tweaked my wrapper to have parse_intermixed_args rather than changing the behavior of parse_args as I have been, and will tweak the apps correspondingly, so I'll be in a position to test any code created for this issue.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Mar 29, 2013

    Glenn
    I looked at your t18a.py test case

        parser = ArgumentParser()
        parser.add_argument('--foo', dest='foo')
        parser.add_argument('--bar', dest='bar')
        parser.add_argument('foz')
        parser.add_argument('baz', nargs='*')

    and parse variations on 'a b c d --foo x --bar 1'

    I think your main problem is with the 'baz', nargs='*'. If nargs was changed to '+', 'a --foo x b c d --bar 1' would work, returning {foz='a', bar=['b','c','d']}.

    argparse alternates between consuming positional and optionals until it runs out of arguments or argument strings. But with '*', both 'foz' and 'baz' are consumed with the first set of positional strings {foz='a', baz=[]}. When it gets to 'b c d' there are no more positional arguments to consume, so they get put into 'extras'.

    With nargs='+', 'a b --foo x c d --bar 1' would assign {foz='a', baz=[b]}, and extras=['c','d'].

    So while optionals can be interspersed with positionals, they can't be placed within the set of strings intended for one positional. That seems to me to very reasonable (why break up 'b c d'?). And as your file demonstrates, you can fall back on parse_known_args to handle the extras.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Apr 16, 2013

    This patch permits the mixing of optionals with positionals, with the caveat that a particular positional cannot be split up.

    If:

        parser = ArgumentParser()
        parser.add_argument('-f','--foo')
        parser.add_argument('cmd')
        parser.add_argument('rest', nargs='*')
    '-f1 cmd 1 2 3', 
    'cmd -f1 1 2 3', 
    'cmd 1 2 3 -f1' 
    

    all give {cmd='cmd', rest=['1','2','3'], foo='1'}.

    But 'cmd 1 -f1 2 3', does not recognize ['2','3'].

    Previously 'cmd -f1 1 2 3' would return rest=[], and not recognize ['1','2','3']. With this change the nargs='*' behaves more like nargs='+', surviving to parse the 2nd group of positional strings.

    The trick is to modify arg_counts in consume_positionals(), removing matches that don't do anything (don't consume argument strings).

        if 'O' in arg_strings_pattern[start_index:]:
            # if there is an optional after this, remove
            # 'empty' positionals from the current match
            while len(arg_counts)>1 and arg_counts[-1]==0:
                arg_counts = arg_counts[:-1]

    This change passes all of the existing test_argparse.py tests. It also passes the optparse tests that I added in http://bugs.python.org/issue9334#msg184987
    I added 4 cases to illustrate this change.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Apr 18, 2013

    Paul, your comments are interesting, but your proposed patch doesn't actually solve the problem.

    So here I am typing away at my command prompt, and I type in a couple optional parameters I know I'll need and start on the sequence of positional ones, and half way through I remember "oh, I need another optional paremeter" so I type it in next before I forget, and then go on with the positional ones.

    No prior Unix-style argument parsing mechanism that I have ever seen or heard of would be confused by that, but argparse is.

    This bug is about providing a facility in argparse that supports intermixing optional parameters into strings of positional parameters, just like all prior Unix-style argument parsing mechanisms, so that an application can be ported to use argparse without breaking command lines that their users have stored in command files. Otherwise argparse is not an upgrade path for an application, yet optparse has been deprecated.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Apr 18, 2013

    I should clarify, before someone jumps in: some particular applications do implement restrictions on order of optional and positional arguments; I'm aware of that. getopt easily supported application defined order restrictions, because it processed arguments sequentially, and the processing loop was user code. optparse, as has been pointed out, parses the optionals, and leaves a single list of positionals, combined from between all the optionals, for the user code to process in any manner, but would actually make it harder for user code to implement order restrictions. argparse goes the other way, taking over all the user parsing (which is a good thing), but not providing sufficient features to implement flexible mixing of optional and positional arguments.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Apr 19, 2013

    Glenn
    Take a look at http://bugs.python.org/issue15427

    I took a stab at changing the documentation, including recommending parse_known_args when porting optparse uses.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Apr 19, 2013

    Docs look good as mentioned there, for the current behavior, although it would be good to improve the behavior.

    Note that I have supplied a wrapper (t18a.py) (if it hasn't bit-rotted for 3.4, I'm still using 3.3) that provides the needed functionality. The problem is, that I have no clue how to modify the internals of argparse to allow it to simply be a method of the current argparse class. One could achieve the goal by renaming the current argparse class to _argparse, and renaming my wrapper class to be the "real" argparse, and that would work, but would seem to be inefficient.

    It would be nice if someone could move the needed functionality, a new API called parse_intermixed_args, already approved by msg166175, that does the same thing as my wrapper does, but without the wrapper class. This would be a cure to the problem, and it could be tested against my wrapper class by comparison to ensure the needed functionality is provided. I'd be glad to help with testing and understanding the requirements, but don't have time to figure out the internals of argparse at present.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Apr 22, 2013

    The attached file has a 'parse_intermixed_args()' that has the same API as 'parse_known_args()'.

    It follows the two parse step model

    args, remaining_args = optionals.parse_known_args()
    args, extras = positionals.parse_known_args(remaining_args, args)
    

    except that the 'optionals parser' is self with the positional arguments 'deactivated' by setting their nargs to 0. Similarly the 'positionals parser' is self with the optional arguments set to 'required=false'.

    Here it is in a standalone test module illustrating its functionality and limitations. I could provide a patch, but this form might easier to test in your own code.

    When used to run test_argparse.py, it had problems in the cases where the distinction between positionals and optionals is blurred.

    For example, PARSER and REMAINDER are supposed to grab everything that follows regardless of what it looks like. I choose to fall back on a single 'parse_know_args' call. Raising an error would the alternative.

    Similarly, a mutually exclusive group that includes a positional is difficult to handle. Again I fall back on the single step.

    So the two issues to be discussed are:

    • does it provide the desired freedom to mix optionals and positionals?

    • in the difficult cases, should it punt, or raise an error?

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Apr 23, 2013

    Very nice, Paul.

    I tested that with some of my applications, and some of my test cases. All of them initially failed, because you have parse_intermixed_args returning parameters like parse_known_args instead of like parse_args. Now I can understand that might be a little confusing in msg166175, but note that the implementation is "like" a call to parse_known_args followed by a call to parse_args... strongly implying that the return should be like parse_args.

    After tweaking your implementation in that regard, then I was able to get all the same applications and test cases to pass, although I haven't tried all my applications and all my test cases, as yet.

    Your techniques for disabling particular parameters are pretty clever.

    I think the difficult cases should raise an error.

    Firstly, parse_intermixed_args is intended to be for functional compatibility with optparse functionality, which doesn't support the difficult cases, therefore use of the difficult cases would require additional restrictions on the allowed order of options on the command line, beyond what optparse supports... this would be an application interface change, and as part of that interface change, should such happen, the flexibility of intermixing optionals and positionals can be restricted.

    Secondly, if an understanding of how to define the use parse_intermixed_args with one or more of the difficult cases is reached, replacing an error case with a functional case is possible, but replacing one silent functionality with a different one is a backwards compatibility problem. Throwing an error avoids limiting a future definition of these cases.

    The freedom of mixing optionals and positionals that would available in the now deprecated optparse does seem to be restored by this patch.

    I look forward to seeing a revised patch, this is a very promising solution to this bug.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Apr 23, 2013

    Yes, http://bugs.python.org/msg166175 does use 'parse_args' in the second call. But I think things would be more flexible if we had a second function:

    def parse_???(self, args=None, namespace=None):
        args, argv = self.parse_intermixed_args(args, namespace)
        if argv:
            msg = _('unrecognized arguments: %s')
            self.error(msg % ' '.join(argv))
        return args
    

    But then what would a be a good pair of names?

    parse??? and parse_intermixed_args
    versus
    parse_intermixed_args and parse_known_intermixed_args
    or
    something else?

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Apr 24, 2013

    Yes, a second function would give more flexibility.

    Due to the "approval" in msg166175 to use the name parse_intermixed_args for the functionality described there, it would probably be best to use that name for that functionality.

    So then we are left naming your current function something else. parse_known_intermixed_args certainly is descriptive, and fits the naming conventions of the other methods in the class. Quite long, unfortunately... but then I doubt it will get used much. I am using parse_intermixed_args regularly (via my wrapper class), and it is quite long enough.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 6, 2013

    This is a revision of the test_intermixed.py that I submitted earlier. Now parse_intermixed_args acts like parse_args', and calls parse_known_intermixed_args. Again it is form that can exercise the idea without modifying argparse.py`.

    If the parser has incompatible features (REMAINDER, PARSER, or certain exclusive groups), it raises an error. However to facilitate testing I included a _fallback backdoor. If not default None it will be called instead of raising the error.

    While making documentation changes, I got to wondering whether 'interspersed' would be a better term than 'intermixed'. optparse has an 'interspersed' option and api. However the getopt documentation does use 'intermixed'.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 6, 2013

    This is the formal patch corresponding to the test_intermixed.py. It includes changes to argparse.rst, plus tests in test_argparse.py. These tests are near the end, after those for parse_known_args. They are roughly equivalent to the examples in test_intermixed.py.
    -----------------
    The new documentation section is:

    Some users expect to freely intermix optional and positional argument strings. For example, optparse, by default, allows interspersed argument strings. GNU getopt() permutes the argument strings so non-options are at the end. The parse_intermixed_args() method emulates this behavior by first calling parse_known_args() with just the optional arguments being active. It is then called a second time to parse the list of remaining argument strings using the positional arguments.

    parse_intermixed_args() raises an error if the parser uses features that are incompatible with this two step parsing. These include subparsers, argparse.REMAINDER, and mutually exclusive groups that include both optionals and positionals.

    In this example, parse_known_args() returns an unparsed list of arguments [‘2’, ‘3’], while parse_intermixed_args() returns rest=[1, 2, 3].

        >>> parser = argparse.ArgumentParser()
        >>> parser.add_argument('--foo')
        >>> parser.add_argument('cmd')
        >>> parser.add_argument('rest', nargs='*', type=int)
        >>> parser.parse_known_args('cmd1 1 --foo bar 2 3'.split())
        (Namespace(cmd='cmd1', foo='bar', rest=[1]), ['2', '3']) 
        >>> parser.parse_intermixed_args('cmd1 1 --foo bar 2 3'.split())
        Namespace(cmd='cmd1', foo='bar', rest=[1, 2, 3])

    parse_known_intermixed_args() method, returns a two item tuple containing the populated namespace and the list of remaining argument strings. parse_intermixed_args() raises an error if there are any remaining unparsed argument strings.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented May 7, 2013

    Paul, thanks for your continued work.

    I had reworked your prior patch into a subclass of Argument Parser, and tweaking the code to get parse_intermixed_args to adjust the behaviors I had reported.

    Now substituting exactly your more flexible new code into my subclass from your latest test_intermixed.py (you should delete your old patches), I can quickly confirm that it works with my applications that used to use my wrapper class, and expect and use intermixed functionality.

    I also read through all your code and comments and it looks good to me.

    Regarding parse_fallback_args, I do not see documentation for it. If that is intentional, you might want to add comments in the code regarding its use for testing only... and might want to rename it to _parse_fallback_args. I personally don't see a lot of value to the function, or the new parameter; tests for parse_intermixed_args and parse_known_intermixed_args should be (and have been, thanks) added to the tests for argparse, and should suffice for testing. In non-test code, I see no benefit: either the user uses features that are incompatible with parse_intermixed_args, and thus uses the other features of argparse, or the user, for compatibility reasons, needs to use parse_intermixed_args, and thus is prevented from successfully using the incompatible features. If I'm missing some benefit of parse_fallback_args, it should be explained in either the documentation or the comments.

    Regarding the terminology: both intermixed and interspersed would be correct English words to describe the use case. So would intermingled :)

    Because Stephen "blessed" intermixed, and because it is used by getopt documentation (getopt has not been deprecated, optparse has), it seems to be the best term to use. Should optparse someday be removed, along with its documentation, the use of the term interspersed would also disappear, leaving more consistency in terminology.

    Alternative:

    Because optparse uses "interspersed" in an API, we cannot fix it to use "intermixed". However, we could fix the uses of "intermixed" to be "interspersed" prior to or at the time of accepting your patch to argparse... afterwards, it would be too late. Personally, I see no problem with the use of both terms in the documentation, and "intermixed" is the shortest, so I have a slight preference for that.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 10, 2013

    'parse_fallback_args()' function is only in the 'test_intermixed.py' file, not the patch. It should be in the 'if __name__' section of that file, along with the modified 'exit()' method, since it is part of these testing suit, not meant to be imported. 'test_intermixed.py' is more of an example and discussion tool, not a formal test.

    I added the '_fallback' optional argument because the normal exit from 'parse_args' using SystemExit is remarkably uninformative. It's hard to distinguish between the 'fallback' errors, and the ones generated by 'parse_known_args' (that have more to do with the argument strings). One is a programming error, the other a user generated error.

    It is possible to redefine ArgumentParser.error() so it gives more information, for example by raising an Exception(message). I have added to test_intermixed.py an alternative 'parse_fallback_args' that uses such a modified error rather than the _fallback option. But that ends up modifying the normal parsing error generation as well.

    I used the 'fallback' idea to test 'parse_intermixed_args' against the whole set 'test_argparse.py' tests. It would nice to have a way of doing that automatically anytime other features are added to 'parse_args'. But I can't think of a clean way of doing that.

    Regarding earlier versions of these files - I do not see a way of deleting them.

    I have attached a modified test_intermixed.py that has these changes. I also modified how 'parse_known_intermixed_args' restores the original value of self.usage, using an outer 'try:finally:' block. I need to make a note to myself to put that in the patch.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented May 10, 2013

    paul j3: Regarding earlier versions of these files - I do not see a way of deleting them.

    Click on edit, then there is an option to unlink. I don't know if they ever actually get deleted, but it clears out the clutter when looking for the latest version.

    Will check out the newest code shortly.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 10, 2013

    I should note one caveat:

    As a consequence of setting nargs to 0 for the first 'parse_know_args' step, all positional entries in the namespace get an empty list value ([]). This is produced by 'ArgumentParser._get_values'. With the builtin action classes this does not cause any problems.

    However a custom action class might have problems with this [] value.
    For example in 'test_argparse.py', TestActionUserDefined the PositionalAction class does check the values and throws an error with this [] value.

    The positional arguments are removed from the namespace before it is passed on to the 2nd 'parse_known_args', so these [] in the first don't affect the final namespace.

    I don't think anything about this should be added to main documentation, since it could confuse most readers. I might add a note of warning to the code itself.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented May 11, 2013

    OK, I've been running with the new code most the day, and it seems functional in my testing.

    I only "sort of" follow your discussion about the "custom action class" caveat, probably because I haven't used "custom action classes"... I tried once, but failed to achieve my goal, as it was more ambitious than they presently support. If the [] value is significantly problematical in some manner, could positional nargs be set to a sentinal value that would avoid the assignment of the [] value? I realize that would require code changes in some other function or functions, in addition to the added new functions, so that would make the patch a bit more intrusive.

    If _fallback helps some folks with understanding errors clearly, I won't object to it. I guess it would only be programmers that would be confused, because they would be the ones interpreting the errors... and with adequate testing, should fix the programming errors before the users get a chance to be confused.

    So maybe your next .patch will be ready to ship!

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 30, 2013

    This is a refinement of the patch with Message188609.

    In parse_known_intermixed_args, the temporary capture of formatted usage has been put in a try/finally structure.

    Positionals are now 'deactivated' with

        action.nargs = SUPPRESS
        action.default = SUPPRESS

    To use this, a 'nargs==SUPPRESS' case has been added to the relevant methods. In _get_args_pattern() it acts just like 'nargs=0'. In '_get_values()' it returns 'value=SUPPRESS'. The net effect is that, in take_action(), 'action' is not invoked, and that positional is not added to the namespace.

    Previously I used nargs=0, which put a [] value in the namespace, which then had to be deleted.

    I have not added anything about this SUPPRESS option to the documentation (PARSER isn't there either).

    When the parser uses incompatible features (e.g. REMAINDER), this now raises a TypeError. The effect is similar to giving add_argument incompatible definitions. The 'fallback' that I used earlier can be implemented with a simple 'try/except TypeError'. Other parsing errors go through the usual ArgumentParser.error() method.

    test_argparse.py has been changed to handle this TypeError.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented May 30, 2013

    These sound like good refinements. You've been thinking. By making the fallback happen externally, it simplifies the implementation of parse_intermixed_args, and forces the application to accept responsibility for calling it with a consistent set of arguments, or calling something else. I like that. I don't really see the fallback as a particularly useful feature, so pushing it outside the stdlib, yet still making it simple to implement for any that do find it to be useful, seems like a good tradeoff.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Aug 16, 2013

    Paul, is this ready to merge, or are you thinking of more refinements?

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Aug 20, 2013

    It's everything I intend to add. Now I'm just waiting for a committer to act, either with suggested changes, or a merge. I'm watching more than a dozen argparse patches.

    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Aug 21, 2013

    So I read over your code again, and even read the documentation this time, and it all looks good, and I know it works good because I've been using the code. I tried to send a notice through Reitveld, and maybe did, but I don't know where it went, so I'll say this much here, too.

    @vadmium
    Copy link
    Member

    vadmium commented Aug 24, 2013

    It sounds like this bug might cover bpo-15112, which is only concerned with options between different positional parameters.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Aug 25, 2013

    Above in
    http://bugs.python.org/issue14191#msg187051
    I proposed a patch that is quite close to bethard's patch in http://bugs.python.org/issue15112#msg166173

    Both modify the same place, doing the same (pop items off arg_counts). The logic is a little different. I'd have to look at them more carefully to see whether one is more robust. Thanks for linking that issue.

    The main issue here is different, allowing for complete intermixing of optionals and positionals. Over at 15112 the issue is intermixing of optionals and 'whole' positionals. My 187051 patch belongs over there.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented May 3, 2014

    I encountered a conflict when merging this patch with http://bugs.python.org/issue15112. In my first testcase, 'cmd' and 'rest' failed the 'required' test in phase one of 'intermixed'. That's because 15112 postponed parsing them (with nargs=0/suppressed).

    I got around that by temporarily setting the 'required' attribute to False.

    The whole issue of when (or even whether) a positional that is satisfied with 0 arguments, is consumed, is a bit messy.

    @bitdancer
    Copy link
    Member

    I've turned intermixed.patch into a PR. I tweaked the documentation so that it does not refer to the details of the implementation. I tweaked the implementation to have the 'try' start before the code that modifies the state, and did the line wrapping to <80 columns.

    I have one concern about this patch: parse_intermixed_args doesn't support all of the argparse features. It feels to me like it is worthwhile to add anyway, and maybe someone will figure out how to support more features later. But I'd like at least one other committer to concur :)

    Given concurrence I'll add news and what's new entries.

    paul.j3: I'm trying to review argparse patches during this core sprint week, so if you have anything in particular you want me to focus on, please let me know.

    @bitdancer bitdancer added 3.7 (EOL) end of life type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Sep 4, 2017
    @vpython
    Copy link
    Mannequin Author

    vpython mannequin commented Sep 5, 2017

    I would dearly love to discard my private copies of argparse subclasses to implement this that I have been using for the last 5 years. Please, some other committer, concur here!

    @bitdancer
    Copy link
    Member

    I got an offline agreement from Zach Ware, and nobody here at the sprint has objected (though I don't know if anyone else looked), so I'll go ahead and finish the PR.

    @bitdancer
    Copy link
    Member

    New changeset 0f6b9d2 by R. David Murray in branch 'master':
    bpo-14191 Add parse_intermixed_args. (bpo-3319)
    0f6b9d2

    @bitdancer
    Copy link
    Member

    Thanks Paul. By the way, if you want your "real name" in What's New, just let me know what it is and I'll make the change.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants