classification
Title: argparse removing more "--" than it should
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, jol, paul.j3
Priority: normal Keywords:

Created on 2019-07-05 20:45 by jol, last changed 2019-07-07 19:16 by paul.j3.

Messages (9)
msg347378 - Author: Jorge L. Martinez (jol) Date: 2019-07-05 20:45
$ python -c '
import argparse
p = argparse.ArgumentParser()
p.add_argument("first_arg")
p.add_argument("args", nargs="*")
r = p.parse_args(["foo", "--", "bar", "--", "baz", "--", "zap"])
print(r.first_arg + " " + " ".join(r.args))
'

returns:

foo bar baz -- zap

when I think it should return:

foo bar -- baz -- zap
msg347379 - Author: Jorge L. Martinez (jol) Date: 2019-07-05 20:51
Sorry, I forgot to add details on my machine.

Python: 3.7.3
OS: Archlinux
msg347407 - Author: Jorge L. Martinez (jol) Date: 2019-07-06 00:08
To be clear, my opinion is that a single call of parse_args() should only ever remove the first "--". Right now, it seems to remove the first "--" of each argument group, with the groups apparently determined by nargs.
msg347413 - Author: paul j3 (paul.j3) * (Python triager) Date: 2019-07-06 01:21
There are earlier bug/issues about the '--'.  

Also look at the parser code itself.  Keep in mind that parsing is done in two passes - once to identify flags versus arguments ('O/A') and then to allocate strings to arguments.  I don't recall when '--' is being handled, possibly in both.
msg347434 - Author: Jorge L. Martinez (jol) Date: 2019-07-06 15:58
> There are earlier bug/issues about the '--'.

Yes, there are:

https://bugs.python.org/issue9571
https://bugs.python.org/issue22223
https://bugs.python.org/issue14364

But this one seems separate. Though they're related, they don't seem like duplicates, so that's why I thought I'd make this one.

> Also look at the parser code itself.  Keep in mind that parsing is done in two passes - once to identify flags versus arguments ('O/A') and then to allocate strings to arguments. I don't recall when '--' is being handled, possibly in both.

I'm not sure what your point is here. I did take a quick look at the code, yesterday. I think the identification part you mention is done right. It marks 'O/A' as you mention. Then, when it sees "--", it marks it "-", and the rest of the arguments are marked as "A".
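
For illustration, the marking described above can be modeled roughly as follows. This is a simplified sketch, not argparse's actual code: `classify` is a hypothetical helper, and the "starts with a dash" test is an assumption that ignores details such as negative numbers and registered option strings.

```python
def classify(arg_strings):
    """Return a pattern string: 'O' for options, 'A' for arguments,
    and '-' for the first '--' (everything after it is marked 'A')."""
    pattern = []
    seen_separator = False
    for s in arg_strings:
        if seen_separator:
            # After the first '--', every string is a plain argument.
            pattern.append("A")
        elif s == "--":
            pattern.append("-")
            seen_separator = True
        elif s.startswith("-") and len(s) > 1:
            # Assumption: anything that looks like a flag is an option.
            pattern.append("O")
        else:
            pattern.append("A")
    return "".join(pattern)


print(classify(["foo", "--", "bar", "--", "baz", "--", "zap"]))  # A-AAAAA
```

Only the first "--" gets the "-" mark; the later ones are ordinary "A" entries, which is why this pass looks correct.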

I didn't look at the start of the code of the second pass or how it is concretely linked to the first pass. However, I did see that in the example I gave, _get_values() in argparse.py gets called twice (apparently once per argument group as determined by nargs, I guess) to remove the "--" present in arg_strings. The first time, arg_strings is ["foo", "--"] and the second time it's ["bar", "--", "baz", "--", "zap"].
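
The effect of those two calls can be reproduced with a toy model. This is only a sketch under the assumption that each call drops the first "--" from the strings it is handed; `get_values_model` is a hypothetical stand-in, not the real `_get_values()`.

```python
def get_values_model(arg_strings):
    """Toy stand-in for the per-group step: drop the first '--', if any."""
    values = list(arg_strings)
    if "--" in values:
        values.remove("--")  # list.remove() deletes only the first match
    return values


# The two groups of strings observed in the example from this report.
groups = [["foo", "--"], ["bar", "--", "baz", "--", "zap"]]
result = [get_values_model(g) for g in groups]
print(result)  # [['foo'], ['bar', 'baz', '--', 'zap']]
```

Joining the two results gives "foo bar baz -- zap", which matches the output in the original report: one "--" is removed per group rather than one per parse.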

So, that's what happens, and where part of the fix should probably be. I don't think the removal of "--" should happen in a function that gets called multiple times. Though I didn't spend the time to see where the code should be positioned, I can only imagine the correct behavior would be to remove the argument marked as "-" by the first pass mentioned.
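
As a rough sketch of that idea, removal could be keyed on the first-pass mark instead of on the string value. The helper `drop_marked_separator` is hypothetical, and the pattern string "A-AAAAA" is assumed to be what the first pass produces for the example in this report; none of this is argparse's actual code.

```python
def drop_marked_separator(arg_strings, pattern):
    """Drop only the string whose first-pass mark is '-'."""
    index = pattern.find("-")
    if index == -1:
        # No separator was marked, so nothing is removed.
        return list(arg_strings)
    return arg_strings[:index] + arg_strings[index + 1:]


args = ["foo", "--", "bar", "--", "baz", "--", "zap"]
remaining = drop_marked_separator(args, "A-AAAAA")
print(remaining)  # ['foo', 'bar', '--', 'baz', '--', 'zap']
```

Because the removal happens once, before the strings are split between positionals, the later "--" strings survive as ordinary values.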

I didn't mention this yesterday, because I figured there wouldn't be much value in sharing incomplete research like this, as opposed to a patch. I didn't want to influence the work of whoever chose to invest time in this for a proper fix.
msg347435 - Author: Jorge L. Martinez (jol) Date: 2019-07-06 16:15
> to remove the "--" present in arg_strings

*to remove the first "--" present...
msg347437 - Author: paul j3 (paul.j3) * (Python triager) Date: 2019-07-06 18:10
I looked at this issue way back, in 2013:

https://bugs.python.org/issue13922

I probably shouldn't have tacked this on to a closed issue.
msg347442 - Author: Jorge L. Martinez (jol) Date: 2019-07-06 19:38
Maybe I can find the time to make a patch this weekend (either today or tomorrow). I hope I'm not underestimating this somehow, but I don't think this would take too long. The only issue I can foresee is disagreement about what the correct behavior should be, which is why I gave my opinion that a single call of parse_args() should only ever remove a single "--".

If I don't submit a patch by Monday (PDT), everyone should assume I decided not to tackle this.

By the way, does this issue tracking platform support submitting to the issue thread by email? Maybe I'll try that.
msg347478 - Author: paul j3 (paul.j3) * (Python triager) Date: 2019-07-07 19:16
The patch at

https://bugs.python.org/file29845/dbldash.patch

while written against an earlier version of `argparse`, does what you want.  I moved the '--' removal out of `_get_values` and into `consume_positionals`.

Note that a full patch should include test cases and documentation changes if any.  

Also, GitHub pull requests are now the preferred patching route.  But don't expect a rapid response on this.  The issue has been around for a long time without causing too many complaints.

Backward compatibility is always a concern when making changes to core functionality.  We don't want to mess with someone's working code.  Though I doubt there are many users who count on multiple '--' removals, one way or the other.  The argparse docs explicitly call this a 'pseudo-argument'.
History
Date                 User        Action  Args
2019-07-07 19:16:53  paul.j3     set     messages: + msg347478
2019-07-06 19:38:34  jol         set     messages: + msg347442
2019-07-06 18:10:07  paul.j3     set     messages: + msg347437
2019-07-06 16:15:19  jol         set     messages: + msg347435
2019-07-06 15:58:15  jol         set     messages: + msg347434
2019-07-06 01:21:40  paul.j3     set     messages: + msg347413
2019-07-06 01:11:25  xtreak      set     nosy: + paul.j3
2019-07-06 00:08:30  jol         set     messages: + msg347407
2019-07-05 22:28:32  eric.smith  set     nosy: + eric.smith
2019-07-05 20:51:59  jol         set     messages: + msg347379
2019-07-05 20:45:30  jol         create