This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Expected behavior of argparse given quoted strings
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, paul.j3, rhettinger, vegarsti
Priority: normal Keywords:

Created on 2020-08-20 11:24 by vegarsti, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (9)
msg375702 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 11:24
I'm not sure if this is a bug, but I had a problem when I was trying to use argparse recently, and I was wondering about the expected behavior.

For context: We invoke a Python program from a deployment tool, where we provide input in a text box. We were using argparse to read and parse the input arguments. In our scenario, two named arguments were required, as illustrated in the minimal example below.

```
# a.py

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--a", required=True)
parser.add_argument("--b", required=True)
parser.parse_args()
```

When invoking this program from this deployment tool giving `--a=1 --b=2` as input, we got the error message `a.py: error: the following arguments are required: --a, --b`.

As it turns out, the input was provided in the same way as if you had given the program a quoted string in the shell:

```
$ python a.py "--a=1 --b=2"
usage: a.py [-h] --a A --b B
a.py: error: the following arguments are required: --a, --b
```

When given a quoted string like this, `sys.argv` only has two elements, namely `a.py` and `--a=1 --b=2`. This was new to me! But it makes sense.
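
To illustrate the point about `sys.argv`, here is a small sketch that uses `shlex.split()` as a stand-in for the shell's word splitting. This is an approximation of POSIX shell quoting rules, not the shell itself, and the variable names are only for illustration.

```python
import shlex

# shlex.split() tokenizes a command line roughly the way a POSIX shell would.
quoted = shlex.split('python a.py "--a=1 --b=2"')
unquoted = shlex.split('python a.py --a=1 --b=2')

# Drop the interpreter name: sys.argv starts at the script path.
quoted_argv = quoted[1:]
unquoted_argv = unquoted[1:]

print(quoted_argv)    # ['a.py', '--a=1 --b=2']
print(unquoted_argv)  # ['a.py', '--a=1', '--b=2']
```

The quotes keep the whole string together as one argument, so argparse never sees `--a=1` and `--b=2` as separate items.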

This was a bit annoying! One way to get around it, which we did indeed implement, is to mutate `sys.argv`, effectively unpacking the input string such that `sys.argv` ends up as `["a.py", "--a=1", "--b=2"]`.

Given that the string contains named arguments, it seems to me that it could be possible, and safe, to unpack this quoted string. Would that make sense? Or am I using it incorrectly? Or is there some other way to provide input such that I don't have to do this hack that I mentioned?

If we make a similar program where the arguments `a` and `b` are not named arguments, but rather positional arguments,

```
# b.py

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("a")
parser.add_argument("b")
parser.parse_args()
```

and we call the program as before with `python b.py "1 2"`, then `a` will be set to the string `1 2`, whereas `b` will not be set (and so the program will, of course, exit). This seems entirely reasonable. And perhaps it isn't possible to both get this behaviour, as well as the behaviour that I mentioned above.
msg375705 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 11:33
For what it's worth, I'd love to work on this if it's something that could be nice to have.
msg375706 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 11:56
It seems that I mixed something up in the post here. If the quoted string is `"--a=1 --b=2"`, as I said in the post, then the program will only complain about `b` missing. In this case, it sets `a` to `1 --b=2`. Whereas if the quoted string is `"--a 1 --b 2"` (i.e. a space rather than `=` separates each option from its value), then it will say that both `a` and `b` are missing.
msg375707 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 12:00
In fact, what happens in the latter case (i.e. `"--a 1 --b 2"`), inside the call to `_parse_optional`, is that it fails to get the optional tuple. And so it continues to this line in argparse.py: 
https://github.com/python/cpython/blob/2ce39631f679e14132a54dc90ce764259d26e166/Lib/argparse.py#L2227

Here it says that if there's a space in the string, it was meant to be a positional, and so the function returns `None`, causing it to not find the argument.

In conclusion, it seems to me that argparse is not, in fact, meant to handle quoted strings, or rather, strings where there are spaces.
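
The two cases described in the last two messages can be reproduced with a sketch like the following. `ParserError`, `RaisingParser`, `build_parser` and `try_parse` are hypothetical names introduced here; the only argparse feature relied on is that `ArgumentParser.error()` can be overridden so that the message is raised instead of the process exiting.

```python
import argparse


class ParserError(Exception):
    """Raised instead of exiting, so the error message can be inspected."""


class RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        # The default implementation prints usage and calls sys.exit(2).
        raise ParserError(message)


def build_parser():
    parser = RaisingParser(prog="a.py")
    parser.add_argument("--a", required=True)
    parser.add_argument("--b", required=True)
    return parser


def try_parse(argv):
    """Return the parsed namespace, or the error message as a string."""
    try:
        return build_parser().parse_args(argv)
    except ParserError as exc:
        return str(exc)


# One argument containing '=': split at the first '=', so --a gets "1 --b=2".
equals_case = try_parse(["--a=1 --b=2"])
# One argument containing spaces but no '=': treated as a positional.
space_case = try_parse(["--a 1 --b 2"])

print(equals_case)  # complains only about --b
print(space_case)   # complains about both --a and --b
```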
msg375710 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-08-20 13:44
This is all working as designed. We do not want to modify argparse to split parameters.

You probably want to split the input with shlex.split(). See https://stackoverflow.com/questions/44945815/how-to-split-a-string-into-command-line-arguments-like-the-shell-in-python

You shouldn't need to mutate sys.argv. You can break the input up into multiple strings with shlex.split() (or whatever you decide to use) and pass those to ArgumentParser.parse_args().
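
A minimal sketch of this suggestion, assuming the calling tool hands the program everything as one string. The helper name `parse_single_string` is hypothetical; only `shlex.split()` and `ArgumentParser.parse_args()` come from the standard library.

```python
import argparse
import shlex


def parse_single_string(raw, parser):
    """Split one shell-style string into arguments and parse them.

    Hypothetical helper for illustration only.
    """
    return parser.parse_args(shlex.split(raw))


parser = argparse.ArgumentParser(prog="a.py")
parser.add_argument("--a", required=True)
parser.add_argument("--b", required=True)

# The single string a tool might pass instead of separate arguments.
args = parse_single_string("--a=1 --b=2", parser)
print(args.a, args.b)  # 1 2
```

Because the list is passed to `parse_args()` directly, `sys.argv` is left untouched. In a real script the raw string would presumably come from `sys.argv[1]`.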
msg375711 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 13:54
I see! Thanks, had not heard about shlex. I also had not realized `parse_args` takes arguments. Doh. That makes sense.

Thanks a lot!
msg375715 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2020-08-20 15:26
I'd say the problem is with the deployment tool.  Inputs like that should be split regardless of who's doing the commandline parsing.  With normal shell input, quotes are used to prevent splitting, or to otherwise prevent substitutions and special character handling.
msg375716 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-08-20 15:28
Completely agree with paul j3. The calling tool is breaking the "argv" conventions. If the OP can control the calling tool, it should be fixed there.
msg375717 - (view) Author: Vegard Stikbakke (vegarsti) * Date: 2020-08-20 15:46
Great idea, thanks! It's open source, so I'll see if I can fix it.

History

| Date | User | Action | Args |
| --- | --- | --- | --- |
| 2022-04-11 14:59:34 | admin | set | github: 85766 |
| 2020-08-20 15:46:18 | vegarsti | set | messages: + msg375717 |
| 2020-08-20 15:28:26 | eric.smith | set | messages: + msg375716 |
| 2020-08-20 15:26:09 | paul.j3 | set | messages: + msg375715 |
| 2020-08-20 13:56:57 | eric.smith | set | status: open -> closed; type: enhancement -> behavior; resolution: not a bug; stage: resolved |
| 2020-08-20 13:54:57 | vegarsti | set | messages: + msg375711 |
| 2020-08-20 13:44:24 | eric.smith | set | nosy: + eric.smith; messages: + msg375710 |
| 2020-08-20 12:33:32 | xtreak | set | nosy: + rhettinger, paul.j3 |
| 2020-08-20 12:00:44 | vegarsti | set | messages: + msg375707 |
| 2020-08-20 11:56:07 | vegarsti | set | messages: + msg375706 |
| 2020-08-20 11:33:40 | vegarsti | set | messages: + msg375705 |
| 2020-08-20 11:24:14 | vegarsti | create | |