Issue 47043: Argparse can't parse subparsers with parse_known_args

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/91199

classification

Title:	Argparse can't parse subparsers with parse_known_args
Type:	behavior	Stage:
Components:	Parser	Versions:	Python 3.8

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	lys.nikolaou, rive-n
Priority:	normal	Keywords:

Created on 2022-03-17 13:49 by rive-n, last changed 2022-04-11 14:59 by admin.

Messages (4)
msg415412 - (view)	Author: rive_n (rive-n)	Date: 2022-03-17 13:49
Let's say we have the default parser configuration from the docs (https://docs.python.org/3/library/argparse.html#other-utilities), like this one:

```python3
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', action='store_true', help='foo help')
subparsers = parser.add_subparsers(help='sub-command help')

# create the parser for the "a" command
parser_a = subparsers.add_parser('a', help='a help')
parser_a.add_argument('-a', help='bar help')

# create the parser for the "b" command
parser_b = subparsers.add_parser('b', help='b help')
parser_b.add_argument('-b', help='baz help')
```

I want to parse the arguments of all subparsers, and for this purpose I tried the `parse_known_args` method. But for some reason there is no way to get all of the arguments given on the command line or passed as a list:

`print(parser.parse_known_args(['a', '-a', '12', 'b', '-b', '32']))`

This does not print both the 'a' and the 'b' data. The namespace only contains the first valid subcommand's arguments. This is pretty strange behavior. Why can't I just get the results of all the actions (subparsers)?
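To make the report concrete, here is a minimal sketch that runs the reporter's configuration against the stock `argparse` module and looks at both values returned by `parse_known_args`. The variable names `namespace` and `extras` are mine, not the reporter's. As far as I can tell, the tokens for the second subcommand are not dropped: they should come back in the list of unrecognized arguments, because everything after `a` is handed to the `a` subparser, which does not know `b` or `-b`.

```python
import argparse

parser = argparse.ArgumentParser(prog='PROG')
parser.add_argument('--foo', action='store_true', help='foo help')
subparsers = parser.add_subparsers(help='sub-command help')

# create the parser for the "a" command
parser_a = subparsers.add_parser('a', help='a help')
parser_a.add_argument('-a', help='bar help')

# create the parser for the "b" command
parser_b = subparsers.add_parser('b', help='b help')
parser_b.add_argument('-b', help='baz help')

# parse_known_args returns a (namespace, leftover_tokens) pair
namespace, extras = parser.parse_known_args(['a', '-a', '12', 'b', '-b', '32'])

print(namespace)  # holds the results for the "a" subcommand only
print(extras)     # tokens that the "a" subparser did not recognize
```

So the second element of the returned tuple is where the `b` part of the command line ends up.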
msg415414 - (view)	Author: rive_n (rive-n)	Date: 2022-03-17 15:04
I've found two solutions:

1. The easy way (and the crudest):

```python3
args, unk_args = parser.parse_known_args(['a', '-a', '12', 'b', '-b', '32'])
args, unk_args = parser.parse_known_args(unk_args)
```

Simply call `parse_known_args` again on `unk_args`.

2. The hard way: fork the sources (or just fix this upstream). `parse_known_args`, lines 1799-1807:

```python3
try:
    namespace, args = self._parse_known_args(args, namespace)
    if hasattr(namespace, _UNRECOGNIZED_ARGS_ATTR):
        args.extend(getattr(namespace, _UNRECOGNIZED_ARGS_ATTR))
        delattr(namespace, _UNRECOGNIZED_ARGS_ATTR)
    return namespace, args
except ArgumentError:
    err = _sys.exc_info()[1]
    self.error(str(err))
```

There is only one call to `self._parse_known_args` here, rather than one call per subparser given on the command line.
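The "easy way" above rebinds `args` on the second call, so the results of the first call are lost. A possible variation, sketched here under my own assumptions, is to pass the same namespace back in and loop until nothing is left over. `build_parser` and `parse_chained` are hypothetical helper names, not part of `argparse`.

```python
import argparse


def build_parser():
    """Build the parser from the report (hypothetical helper for this sketch)."""
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--foo', action='store_true', help='foo help')
    subparsers = parser.add_subparsers(help='sub-command help')
    parser_a = subparsers.add_parser('a', help='a help')
    parser_a.add_argument('-a', help='bar help')
    parser_b = subparsers.add_parser('b', help='b help')
    parser_b.add_argument('-b', help='baz help')
    return parser


def parse_chained(parser, argv):
    """Call parse_known_args repeatedly, feeding the leftovers back in.

    The same namespace object is reused, so results accumulate instead
    of being overwritten by each call.
    """
    namespace = argparse.Namespace()
    remaining = list(argv)
    while remaining:
        namespace, leftover = parser.parse_known_args(remaining, namespace)
        made_progress = len(leftover) < len(remaining)
        remaining = leftover
        if not made_progress:
            # Nothing was consumed; stop rather than loop forever.
            break
    return namespace, remaining


namespace, remaining = parse_chained(
    build_parser(), ['a', '-a', '12', 'b', '-b', '32'])
print(namespace, remaining)
```

Two caveats with this sketch: leftovers that do not start with a valid subcommand name make the parser exit with its usual error, and repeating the same subcommand can reset attributes set by an earlier occurrence, because each subparser run writes its defaults into the shared namespace.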
msg415490 - (view)	Author: rive_n (rive-n)	Date: 2022-03-18 12:28
Hi again. The previous solution with:

```python3
try:
    namespace, args = self._parse_known_args(args, namespace)
    if hasattr(namespace, _UNRECOGNIZED_ARGS_ATTR):
        args.extend(getattr(namespace, _UNRECOGNIZED_ARGS_ATTR))
        delattr(namespace, _UNRECOGNIZED_ARGS_ATTR)
    return namespace, args
except ArgumentError:
    err = _sys.exc_info()[1]
    self.error(str(err))
```

does not work at all. So I spent some time (a lot of time) making changes in the source code. I figured out exactly how the algorithm works; I read about 3,000 lines of code. And here's what I came up with:

argparse.py, line 1774: `parse_known_args`. Basically, this function calls the protected one on line 1800: `namespace, args = self._parse_known_args(args, namespace)`. That one does the magic. But we need to look at line 1856: `def take_action(action, argument_strings, option_string=None):`. This function invokes the action objects of the relevant classes, for example `_SubParsersAction` (the problem is in it).

Before fix:

```python3
def __call__(self, parser, namespace, values, option_string=None):
    parser_name = values[0]
    arg_strings = values[1:]

    # set the parser name if requested
    if self.dest is not SUPPRESS:
        setattr(namespace, self.dest, parser_name)

    # select the parser
    try:
        parser = self._name_parser_map[parser_name]
    except KeyError:
        args = {'parser_name': parser_name,
                'choices': ', '.join(self._name_parser_map)}
        msg = _('unknown parser %(parser_name)r (choices: %(choices)s)') % args
        raise ArgumentError(self, msg)

    # parse all the remaining options into the namespace
    # store any unrecognized options on the object, so that the top
    # level parser can decide what to do with them

    # In case this subparser defines new defaults, we parse them
    # in a new namespace object and then update the original
    # namespace for the relevant parts.
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
    for key, value in vars(subnamespace).items():
        setattr(namespace, key, value)

    if arg_strings:
        vars(namespace).setdefault(_UNRECOGNIZED_ARGS_ATTR, [])
        getattr(namespace, _UNRECOGNIZED_ARGS_ATTR).extend(arg_strings)
```

After fix:

```python3
def __call__(self, parser, namespace, values, option_string=None, arg_strings_pattern: list = None):
    o_amount = arg_strings_pattern.count("O")
    if not o_amount:
        raise ValueError("No Os found")
    o_start, o_stop, indexes = arg_strings_pattern.index('O'), len(arg_strings_pattern), []
    print(parser)  # debug output
    # collect the index of every 'O' (option) in the pattern
    try:
        while arg_strings_pattern.index('O', o_start, o_stop):
            indexes.append(arg_strings_pattern.index('O', o_start, o_stop))
            o_start = arg_strings_pattern.index('O', o_start + 1, o_stop)
    except ValueError:
        pass

    for parser in range(o_amount):
        parser_name = values[indexes[parser] - 1]  # indexes[parser] could give int (real index)
        arg_strings = values[indexes[parser]: indexes[(parser + 1)] - 1] if parser < len(indexes) - 1 else \
            values[indexes[parser]:]  # could give all data

        # set the parser name if requested
        if self.dest is not SUPPRESS:
            setattr(namespace, self.dest, parser_name)

        # select the parser
        try:
            parser = self._name_parser_map[parser_name]
        except KeyError:
            args = {'parser_name': parser_name,
                    'choices': ', '.join(self._name_parser_map)}
            msg = _('unknown parser %(parser_name)r (choices: %(choices)s)') % args
            raise ArgumentError(self, msg)

        # parse all the remaining options into the namespace
        # store any unrecognized options on the object, so that the top
        # level parser can decide what to do with them

        # In case this subparser defines new defaults, we parse them
        # in a new namespace object and then update the original
        # namespace for the relevant parts.
        subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
        for key, value in vars(subnamespace).items():
            setattr(namespace, key, value)

        if arg_strings:
            vars(namespace).setdefault(_UNRECOGNIZED_ARGS_ATTR, [])
            getattr(namespace, _UNRECOGNIZED_ARGS_ATTR).extend(arg_strings)
```

That's not the best solution, but it works.
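The patched `__call__` depends on a pattern string in which `'O'` marks an option-like token and `'A'` marks any other argument, and it takes the token just before each `'O'` as a subcommand name. The sketch below imitates that index logic outside of `argparse` so it can be inspected in isolation. `simple_pattern` is a hypothetical, much cruder classifier than the one inside `argparse` (which also handles negative numbers, `--`, and registered prefixes), so treat it as an illustration of the idea only.

```python
def simple_pattern(tokens):
    """Rough stand-in for argparse's internal pattern string.

    'O' for tokens that look like options, 'A' for everything else.
    """
    return ''.join(
        'O' if token.startswith('-') and len(token) > 1 else 'A'
        for token in tokens
    )


tokens = ['a', '-a', '12', 'b', '-b', '32']
pattern = simple_pattern(tokens)

# Positions of every option-like token.
option_indexes = [i for i, kind in enumerate(pattern) if kind == 'O']

# Treat the token before each option as a subcommand name and take
# everything up to the next subcommand name as its arguments.
chunks = []
for n, index in enumerate(option_indexes):
    name = tokens[index - 1]
    if n + 1 < len(option_indexes):
        end = option_indexes[n + 1] - 1
    else:
        end = len(tokens)
    chunks.append((name, tokens[index:end]))

print(pattern)
print(chunks)
```

In this sketch the split is only right when every option directly follows a subcommand name. A subcommand followed by two options, such as `b -b 1 -q 2`, would be split incorrectly, since the value `1` would be taken for a subcommand name.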
msg415751 - (view)	Author: rive_n (rive-n)	Date: 2022-03-22 08:42
Long time no updates here. Another fix: in the previous version, more than one argument could not be parsed. The fix (finally with unit tests):

```python3
def __call__(self, parser, namespace, values, option_string=None, arg_strings_pattern: list = None):
    o_amount = arg_strings_pattern.count("O")
    if not o_amount:
        raise ValueError("No Os found")
    o_start, o_stop, indexes = arg_strings_pattern.index('O'), len(arg_strings_pattern), []
    print(parser)  # debug output
    # collect the index of every 'O' (option) in the pattern
    try:
        while arg_strings_pattern.index('O', o_start, o_stop):
            indexes.append(arg_strings_pattern.index('O', o_start, o_stop))
            o_start = arg_strings_pattern.index('O', o_start + 1, o_stop)
    except ValueError:
        pass

    # merge the arguments of repeated subcommands under one parser name
    used_indexes = []
    known_args = {}
    for i, index in enumerate(indexes):
        parser_name = values[index - 1]
        if not known_args.get(parser_name):
            known_args[parser_name] = []
        known_args[parser_name] += values[index: indexes[i + 1] - 1] if i + 1 < len(indexes) else values[index:]
        if index not in used_indexes:
            for s, subindex in enumerate(indexes[1:]):
                subparser_name = values[subindex - 1]
                if parser_name == subparser_name:
                    used_indexes.append(index)
                    used_indexes.append(subindex)
                    subparser_args = values[subindex: indexes[s + 2] - 1] if s + 2 < len(indexes) else values[subindex:]
                    known_args[parser_name] += subparser_args

    for parser_name, args in known_args.items():
        self._create_parser(namespace, parser_name, args)

def _create_parser(self, namespace, parser_name, arg_strings):
    # set the parser name if requested
    if self.dest is not SUPPRESS:
        setattr(namespace, self.dest, parser_name)

    # select the parser
    try:
        parser = self._name_parser_map[parser_name]
    except KeyError:
        args = {'parser_name': parser_name,
                'choices': ', '.join(self._name_parser_map)}
        msg = _('unknown parser %(parser_name)r (choices: %(choices)s)') % args
        raise ArgumentError(self, msg)

    # parse all the remaining options into the namespace
    # store any unrecognized options on the object, so that the top
    # level parser can decide what to do with them

    # In case this subparser defines new defaults, we parse them
    # in a new namespace object and then update the original
    # namespace for the relevant parts.
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
    for key, value in vars(subnamespace).items():
        setattr(namespace, key, value)

    if arg_strings:
        vars(namespace).setdefault(_UNRECOGNIZED_ARGS_ATTR, [])
        getattr(namespace, _UNRECOGNIZED_ARGS_ATTR).extend(arg_strings)
```

Unit tests:

```python3
import unittest

import argfork as argparse
from argfork import Namespace


class argparseTest(unittest.TestCase):

    def setUp(self) -> None:
        self.parser = argparse.ArgumentParser(prog='PROG')
        subparsers = self.parser.add_subparsers(help='sub-command help')

        # create the parser for the "a" command
        parser_a = subparsers.add_parser('a', help='a help')
        parser_a.add_argument('-a', help='bar help')

        # create the parser for the "b" command
        parser_b = subparsers.add_parser('b', help='b help')
        parser_b.add_argument('-b', help='baz help')
        parser_b.add_argument('-q', help='baz help')

        # create the parser for the "c" command
        parser_b = subparsers.add_parser('c', help='b help')
        parser_b.add_argument('-c', help='baz help')
        parser_b.add_argument('-k', help='baz help')

        # create the parser for the "d" command
        parser_b = subparsers.add_parser('d', help='b help')
        parser_b.add_argument('-d', help='baz help')
        parser_b.add_argument('-D', help='baz help')
        parser_b.add_argument('-R', help='baz help')

    def testSimple(self):
        case = ['a', '-a', 'test']
        res_obj = Namespace(a='test').__dict__
        rest_obj = self.parser.parse_known_args(case)[0].__dict__

        res_k, res_v = res_obj.keys(), list(res_obj.values())
        test_k, test_v = rest_obj.keys(), list(rest_obj.values())

        self.assertEqual(res_v, test_v)
        self.assertEqual(res_k, test_k)

    def testMany(self):
        case = ['d', '-d', '1234', 'd', '-D', '12345', 'd', '-R', '1',
                'c', '-c', '123', 'c', '-k', '555', 'b', '-q', 'test']
        res_obj = Namespace(d='1234', D='12345', R='1', c='123', k='555', b=None, q='test').__dict__
        rest_obj = self.parser.parse_known_args(case)[0].__dict__

        res_k, res_v = res_obj.keys(), list(res_obj.values())
        test_k, test_v = rest_obj.keys(), list(rest_obj.values())

        self.assertEqual(res_v, test_v)
        self.assertEqual(res_k, test_k)

    def testZero(self):
        case = []
        res_obj = Namespace().__dict__
        rest_obj = self.parser.parse_known_args(case)[0].__dict__

        res_k, res_v = res_obj.keys(), list(res_obj.values())
        test_k, test_v = rest_obj.keys(), list(rest_obj.values())

        self.assertEqual(res_v, test_v)
        self.assertEqual(res_k, test_k)


if __name__ == '__main__':
    unittest.main()
```
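The merging of repeated subcommands that this patch performs inside `_SubParsersAction` can also be approximated without forking `argparse`. This is a rough sketch under my own assumptions, not the reporter's method: `group_by_subcommand` is a hypothetical helper that splits the token list at subcommand names, and the lookup through `subparsers.choices` relies on the subparsers action exposing its name-to-parser mapping under that attribute.

```python
import argparse


def group_by_subcommand(argv, names):
    """Split argv at subcommand names, merging repeated subcommands.

    Returns (leading, groups): tokens seen before any subcommand, and a
    dict mapping each subcommand name to all of its argument tokens.
    """
    leading, groups, current = [], {}, None
    for token in argv:
        if token in names:
            current = groups.setdefault(token, [])
        elif current is None:
            leading.append(token)
        else:
            current.append(token)
    return leading, groups


parser = argparse.ArgumentParser(prog='PROG')
subparsers = parser.add_subparsers(help='sub-command help')
# Same subcommands and flags as in the unit tests above.
layout = {'a': ['-a'], 'b': ['-b', '-q'], 'c': ['-c', '-k'], 'd': ['-d', '-D', '-R']}
for name, flags in layout.items():
    sub = subparsers.add_parser(name)
    for flag in flags:
        sub.add_argument(flag)

argv = ['d', '-d', '1234', 'd', '-D', '12345', 'd', '-R', '1',
        'c', '-c', '123', 'c', '-k', '555', 'b', '-q', 'test']

leading, groups = group_by_subcommand(argv, subparsers.choices)

# Parse each merged group with its own subparser into one shared namespace.
namespace = argparse.Namespace()
for name, args in groups.items():
    subparsers.choices[name].parse_args(args, namespace)

print(groups)
print(namespace)
```

The obvious weakness of this approach is that an option value which happens to equal a subcommand name (for example `-a b`) would be treated as the start of a new subcommand.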

History
Date	User	Action	Args
2022-04-11 14:59:57	admin	set	github: 91199
2022-03-22 08:42:30	rive-n	set	messages: + msg415751
2022-03-18 12:28:44	rive-n	set	messages: + msg415490
2022-03-17 15:07:37	pablogsal	set	nosy: - pablogsal
2022-03-17 15:04:28	rive-n	set	messages: + msg415414
2022-03-17 13:49:54	rive-n	create