classification
Title: argparse repeats itself when formatting help metavars
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: LewisGaul, forest, jkloth, paul.j3, rhettinger
Priority: normal Keywords:

Created on 2021-09-06 02:18 by forest, last changed 2021-09-11 08:55 by LewisGaul.

Messages (13)
msg401110 - (view) Author: Forest (forest) Date: 2021-09-06 02:18
When argparse actions have multiple option strings and at least one argument, the default formatter presents them like this:

  -t ARGUMENT, --task ARGUMENT
                        Perform a task with the given argument.
  -p STRING, --print STRING
                        Print the given string.

By repeating the metavars, the formatter wastes horizontal space, making the following side-effects more likely:

- The easy-to-read tabular format is undermined by overlapping text columns.
- An action and its description are split apart, onto separate lines.
- Fewer actions can fit on the screen at once.
- The user is presented with extra noise (repeat text) to read through.


I think the DRY principle is worth considering here. Help text would be cleaner, more compact, and easier to read if formatted like this:

  -t, --task ARGUMENT   Perform a task with the given argument.
  -p, --print STRING    Print the given string.

Obviously, actions with especially long option strings or metavars could still trigger line breaks, but they would be much less common and still easier to read without the repeat text.


I am aware of ArgumentParser's formatter_class option, but unfortunately, it is of little help here.  Since the default formatter class reserves every stage of its work as a private implementation detail, I cannot safely subclass it to get the behavior I want.  My choices are apparently to either re-implement an unreasonably large swath of its code in my own formatter class, or override the private _format_action_invocation() method in a subclass and risk future breakage (and still have to re-implement more code than is reasonable.)

Would it make sense to give HelpFormatter a "don't repeat yourself" option?  (For example, a boolean class attribute could be overridden by a subclass and would be a small change to the existing code.)

Alternatively, if nobody is attached to the current behavior, would it make sense to simply change HelpFormatter such that it never repeats itself?
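
For reference, the layout described above can be reproduced with a minimal sketch like the following. The program name "demo" is an assumption for illustration; the option names and help strings mirror the example at the top of this message. The exact layout of the printed help may vary between Python versions.

```python
import argparse

# Hypothetical demo parser; option names mirror the example above.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("-t", "--task", metavar="ARGUMENT",
                    help="Perform a task with the given argument.")
parser.add_argument("-p", "--print", metavar="STRING",
                    help="Print the given string.")

# format_help() returns the same text that "demo --help" would print.
help_text = parser.format_help()
print(help_text)
```

Running this with the default formatter class shows how much of each line is taken up by the option strings and their metavars before the description starts.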
msg401111 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-09-06 03:11
I don't agree this should be changed.  The repetition helps improve understanding because not everyone would assume that a METAVAR shown once would automatically also apply to its long form.   

Also, showing the METAVAR more than once is a norm.  For example, see this excerpt from "man grep":

     -B num, --before-context=num
             Print num lines of leading context before each match.  See also the -A and -C options.

     -b, --byte-offset
             The offset in bytes of a matched pattern is displayed in front of the respective matched line.

     -C[num, --context=num]
             Print num lines of leading and trailing context surrounding each match.  The default is 2 and is equivalent to
             -A 2 -B 2.  Note: no whitespace may be given between the option and its argument.

     -c, --count
             Only a count of selected lines is written to standard output.

     --colour=[when, --color=[when]]
             Mark up the matching text with the expression stored in GREP_COLOR environment variable.  The possible values of
             when can be `never', `always' or `auto'.

     -D action, --devices=action
             Specify the demanded action for devices, FIFOs and sockets.  The default action is `read', which means, that
             they are read as if they were normal files.  If the action is set to `skip', devices will be silently skipped.

     -d action, --directories=action
             Specify the demanded action for directories.  It is `read' by default, which means that the directories are read
             in the same manner as normal files.  Other possible values are `skip' to silently ignore the directories, and
             `recurse' to read them recursively, which has the same effect as the -R and -r option.
msg401112 - (view) Author: Jeremy Kloth (jkloth) * Date: 2021-09-06 03:40
Except that the output in question is not for man pages but for the command line.  The analogue would be `grep --help` (again an excerpt):

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is 'always', 'never', or 'auto'

[using grep (GNU grep) 3.1]
msg401113 - (view) Author: Forest (forest) Date: 2021-09-06 04:04
On Mon, 06 Sep 2021 03:11:16 +0000, Raymond Hettinger wrote:

>The repetition helps improve understanding because not everyone would assume
>that a METAVAR shown once would automatically also apply to its long form.

I'm struggling to think of a real-world example that would lead someone to
think otherwise.  Is there a program with a short & long form option where
only one of those accepts an argument?

If such a thing does exist somewhere, the current behavior seems even worse
in that case: it shows the METAVAR alongside both forms, despite only one
form accepting an argument.

>Also, showing the METAVAR more than once is a norm.  For example, see this
>excerpt from "man grep":

I disagree about that being a norm. Counterexamples include:

cp -t, --target-directory=DIRECTORY
mv -S, --suffix=SUFFIX
ls -T, --tabsize=COLS
man -L, --locale=LOCALE

And, as Jeremy pointed out, we are not discussing man pages here, but
command line help text.  Even grep does it the way I suggest:

grep -e, --regexp=PATTERNS
grep -f, --file=FILE
grep -m, --max-count=NUM
(etc.)

More importantly, even if we do accept the current behavior as potentially
useful, do we really want Python's standard library to prescribe it?  Should
the application developer not be given an easy way for her program to
display cleaner, simpler, more space-efficient help text?
msg401114 - (view) Author: Forest (forest) Date: 2021-09-06 04:08
By the way, I would be happy to submit a patch, either to remove the repeat
text or to make it optional via an easily overridden class attribute.
msg401115 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2021-09-06 04:58
This has been requested various times on StackOverflow, and possibly here (I'd have to do a search).

The closest thing to making a compact action_invocation is to set the metavar to '', and even then we get a space, e.g.

     -f , --foo   Help text

This repeat has been a part of argparse from the beginning, so I can't see changing the default behavior.  But we could add a HelpFormatter subclass that changes one (or two) methods, such as _format_action_invocation.  Subclassing the formatter is the accepted way of adding help features.   I, and possibly others, must have suggested such a change on SO.

Of course people can use such a subclass without it being part of the standard module.
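
The empty-metavar workaround mentioned above can be tried with a minimal sketch like this one. The program name "demo" is an assumption; the option and help text follow the example line in this message.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# An empty metavar removes the argument name from the help listing,
# but the formatter still emits the space that would separate it
# from the option string.
parser.add_argument("-f", "--foo", metavar="", help="Help text")

help_text = parser.format_help()
print(help_text)
```

The drawback of this workaround is that the argument name disappears entirely, so the help no longer indicates that the option takes a value.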
msg401116 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2021-09-06 05:13
https://bugs.python.org/issue42980 argparse: GNU-style help formatter

https://bugs.python.org/issue33389 argparse redundant help string

https://bugs.python.org/issue29626
Issue with spacing in argparse module while using help

https://bugs.python.org/issue27303
[argparse] Unify options in help output

https://stackoverflow.com/questions/23936145/python-argparse-help-message-disable-metavar-for-short-options

https://stackoverflow.com/questions/18275023/dont-show-long-options-twice-in-print-help-from-argparse
msg401118 - (view) Author: Forest (forest) Date: 2021-09-06 05:49
On Mon, 06 Sep 2021 04:58:38 +0000, paul j3 wrote:

>This repeat has been a part of argparse from the beginning, so I can't
>see changing the default behavior.

Yes, I guessed as much, which is why I first suggested making it optional.

>But we could add a HelpFormatter subclass that changes one (or two methods)
>such as _format_action_invocation.  Subclassing the formatter is the accepted
>way of adding help features.

If it was done in a subclass, how would people be expected to get the new
behavior *and* that of the other subclasses?  For example, someone with
pre-formatted description and epilog text is currently directed to use the
RawDescriptionHelpFormatter subclass.  If they also wanted to avoid repeat
metavars, and that behavior was implemented in another subclass, would they
be expected to write a third subclass inheriting from both module-defined
subclasses?

To me, multiclassing seems rather heavyweight for a simple behavior change
like this one, but yes, I recognize that argparse's current code uses that
approach.  Pity, that.

>Of course people can use such a subclass without it being part of the
>standard module.

Technically: sure. But practically: not so much.  An application would have
to subclass and override _format_action_invocation(), which (judging by the
leading underscore) appears to be intended as private to the module.  Even
the module doc string says so:

"""
(Also note that HelpFormatter and RawDescriptionHelpFormatter are only
considered public as object names -- the API of the formatter objects is
still considered an implementation detail.)
"""

So, a subclass that isn't part of the standard module is implicitly and
explicitly discouraged by the module itself.

If we're married to the module's current policy for formatter tweaks, I
guess that leaves a module-defined subclass as the only option.  Here is
an example that works:

import argparse

class OneMetavarHelpFormatter(argparse.HelpFormatter):
    """A formatter that avoids repeating action argument metavars."""

    def _format_action_invocation(self, action):
        """Format action help without repeating the argument metavar."""
        # Positionals and options that take no argument have no metavar
        # to repeat, so defer to the stock implementation.
        if not action.option_strings or action.nargs == 0:
            return super()._format_action_invocation(action)

        # Join all option strings first, then append the metavar once.
        default = self._get_default_metavar_for_optional(action)
        args_string = self._format_args(action, default)
        return ', '.join(action.option_strings) + ' ' + args_string
msg401119 - (view) Author: Forest (forest) Date: 2021-09-06 06:04
Here's another working example, allowing alternate separator strings (as
requested in #33389) via subclassing:

import argparse

class OneMetavarHelpFormatter(argparse.HelpFormatter):
    """A formatter that avoids repeating action metavars."""

    # Subclasses can override these to change the separators.
    OPTION_SEPARATOR = ', '
    METAVAR_SEPARATOR = ' '

    def _format_action_invocation(self, action):
        """Format action help without repeating the argument metavar."""
        # Positionals and options that take no argument have no metavar
        # to repeat, so defer to the stock implementation.
        if not action.option_strings or action.nargs == 0:
            return super()._format_action_invocation(action)

        default = self._get_default_metavar_for_optional(action)
        args_string = self._format_args(action, default)
        options_string = self.OPTION_SEPARATOR.join(action.option_strings)
        return options_string + self.METAVAR_SEPARATOR + args_string
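
As a usage sketch of the separator attributes above: a further subclass can switch to a GNU-like "--task=ARGUMENT" form by overriding a single class attribute. GnuStyleHelpFormatter and the "demo" parser are hypothetical names introduced only for illustration; the base class is repeated so the snippet runs on its own, and it still relies on private HelpFormatter methods.

```python
import argparse


class OneMetavarHelpFormatter(argparse.HelpFormatter):
    """Formatter from the message above, repeated so this runs alone."""
    OPTION_SEPARATOR = ', '
    METAVAR_SEPARATOR = ' '

    def _format_action_invocation(self, action):
        if not action.option_strings or action.nargs == 0:
            return super()._format_action_invocation(action)
        default = self._get_default_metavar_for_optional(action)
        args_string = self._format_args(action, default)
        options_string = self.OPTION_SEPARATOR.join(action.option_strings)
        return options_string + self.METAVAR_SEPARATOR + args_string


class GnuStyleHelpFormatter(OneMetavarHelpFormatter):
    """Hypothetical subclass: joins the options and the metavar with '='."""
    METAVAR_SEPARATOR = '='


parser = argparse.ArgumentParser(prog="demo",
                                 formatter_class=GnuStyleHelpFormatter)
parser.add_argument("-t", "--task", metavar="ARGUMENT",
                    help="Perform a task with the given argument.")

help_text = parser.format_help()
print(help_text)
```

Only the options listing is affected; the usage line is produced by a different formatter method and keeps its usual form.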
msg401120 - (view) Author: Forest (forest) Date: 2021-09-06 06:15
To be clear, I wrote those examples to be non-invasive, not patch proposals.
A cleaner approach would be possible if patching argparse is an option.  (I
believe the patch in #42980 proposes such an approach.)
msg401121 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2021-09-06 06:17
The idea of combining help features by defining a subclass that inherits from other subclasses was endorsed by the original developer (we could dig up an old bug/issue to prove that).
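
A minimal sketch of that kind of combination, assuming a formatter like the OneMetavarHelpFormatter posted earlier in this thread (repeated here so the snippet is self-contained). CompactRawDescriptionFormatter and the "demo" parser are hypothetical names used only for illustration.

```python
import argparse


class OneMetavarHelpFormatter(argparse.HelpFormatter):
    """Same idea as the formatter posted earlier in this thread."""

    def _format_action_invocation(self, action):
        if not action.option_strings or action.nargs == 0:
            return super()._format_action_invocation(action)
        default = self._get_default_metavar_for_optional(action)
        args_string = self._format_args(action, default)
        return ', '.join(action.option_strings) + ' ' + args_string


class CompactRawDescriptionFormatter(OneMetavarHelpFormatter,
                                     argparse.RawDescriptionHelpFormatter):
    """Hypothetical combined formatter; both parents do the work."""


parser = argparse.ArgumentParser(
    prog="demo",
    description="First line\n    indented second line",
    formatter_class=CompactRawDescriptionFormatter,
)
parser.add_argument("-t", "--task", metavar="ARGUMENT",
                    help="Perform a task with the given argument.")

help_text = parser.format_help()
print(help_text)
```

This works because the two parent classes override different methods, so the combined class needs no body of its own.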

The provided subclasses all tweak a "private" method, often one that's buried deep in the calling stack.

I can't quote any official policy, but my sense is that Python developers are ok with users subclassing and modifying "private" methods.  Methods, functions and classes (and variables) with leading '_' aren't documented, or imported via `__all__`, but otherwise the boundary between what is part of the user API and what's "hidden" is loose in Python.  

Apparently some corporate policies prohibit use or modification of things that aren't in the public API, but I don't think that policy protects you from changes.  'argparse' changes at a glacial rate, with a lot of concern for backward compatibility.  In fact it's that fear of unintended consequences that slows down the pace of change.  Subclassing a help formatter is preferred because it minimizes the chance of hurting existing users.
msg401122 - (view) Author: Forest (forest) Date: 2021-09-06 06:46
>Subclassing a help formatter is preferred because it minimizes the chance of 
>hurting existing users.

Fair enough.

Whatever the approach, I hope argparse can be made to support this through a
simple, documented interface.  I had to grovel through standard library code
to figure out what to override and what to duplicate in order to get the
output I want.  That seems like a needlessly high barrier for such a common
(and apparently oft-requested) format.
msg401635 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2021-09-11 08:55
Big +1 from me for at least supporting a way to get the more concise output. I've never understood the verbosity of Python's argparse, where the metavar is repeated.
History
Date                 User        Action  Args
2021-09-11 08:55:11  LewisGaul   set     nosy: + LewisGaul; messages: + msg401635
2021-09-06 06:46:51  forest      set     messages: + msg401122
2021-09-06 06:17:15  paul.j3     set     messages: + msg401121
2021-09-06 06:15:04  forest      set     messages: + msg401120
2021-09-06 06:04:10  forest      set     messages: + msg401119
2021-09-06 05:49:31  forest      set     messages: + msg401118
2021-09-06 05:13:21  paul.j3     set     messages: + msg401116
2021-09-06 04:58:38  paul.j3     set     messages: + msg401115
2021-09-06 04:08:48  forest      set     messages: + msg401114
2021-09-06 04:04:19  forest      set     messages: + msg401113
2021-09-06 03:40:12  jkloth      set     nosy: + jkloth; messages: + msg401112
2021-09-06 03:11:16  rhettinger  set     messages: + msg401111; versions: - Python 3.6, Python 3.7, Python 3.8, Python 3.9, Python 3.10
2021-09-06 02:49:15  xtreak      set     nosy: + rhettinger, paul.j3
2021-09-06 02:18:58  forest      create