classification
Title: Add native enum support for argparse
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: barry, bethard, desbma, leycec, paul.j3, rhettinger
Priority: normal Keywords:

Created on 2015-09-10 18:04 by desbma, last changed 2019-08-30 06:22 by rhettinger. This issue is now closed.

Messages (18)
msg250399 - (view) Author: desbma (desbma) * Date: 2015-09-10 18:04
I often find myself using the following pattern with argparse:

import argparse
import enum

CustomEnumType = enum.Enum("CustomEnumType",
                           ("VAL1", "VAL2", "VAL3", ...))

arg_parser = argparse.ArgumentParser(...)
...
arg_parser.add_argument("-p",
                        "--param",
                        type=str,
                        action="store",
                        choices=tuple(t.name.lower() for t in CustomEnumType),
                        default=CustomEnumType.VAL1.name.lower(),
                        dest="param"
                        ...)
args = arg_parser.parse_args()
args.param = CustomEnumType[args.param.upper()]

I think it would be a great addition to be able to pass the enum type to the add_argument 'type' parameter directly, and have it validate the input and store the resulting enum.
msg250429 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-09-10 23:24
The `type` parameter is a *function* that takes a string, and returns a valid value.  If it can't convert the string it is supposed to raise an error.  The default one is a do-nothing identity (take a string, return the string).  `int` is the Python function that converts a string to an integer.  Same for `float`.

Your example could be implemented with a simple type function:

    def enumtype(astring):
        try:
            return CustomEnumType[astring.upper()]
        except KeyError:
            raise ValueError(astring)  # argparse reports this as "invalid enumtype value"

    parser=argparse.ArgumentParser()
    parser.add_argument("-p", type=enumtype, default="VAL1")

    print(parser.parse_args([]))
    print(parser.parse_args(["-p","val2"]))
    print(parser.parse_args(["-p","val4"]))

which produces:

    1557:~/mypy$ python3 issue25061.py
    Namespace(p=<CustomEnumType.VAL1: 1>)
    Namespace(p=<CustomEnumType.VAL2: 2>)
    usage: issue25061.py [-h] [-p P]
    issue25061.py: error: argument -p: invalid enumtype value: 'val4'


The default and 'val2' produce enum values in the Namespace.

'val4' raises an error, resulting in the usage and error message.

I tried to customize the error message to include the list of valid strings, but it looks like I'll have to dig into the calling tree to do that right.  Still the default message is useful, displaying the name of the 'type' function.

If we were to add 'enum' support it would have to be something like 'argparse.FileType', a factory class that would take an enum class as parameter, and create a function like my sample.
msg250432 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-09-11 01:00
Here's a type function that enumerates the enum in the error message:

def enumtype(astring):
    try:
        return CustomEnumType[astring.upper()]
    except KeyError:
        msg = ', '.join([t.name.lower() for t in CustomEnumType])
        msg = 'CustomEnumType: use one of {%s}'%msg
        raise argparse.ArgumentTypeError(msg)

You could do the same sort of enumeration in the `help` parameter (or even the metavar).

One further note - the input string is first passed through `type`, then it is checked against the `choices` (if any).  If it is converted to the enum in a type function, the choices will also have to be enums, not their string representation.

String defaults are (usually) passed through `type`.  Nonstring defaults are not converted or tested.
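
A minimal sketch of that ordering, for anyone who wants to check it. The `Color` enum, the `to_color` function and the option names are hypothetical, chosen only for illustration:

```python
import argparse
import enum


class Color(enum.Enum):  # hypothetical enum, for illustration only
    RED = 1
    GREEN = 2


def to_color(text):
    """Convert a command-line string to a Color member."""
    try:
        return Color[text.upper()]
    except KeyError:
        raise argparse.ArgumentTypeError("unknown color: %r" % text)


parser = argparse.ArgumentParser(prog="demo")
# choices are compared with the *converted* value, so they must be enum members
parser.add_argument("--a", type=to_color, choices=list(Color), default="red")
# a non-string default is stored as-is, without conversion
parser.add_argument("--b", type=to_color, default=Color.GREEN)
# string choices never match, because to_color has already produced an enum
parser.add_argument("--c", type=to_color, choices=["red", "green"])

defaults = parser.parse_args([])
explicit = parser.parse_args(["--a", "green"])

try:
    parser.parse_args(["--c", "red"])
    string_choices_accepted = True
except SystemExit:  # argparse reports "invalid choice" and exits
    string_choices_accepted = False
```

The string default "red" comes out as an enum member, the enum default is left untouched, and `--c red` is rejected even though "red" appears in its `choices`.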

So the value of this sort of type function depends on whether it is more convenient to work with the string representation or the enum itself.  When exactly do you want the commandline string to be converted to the enum?
msg250577 - (view) Author: desbma (desbma) * Date: 2015-09-13 17:13
I would like the enum type to be stored directly.

With your approach, the user does not know what the possible choices are until he/she tries an invalid value and gets the exception. If I pass the enum type to the choices parameter, the help message is not very user friendly ("<CustomEnumType.VAL1: 1>"...) and does not accurately represent possible input strings.

Anyway, my first example snippet works and does what I need, but it feels a bit like a hack.
In my opinion this is a classic use case for enums, that should be handled more naturally by the argparse module.
msg250584 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-09-13 20:03
I'm not quite sure what you mean by 'the enum type to be stored directly'.

With my type function, the string input is converted to an enum object and that is stored in the Namespace.  You can't be any more direct than that.

Or are you thinking that `argparse` has some catalog of 'types' that it uses to check for value validity and conversion?  There isn't such a collection.  The value of the 'type' parameter has to be a callable.  (There is a registry mechanism, but that just maps strings on to callables.)
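
A rough sketch of that registry, assuming the parser's `register` method (which historically has had little documentation). The `Shape` enum and the registry key "shape" are hypothetical:

```python
import argparse
import enum


class Shape(enum.Enum):  # hypothetical enum, for illustration only
    CIRCLE = 1
    SQUARE = 2


parser = argparse.ArgumentParser(prog="demo")
# Map the string "shape" to a callable in the parser's 'type' registry.
parser.register("type", "shape", lambda text: Shape[text.upper()])
# The string is looked up in the registry; the callable does the conversion.
parser.add_argument("--shape", type="shape")

ns = parser.parse_args(["--shape", "square"])
```

The registry only adds a level of indirection: the value stored in the Namespace still comes from an ordinary callable.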

My suggestion does not provide help, but it isn't hard to write that yourself.  A help parameter like:

    help='Enum: {%s}'%','.join([t.name for t in CustomEnumType])

would display:

usage: ipython3 [-h] [-p P]

optional arguments:
  -h, --help  show this help message and exit
  -p P        Enum: {VAL1,VAL2,VAL3}

For really long lists you could write a multiline help and display it with a RAW formatter.
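
One possible way to do that is sketched below, assuming `argparse.RawTextHelpFormatter` as the RAW formatter and a three-member stand-in for the enum:

```python
import argparse
import enum

# Stand-in for the enum used in this thread, with three members only.
CustomEnumType = enum.Enum("CustomEnumType", ("VAL1", "VAL2", "VAL3"))

# One name per line; RawTextHelpFormatter keeps these line breaks as written.
help_lines = ["one of:"] + ["  " + t.name.lower() for t in CustomEnumType]

parser = argparse.ArgumentParser(
    prog="demo", formatter_class=argparse.RawTextHelpFormatter)
parser.add_argument("-p", help="\n".join(help_lines))

help_text = parser.format_help()
```

With the default formatter the same help string would be re-wrapped into a single paragraph.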

I wouldn't use 'choices' with a type function like this.  The type function takes care of all the necessary testing and error messaging.  As you write, displaying the enum values would just be confusing.


If we define a dictionary wrapper for the enum class, the use of 'choices' might not seem so hacky:

     enumDict = {t.name.lower():t for t in CustomEnumType}
     parser.add_argument("-q", default="val1", choices=enumDict)

     parser.print_help()

producing:

usage: ipython3 [-h] [-p P] [-q {val1,val3,val2}]

optional arguments:
  -h, --help           show this help message and exit
  -p P                 Enum: {VAL1,VAL2,VAL3}
  -q {val1,val3,val2}

The keys of a 'choices' dictionary are used automatically in the usage and help.  That's nice when there are a few choices, but not so good when there are many.  Those problems have been discussed in other bug issues.

This dictionary can be used just like your code to convert the Namespace string to an enum object:

    In [57]: parser.parse_args([])
    Out[57]: Namespace(p=<Custom.VAL1: 1>, q='val1')
    In [58]: enumDict[_.q]
    Out[58]: <Custom.VAL1: 1>

And of course 'choices' handles the error listing as well - a plus or minus:

    In [59]: parser.parse_args(['-q','test'])
usage: ipython3 [-h] [-p P] [-q {val1,val3,val2}]
ipython3: error: argument -q: invalid choice: 'test' (choose from 'val1', 'val3', 'val2')
...

If we are going to add anything to streamline the handling of 'enums', it should also streamline the handling of any mapping.  The main thing that 'enums' adds to Python is a uniqueness constraint - which is an insignificant issue in the argparse context.  My custom type function would work just as well with a dictionary.  And a dictionary would allow the use of aliases, as well as the upper/lower tweaking.
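
As a sketch of the dictionary variant, the type function can be built from any mapping. The `mapping_type` helper and the `LEVELS` table are hypothetical names, not part of argparse:

```python
import argparse


def mapping_type(mapping):
    """Return a type function that looks strings up in `mapping`."""
    def convert(text):
        try:
            return mapping[text.lower()]
        except KeyError:
            valid = ", ".join(sorted(mapping))
            raise argparse.ArgumentTypeError("use one of {%s}" % valid)
    return convert


# Two keys may point at the same value, which gives aliases for free.
LEVELS = {"debug": 10, "verbose": 10, "info": 20}

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--level", type=mapping_type(LEVELS), default="info")

alias = parser.parse_args(["--level", "VERBOSE"]).level
default = parser.parse_args([]).level
```

Passing `{t.name.lower(): t for t in SomeEnum}` as the mapping would give the enum behaviour discussed above.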
msg250675 - (view) Author: desbma (desbma) * Date: 2015-09-14 16:29
> With my type function, the string input is converted to an enum object and that is stored in the Namespace.  You can't be any more direct than that.

Yes I know, but in that case it's missing the autogenerated help message with the possible choices.

I know I can generate it manually, it just does not feel right for doing something so simple. IMO the goal of argparse is to unload the burden of checking/validating parameters manually, generating help messages with default/possible values, etc.

Your solution with a dictionary is similar to what I currently use and wrote in my example, with the added drawback that the keys are randomly ordered in the help message, unless I use OrderedDict (one more import and more boilerplate code).

Each approach has its drawbacks, unless you write some additional code to work around each limitation.

In a perfect world, argparse would:
* only show to the user the enum names, and use it in help/error messages, possible choice set, etc.
* after parsing, set the real enum value in the namespace
* and most importantly: to do that, don't require more code than just passing the enum
msg250703 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-09-14 20:56
Here's an EnumType factory class, modeled on FileType.

class EnumType(object):
    """Factory for creating enum object types
    """
    def __init__(self, enumclass):
        self.enums = enumclass

    def __call__(self, astring):
        name = self.enums.__name__
        try:
            return self.enums[astring.upper()]
        except KeyError:
            msg = ', '.join([t.name.lower() for t in self.enums])
            msg = '%s: use one of {%s}'%(name, msg)
            raise argparse.ArgumentTypeError(msg)

    def __repr__(self):
        astr = ', '.join([t.name.lower() for t in self.enums])
        return '%s(%s)' % (self.enums.__name__, astr)

It would be used like:

    parser=argparse.ArgumentParser()
    parser.add_argument("-p", type=EnumType(CustomEnumType),
        default="VAL1", help = 'type info: %(type)s')
    
'EnumType(CustomEnumType)' is as close as we are going to get to 'simply passing the enum' to the parser, given the current 'type' syntax.  This statement produces a callable object, the equivalent of my previous function.

By giving the class a `__repr__` it can also be used in the 'help' with the '%(type)s' syntax.  That's the main functionality that this factory adds to my previous function definition.

    parser.print_help()
    print(parser.parse_args([]))
    print(parser.parse_args(["-p","val2"]))
    print(parser.parse_args(["-p","val4"]))


produces

    usage: issue25061.py [-h] [-p P]
    optional arguments:
        -h, --help  show this help message and exit
        -p P        type info: CustomEnumType(val1, val2, val3)

    Namespace(p=<CustomEnumType.VAL1: 1>)
    Namespace(p=<CustomEnumType.VAL2: 2>)
    usage: issue25061.py [-h] [-p P]
    issue25061.py: error: argument -p: CustomEnumType: use one of
        {val1, val2, val3}

I was toying with writing a custom Action class that would create its own type and help attributes based on the enum parameter.  But since this EnumType.__repr__ takes care of the help angle, I don't think I'll bother.

If there's enough interest, I can imagine casting this EnumType as a formal patch, complete with tests and documentation.  Till then, feel free to experiment with and refine these ideas.
msg250916 - (view) Author: desbma (desbma) * Date: 2015-09-17 20:02
Thanks for sharing this code, I like the factory idea.

I'll have a look at creating a custom Action class too.
msg254004 - (view) Author: desbma (desbma) * Date: 2015-11-03 17:24
I came up with something that satisfies my needs (no boilerplate code, and intuitive add_argument call).

I modified your factory, and defined a simple action class (this is a quick and dirty prototype for discussion, I am in no way asking that such thing should be merged as such):

class EnumType(object):
    """Factory for creating enum object types
    """
    def __init__(self, enumclass, action):
        self.enums = enumclass
        self.action = action

    def __call__(self, astring):
        name = self.enums.__name__
        try:
            v = self.enums[astring.upper()]
        except KeyError:
            msg = ', '.join([t.name.lower() for t in self.enums])
            msg = '%s: use one of {%s}'%(name, msg)
            raise argparse.ArgumentTypeError(msg)
        else:
            self.action.choices = None  # ugly hack to prevent post validation from choices
            return v

    def __repr__(self):
        astr = ', '.join([t.name.lower() for t in self.enums])
        return '%s(%s)' % (self.enums.__name__, astr)


class StoreEnumAction(argparse._StoreAction):

    def __init__(self,
                 option_strings,
                 dest,
                 type,
                 nargs=None,
                 const=None,
                 default=None,
                 required=False,
                 help=None,
                 metavar=None):
        super().__init__(option_strings=option_strings,
                         dest=dest,
                         nargs=nargs,
                         const=const,
                         default=default,
                         type=EnumType(type, self),
                         choices=tuple(t.name.lower() for t in type),
                         required=required,
                         help=help,
                         metavar=metavar)

Then all I have to do is to pass 'action=StoreEnumAction, type=TheEnum' to the add_argument call.

The good:
I get a proper usage line (which takes into account the value of 'nargs'), relevant error messages, and what is stored in the namespace after validation is the proper enum type, not a string.

The bad:
* The reference to the action inside the factory is ugly. This should probably be refactored to be all contained inside StoreEnumAction.
* The meaning of the 'type' parameter for StoreEnumAction is somewhat different than for other actions (enum class vs callable that validates)
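
One way such a refactoring might look is sketched below. It is a hypothetical design, not a tested patch: the `StoreEnum` class, its `enum_class` keyword and the `Mode` enum are invented for illustration. The conversion moves into the action's `__call__`, so `choices` can stay as plain strings and no factory is needed:

```python
import argparse
import enum


class Mode(enum.Enum):  # hypothetical enum, for illustration only
    FAST = 1
    SLOW = 2


class StoreEnum(argparse.Action):
    """Hypothetical action: offers lowercase names, stores enum members."""

    def __init__(self, option_strings, dest, enum_class, **kwargs):
        self._enum_class = enum_class
        # Choices stay plain strings, so usage and error messages list the names.
        kwargs.setdefault("choices", tuple(m.name.lower() for m in enum_class))
        super().__init__(option_strings, dest, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        # argparse has already checked the strings against choices here.
        if isinstance(values, list):  # nargs such as 2, '*' or '+'
            converted = [self._enum_class[v.upper()] for v in values]
        else:
            converted = self._enum_class[values.upper()]
        setattr(namespace, self.dest, converted)


parser = argparse.ArgumentParser(prog="demo")
# Defaults bypass __call__, so the default is given as an enum member.
parser.add_argument("-p", "--param", action=StoreEnum, enum_class=Mode,
                    default=Mode.FAST)

chosen = parser.parse_args(["-p", "slow"]).param
fallback = parser.parse_args([]).param
usage = parser.format_usage()
```

Using a separate keyword instead of 'type' avoids giving 'type' a second meaning, at the cost of one more parameter to document.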

What do you think?
msg254005 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-11-03 17:44
IMO, this adds more complexity than it solves.   Argparse already has more options than people can remember.  Also, it isn't clear whether this logic should go into the parser or into the business logic (consider for example that the requests package doesn't autogenerate enum results, nor does sqlite) -- the extraction of values is a distinct step.
msg254047 - (view) Author: desbma (desbma) * Date: 2015-11-04 10:47
I guess the question is whether Enum should be considered a first class 'native' type that deserves support in argparse, or just a tool among others in the stdlib.

The fact that Enum is implemented as a class, and lives in a module, tends to lead to the second, but the fact that some constants were converted to enums in the stdlib (like in socket) tends to the first.

As a C/C++ developer, I may have a bias towards enums :)
msg254065 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-11-04 16:54
The choice of 'type' for this parameter is occasionally confusing, because the connection to the Python 'type()' function or what we think of as 'native types' is only tangential.

A name like 'converter' or 'string_converter' would be more accurate (but I'm not advocating any change).  

There are lots of 'native' types (isn't everything 'first class' in Python?) that are not supported by 'argparse', simply because Python does not have functions that convert from string to that type.

For example:

   bool, list, tuple, dict, set

Technically `type=list` works, if you want `list('[1]') => ['[','1',']']`.  `type=json.loads` is probably a better choice.
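
A quick check of those conversions, assuming `json.loads` as the JSON converter (the `--items` option is a hypothetical name):

```python
import argparse
import json

# list() splits the string into characters; it does not parse it.
as_list = list("[1]")
# json.loads parses the text into a real list.
as_json = json.loads("[1]")
# bool() of any non-empty string is True, including the string "False".
as_bool = bool("False")

parser = argparse.ArgumentParser(prog="demo")
# json.loads raises a ValueError subclass on bad input, which argparse
# turns into a normal usage error.
parser.add_argument("--items", type=json.loads)
ns = parser.parse_args(["--items", "[1, 2]"])
```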

`type=int` works because there is a Python function, int(), that takes a string and returns an object of the same type name.  'argparse' does not do anything special to support it.

FileType is a nice example of how a factory class can be used as 'type' parameter, but shouldn't be confused with support for the 'file' 'native type'.

There's an ongoing tension between adding useful features and maintaining some level of simplicity and clarity in this module.
msg263166 - (view) Author: leycec (leycec) Date: 2016-04-11 06:02
I strongly support this feature request. Unsurprisingly, I wholeheartedly agree with desbma's heroic persistence and wholeheartedly disagree with rhettinger's curt dismissal.

> IMO, this adds more complexity than it solves.

Strongly disagree. Because "argparse" fails to support the core "Enum" type out-of-the-box, I now have to either (A) duplicate desbma's boilerplate or (B) duplicate paul.j3's original "EnumType" factory or desbma's revised "EnumType" factory across multiple "argparse"-based CLI interfaces in multiple Python applications having discrete codebases.

Both approaches are noxious, substantially increasing implementation burden and technical debt. While obviously feasible, both approaches violate DRY, invite code desynchronization and concomitant bugs, inhibit maintainability, intelligibility, and documentability, and... the list just crawls on.

DRY violations add complexity. Avoiding DRY violations decreases complexity.

> Argparse already has more options than people can remember.

That's what front-facing documentation, queryable docstrings, and https://docs.python.org/3/library/argparse.html are for. No one remembers even a tenth of the functionality provided by "argparse" or any other reasonably deep module (e.g., "importlib", "subprocess") in the stdlib, yet the stdlib justifiably grows, improves, and strengthens with time.

This is a good thing. API memorability and mnemonics, however, are not. We have machine lookup. Ergo, API memorability and mnemonics are poor metrics by which to gauge feature creep.

I'd hoped it would be intuitively obvious that "Enum" support should be officially added to "argparse". Enums are a core type native to most high-level languages, now including Python. Enum-based argument parsing is a Pythonic solution for string arguments accepting only a well-known set of valid alternatives. The stdlib itself is internally (albeit incrementally) migrating from non-Enums to Enums.

This needs to happen.
msg263217 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2016-04-12 01:52
The best way to get an idea added is to write a good complete patch.  2nd best is to provide constructive input on ideas that are already here.  3rd is to illustrate how you would hope to use such a feature.  Include ideas on how the usage/help/error display would work.

At the risk of repeating myself, I'm still not convinced that being a 'native type' makes any difference.  Argparse does not support any native type, at least not directly.
msg263223 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2016-04-12 05:42
desbma:

Rereading your latest code and comment:

> * The meaning of the 'type' parameter for StoreEnumAction is somewhat different than for other actions (enum class vs callable that validates)

it occurred to me that that parameter does not have to be named 'type'.  It could just as well be 'enumClass' or something else.  It's just local to the class __init__ method.

Something that's come up with other Action classes is that the parameter list is not well documented.  While there's a generic set of parameters, the subclasses vary in what they accept or require or ignore.  The docs don't elaborate, and the error messages can be cryptic.  A new class with a new parameter (whether new in name or meaning) can add to that confusion.

We need to think more abstractly, so we aren't just improving the handling of 'enums', but also all their derived and meta classes.  And possibly other mappings, where we want to do a 'key' lookup, and form help strings of the choices.
msg263229 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-04-12 06:11
Enum is just one tool among many, no more special than named tuples, nested dicts, string.Templates, regexes, pickles etc.  

The second issue is keeping the scope of argparse focused on its core task rather than trying to incorporate other parts of the standard library.  That is called separation-of-concerns or orthogonality.  A little parsimony is necessary for loose coupling and high cohesion.  We also don't want module sprawl or feature creep to impair maintainability or affect learnability.

That said, this is up to the module creator and maintainer, Steven Bethard.  He has the most experience with the module and has the clearest vision of what its boundaries should be.
msg350853 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-30 06:04
Depending on how you want to expose enums to end-users, some reasonable options already exist:

    import argparse
    from enum import Enum

    class Shake(Enum):
        VANILLA = 7
        CHOCOLATE = 4
        COOKIES = 9
        MINT = 3

    # Option 1
    ap = argparse.ArgumentParser()
    ap.add_argument('shakes', nargs=2, choices=Shake, type=Shake.__getitem__)
    ns = ap.parse_args(['VANILLA', 'MINT'])
    print(ns)

    # Option 2
    ap = argparse.ArgumentParser()
    ap.add_argument('shakes', nargs=2, choices=Shake.__members__)
    ns = ap.parse_args(['VANILLA', 'MINT'])
    ns.shakes = [Shake[name] for name in ns.shakes]
    print(ns)

In Option 1, the user sees choices of:
    {Shake.VANILLA,Shake.CHOCOLATE,Shake.COOKIES,Shake.MINT}

In Option 2, the user sees choices of:
    {VANILLA,CHOCOLATE,COOKIES,MINT}
msg350855 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-30 06:22
Even with the proposed converter class, I don't see a straightforward way to meet the OP's goal of just specifying type=EnumConverter(MyEnum) to get all of:

* display all possible values in help
* display them in lowercase
* accept them in lowercase
* and convert them back to the correct case before casting to Enum 
* interact meaningfully with "default"
* and not interact badly with "choices"
History
Date                 User        Action  Args
2019-08-30 06:22:50  rhettinger  set     status: open -> closed; versions: + Python 3.9; messages: + msg350855; resolution: rejected; stage: resolved
2019-08-30 06:04:49  rhettinger  set     messages: + msg350853
2019-08-30 03:31:56  rhettinger  set     assignee: bethard -> rhettinger
2016-04-12 06:11:26  rhettinger  set     messages: + msg263229
2016-04-12 05:42:49  paul.j3     set     messages: + msg263223
2016-04-12 01:52:35  paul.j3     set     messages: + msg263217
2016-04-11 06:02:52  leycec      set     nosy: + leycec; messages: + msg263166
2015-11-04 16:54:16  paul.j3     set     messages: + msg254065
2015-11-04 10:47:13  desbma      set     messages: + msg254047
2015-11-03 17:44:05  rhettinger  set     assignee: bethard; messages: + msg254005; nosy: + rhettinger, bethard
2015-11-03 17:24:34  desbma      set     messages: + msg254004
2015-09-17 20:02:27  desbma      set     messages: + msg250916
2015-09-14 20:56:17  paul.j3     set     messages: + msg250703
2015-09-14 16:29:03  desbma      set     messages: + msg250675
2015-09-13 20:03:16  paul.j3     set     messages: + msg250584
2015-09-13 17:13:37  desbma      set     messages: + msg250577
2015-09-11 01:00:56  paul.j3     set     messages: + msg250432
2015-09-10 23:24:53  paul.j3     set     nosy: + paul.j3; messages: + msg250429
2015-09-10 18:38:12  desbma      set     components: + Library (Lib)
2015-09-10 18:15:02  barry       set     nosy: + barry
2015-09-10 18:04:05  desbma      create