classification
Title: In argparse adding wrong arguments makes malformed namespace
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.4, Python 3.5
process
Status: open Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: mbussonn, paul.j3, py.user, serhiy.storchaka
Priority: normal Keywords:

Created on 2015-05-31 11:01 by py.user, last changed 2019-07-11 11:43 by xtreak.

Messages (9)
msg244534 - (view) Author: py.user (py.user) * Date: 2015-05-31 11:01
>>> import argparse
>>> 
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('foo bar')
>>> _ = parser.add_argument('--x --y')
>>> args = parser.parse_args(['abc'])
>>> 
>>> args
Namespace(foo bar='abc', x __y=None)
>>> 
>>> 'foo bar' in dir(args)
True
>>> 'x __y' in dir(args)
True
>>>

Passing wrong arguments silently creates a namespace whose attributes are not accessible.

ISTM, add_argument() should raise a ValueError exception.
msg244542 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-31 14:51
They are accessible.

>>> getattr(args, 'foo bar')
'abc'

The limitation that argument names should be Python identifiers is too strong and adding it will break existing code (for example the use of such popular options as -0 or -@).
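
A minimal sketch of the point above, assuming flags named -0 and -@ (chosen here only for illustration): the inferred dests are not identifiers, yet getattr() and vars() still reach the values.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-0", action="store_true")  # inferred dest is "0"
parser.add_argument("-@", action="store_true")  # inferred dest is "@"

args = parser.parse_args(["-0"])

# Neither dest is a valid identifier, so attribute syntax is unavailable,
# but getattr() and vars() both reach the stored values.
zero_flag = getattr(args, "0")
at_flag = vars(args)["@"]
print(zero_flag, at_flag)
```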
msg244561 - (view) Author: py.user (py.user) * Date: 2015-05-31 23:28
Serhiy Storchaka wrote:
> for example the use of such popular options as -0 or -@

Ok.

What about the inconsistent conversion of dashes to underscores?

>>> import argparse
>>> 
>>> parser = argparse.ArgumentParser(prefix_chars='@')
>>> _ = parser.add_argument('--x-one-two-three@')
>>> _ = parser.add_argument('@@y-one-two-three@')
>>> args = parser.parse_args(['abc'])
>>> args
Namespace(--x-one-two-three@='abc', y_one_two_three@=None)
>>>

We made the dash a non-prefix character, yet it is still converted to an underscore in the option's dest, while the actual prefix character '@' is not converted.
msg244562 - (view) Author: Matthias Bussonnier (mbussonn) * Date: 2015-05-31 23:30
Maybe the __repr__ of _AttributeHolder should be changed so that names that are not valid identifiers are shown as an unpacked dict in the signature?

Something that would give:

>>> argparse.Namespace(**{'foo bar':1})
argparse.Namespace(**{'foo bar':1})
msg244565 - (view) Author: Matthias Bussonnier (mbussonn) * Date: 2015-05-31 23:43
Minimal changes to the repr seem to work.
I can submit a proper patch.

from argparse import Namespace

class N2(Namespace):

    def __repr__(self):
        type_name = type(self).__name__
        arg_strings = []
        unarg = {}  # attributes whose names are not valid identifiers
        for arg in self._get_args():
            arg_strings.append(repr(arg))
        for name, value in self._get_kwargs():
            if name.isidentifier():
                arg_strings.append('%s=%r' % (name, value))
            else:
                unarg[name] = value
        if unarg:
            # show non-identifier names as an unpacked dict; appending here
            # avoids a stray leading comma when there are no other arguments
            arg_strings.append('**%r' % (unarg,))
        return '%s(%s)' % (type_name, ', '.join(arg_strings))

>>> N2(a=1, b=2, **{"single ' quote ":"'", 'double " quote':'"'})
N = N2(a=1, b=2, **{"single ' quote ":"'", 'double " quote':'"'})
msg244777 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-06-03 17:13
The code that converts '-' to '_' is independent of the code that uses 'prefix_chars'.

The '-' conversion handles a long-standing UNIX practice of allowing that character in the middle of option flags.  It's an attempt to turn such flags into valid variable names.  There is a bug/issue about whether the conversion should be applied to positional argument 'dest' parameters.

Is the use of other funny characters in optional flags common enough to warrant a patch?  It probably wouldn't be hard to convert all 'prefix_chars' to '_'.  But should it still convert '-', even if it isn't in that list?  What about users who are content with using 'getattr', and don't want the conversion?
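
To make the question concrete, here is a rough sketch of what converting every prefix character (plus '-') to '_' could look like. `infer_dest` is a hypothetical stand-alone helper written for this comment, not a function in argparse, and it only mimics the dest inference for a single option string.

```python
def infer_dest(option_string, prefix_chars="-"):
    """Hypothetical helper: strip leading prefix characters, then map '-'
    and every prefix character to '_'. Not part of argparse."""
    dest = option_string.lstrip(prefix_chars)
    if not dest:
        raise ValueError("dest= is required for options like %r" % option_string)
    # '-' is always converted, even when it is not a prefix character.
    for char in set(prefix_chars) | {"-"}:
        dest = dest.replace(char, "_")
    return dest


dash_dest = infer_dest("--foo-bar")
at_dest = infer_dest("@@y-one-two-three@", prefix_chars="@")
print(dash_dest, at_dest)
```

Even with such a change, a dest like '0' or 'a b' would still not be an identifier, so getattr() would remain necessary in some cases.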

Note also that you can invoke `parse_args` with your own custom Namespace object.

https://docs.python.org/3.4/library/argparse.html#the-namespace-object

This means you can write a Namespace class alternative that can handle funny characters in any way you want.  I discuss the use of custom Namespace classes in http://bugs.python.org/issue9351.

Between the availability of 'getattr' and namespace customization, I don't think there's anything here that requires a patch.  But I'm in favor of keeping the issue open for discussion.
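
As an illustration of the custom Namespace route, here is a minimal sketch. `CleanNamespace` and its `_clean` helper are hypothetical names invented for this example; the class stores each attribute under a name with non-word characters replaced by '_', and it assumes the cleaned names do not collide.

```python
import argparse
import re


class CleanNamespace(argparse.Namespace):
    """Hypothetical Namespace that stores every attribute under a name
    with non-word characters replaced by '_'."""

    @staticmethod
    def _clean(name):
        return re.sub(r"\W", "_", name)

    def __setattr__(self, name, value):
        super().__setattr__(self._clean(name), value)

    def __getattr__(self, name):
        # Only called when normal lookup fails; retry with the cleaned
        # name so hasattr()/getattr() on the raw dest still work.
        clean = self._clean(name)
        if clean == name:
            raise AttributeError(name)
        return getattr(self, clean)


parser = argparse.ArgumentParser()
parser.add_argument("--a@b")  # inferred dest is "a@b"
args = parser.parse_args(["--a@b", "v"], namespace=CleanNamespace())
print(args.a_b)
```

A dest that starts with a digit (such as '0') would still not become an identifier under this scheme; it only shows that the policy can live in user code rather than in argparse.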
msg244789 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-06-03 21:15
http://bugs.python.org/issue15125
argparse: positional arguments containing - in name not handled well

Discussion on whether positionals 'dest' should translate '-' to '_'.
msg244791 - (view) Author: py.user (py.user) * Date: 2015-06-03 22:42
paul j3 wrote:
> It's an attempt to turn such flags into valid variable names.

Looking at the code, I see that the author wanted to make the result convenient to use in the Namespace.

parser = argparse.ArgumentParser()
_ = parser.add_argument('--a-b-c', action='store_true')
args = parser.parse_args(['--a-b-c'])
abc = args.a_b_c

Without the conversion, the attribute could only be read with getattr().

It's not a UNIX reason.
msg244793 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2015-06-03 23:52
Yes, the '_' makes it accessible as an attribute name.  But the presence of '-' in the option name has a UNIX history.  That is, a flag like '--a-b-c' is typical; '--a_b_c' is not.

There is less precedent for a flag like '@@a@b' or '--a@b'.

Here's the relevant code from the '_ActionContainer' class.

    def _get_optional_kwargs(self, *args, **kwargs):
        # determine short and long option strings
        ...
        for option_string in args:
            # error on strings that don't start with an appropriate prefix
            if not option_string[0] in self.prefix_chars:
                ...
                raise ValueError(msg % args)

            # strings starting with two prefix characters are long options
            option_strings.append(option_string)
            if option_string[0] in self.prefix_chars:
                if len(option_string) > 1:
                    if option_string[1] in self.prefix_chars:
                        long_option_strings.append(option_string)

        # infer destination, '--foo-bar' -> 'foo_bar' and '-x' -> 'x'
        dest = kwargs.pop('dest', None)
        if dest is None:
            if long_option_strings:
                dest_option_string = long_option_strings[0]
            else:
                dest_option_string = option_strings[0]
            dest = dest_option_string.lstrip(self.prefix_chars)
            if not dest:
                msg = _('dest= is required for options like %r')
                raise ValueError(msg % option_string)
            dest = dest.replace('-', '_')

Even if you need to have oddball characters in the option flag, you don't have to settle for them in the 'dest'.  You can always give the argument a nice-looking 'dest'.

That's a rather common pattern in 'argparse'.  Provide a default handling for common cases, and provide parameters that let the user override those defaults.  The net effect is to limit the complexity of the code, while increasing the complexity of the documentation.
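
For example, a short sketch of overriding the inferred 'dest' (the flag names and dest names here are made up for illustration):

```python
import argparse

parser = argparse.ArgumentParser()
# The flags keep their unusual spelling on the command line,
# while dest= supplies an identifier for the namespace.
parser.add_argument("--a@b", dest="a_b")
parser.add_argument("-@", dest="at_flag", action="store_true")

args = parser.parse_args(["--a@b", "value", "-@"])
print(args.a_b, args.at_flag)
```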
History
Date                 User              Action  Args
2019-07-11 11:43:33  xtreak            set     files: - 126.pdf
2019-07-11 11:39:04  tomplatz          set     files: + 126.pdf
2015-06-03 23:52:22  paul.j3           set     messages: + msg244793
2015-06-03 22:42:13  py.user           set     messages: + msg244791
2015-06-03 21:15:00  paul.j3           set     messages: + msg244789
2015-06-03 17:13:47  paul.j3           set     nosy: + paul.j3; messages: + msg244777
2015-05-31 23:43:43  mbussonn          set     messages: + msg244565
2015-05-31 23:30:38  mbussonn          set     nosy: + mbussonn; messages: + msg244562
2015-05-31 23:28:11  py.user           set     status: pending -> open; messages: + msg244561
2015-05-31 14:51:32  serhiy.storchaka  set     status: open -> pending; nosy: + serhiy.storchaka; messages: + msg244542; resolution: rejected
2015-05-31 11:01:36  py.user           create