
Author xtreak
Recipients xtreak
Date 2018-10-17.09:17:56
Message-id <1539767876.78.0.788709270274.issue35009@psf.upfronthosting.co.za>
In-reply-to
Content
The argparse module calls str() in a few places, so passing unicode strings in Python 2 raises UnicodeEncodeError. The same scripts run fine in Python 3, where strings are unicode by default. I am working on a fix, looking for other places where this can raise an error, and adding tests for those scenarios. I couldn't find a related issue in the bug tracker. Feel free to close this if it's a duplicate.

# foo_argparse.py with unicode choices

```
# -*- coding: utf-8 -*-

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--foo', help='foo help', choices=[u"早上好", u"早上好 早上好"])
args = parser.parse_args()

```

# printing help causes error since str is used for choices

$ ./python.exe ../backups/foo_argparse.py --help
Traceback (most recent call last):
  File "../backups/foo_argparse.py", line 5, in <module>
    parser.add_argument('--foo', help='foo help', choices=[u"早上好", u"早上好 早上好"])
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/argparse.py", line 1308, in add_argument
    self._get_formatter()._format_args(action, None)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/argparse.py", line 578, in _format_args
    get_metavar = self._metavar_formatter(action, default_metavar)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/argparse.py", line 565, in _metavar_formatter
    choice_strs = [str(choice) for choice in action.choices]
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
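
The failure above can be reproduced without argparse. The sketch below is only an illustration: `py2_str` is a hypothetical helper, not part of argparse, that mimics what Python 2's str() does to a unicode object under the default ASCII encoding, so the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-

choices = [u"早上好", u"早上好 早上好"]


def py2_str(value):
    """Hypothetical stand-in for Python 2's str() on a unicode object.

    Assumes the default encoding is ASCII, so the conversion amounts to
    an implicit ASCII encode of the text.
    """
    return value.encode("ascii").decode("ascii")


try:
    # Same shape as the list comprehension shown in the traceback.
    choice_strs = [py2_str(choice) for choice in choices]
except UnicodeEncodeError as exc:
    # The non-ASCII characters cannot be represented in ASCII.
    print("conversion failed: %s" % exc)
```

Pure-ASCII choices pass through `py2_str` unchanged, which is probably why the problem only shows up with non-ASCII choices.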


If we use unicode() instead of str() there, and also fix the place where the comparison against the choices is done, then passing a valid choice works and --help prints correctly. Passing an invalid choice still fails, because the exception message is built with str(), which raises the same error:


$ ./python.exe ../backups/foo_argparse_unicode.py --foo 早上好 # passes
$ ./python.exe ../backups/foo_argparse_unicode.py --help # passes
usage: foo_argparse_unicode.py [-h] [--foo {早上好,早上好 早上好}]

optional arguments:
  -h, --help           show this help message and exit
  --foo {早上好,早上好 早上好}  foo help


$ ./python.exe ../backups/foo_argparse_unicode.py --foo 1 # Fails
Traceback (most recent call last):
  File "../backups/foo_argparse_unicode.py", line 7, in <module>
    args = parser.parse_args()
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/argparse.py", line 1705, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/argparse.py", line 1744, in parse_known_args
    self.error(str(err))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 49-51: ordinal not in range(128)
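
The first part of the change described above (converting choices with unicode() rather than str()) might look roughly like the sketch below. It is not the actual patch: `text_type` and `format_choices` are hypothetical names, and the function only imitates how the choices appear in the usage line.

```python
# -*- coding: utf-8 -*-
import sys

# Hypothetical alias: unicode on Python 2, str on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def format_choices(choices):
    """Build the '{a,b}' choices string without going through ASCII."""
    choice_strs = [text_type(choice) for choice in choices]
    return u"{%s}" % u",".join(choice_strs)


formatted = format_choices([u"早上好", u"早上好 早上好"])
```

With the choices from the report, `formatted` holds the same `{早上好,早上好 早上好}` text shown in the help output. The error path that calls str(err) would need a similar treatment, which this sketch does not attempt.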