
classification
Title: optparse and non-ascii help strings
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: akuchling, ash, r.david.murray
Priority: normal Keywords:

Created on 2008-11-14 00:39 by akuchling, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg75847 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-11-14 00:39
(copied from the Optik bug tracker)

Related bug:
http://www.mail-archive.com/python-bugs-list@python.org/msg07227.html

Hi all,

It seems to me that the workaround to the above bug in optparse.py version
1.5.3 introduces a new bug when help strings are byte strings (as opposed
to unicode) containing non-ascii characters. Consider the following
script:

$ cat test.py
#!/usr/bin/env python
# -*- coding:latin-1 -*-

import optparse
parser = optparse.OptionParser()
parser.add_option("--test",help="This does not work: é")
parser.parse_args()

When called with the help option (for example "$ ./test.py -h"), this
script fails with the following traceback:

$ ./test.py -h
Traceback (most recent call last):
  File "./test.py", line 7, in <module>
    parser.parse_args()
  File "/usr/lib/python2.5/optparse.py", line 1385, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1429, in _process_args
    self._process_short_opts(rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1536, in _process_short_opts
    option.process(opt, value, values, self)
  File "/usr/lib/python2.5/optparse.py", line 782, in process
    self.action, self.dest, opt, value, values, parser)
  File "/usr/lib/python2.5/optparse.py", line 804, in take_action
    parser.print_help()
  File "/usr/lib/python2.5/optparse.py", line 1655, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 117: ordinal not in range(128)

This behaviour can be reproduced with utf-8 encoded strings as well.

If I understand correctly, line 1655 of optparse.py only works if
format_help() returns an ascii byte string or a unicode string; the
call to encode() fails when it returns a byte string containing non-ascii
characters.
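
To illustrate the failure mode: in Python 2, calling .encode() on a byte string first decodes it implicitly with the ASCII codec, and that hidden step is what raises UnicodeDecodeError even though "replace" was requested for the encode. Below is a minimal sketch, written for Python 3, that imitates this behaviour. The helper py2_style_encode is hypothetical and exists only for this illustration; it is not part of optparse.

```python
def py2_style_encode(value, encoding, errors="strict"):
    """Imitate Python 2's str.encode() (hypothetical helper, illustration only).

    In Python 2, a byte string was implicitly decoded as ASCII before
    being encoded; the explicit decode below stands in for that step.
    """
    if isinstance(value, bytes):
        # The implicit ASCII decode ignores the caller's `errors` argument,
        # so "replace" does not protect against non-ASCII bytes here.
        value = value.decode("ascii")
    return value.encode(encoding, errors)


# The help text from the report, as latin-1 bytes (what a Python 2
# byte-string literal would hold in a latin-1 source file).
latin1_help = "This does not work: é".encode("latin-1")

try:
    py2_style_encode(latin1_help, "utf-8", "replace")
    failed = False
    message = ""
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)

# A unicode string goes through the same call without trouble.
unicode_result = py2_style_encode("This does not work: é", "ascii", "replace")
```

Under these assumptions the byte-string path raises the same kind of error as in the traceback, while the unicode path simply substitutes "?" for the character that cannot be encoded.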

I think this is either a bug and should be fixed, or very misleading (and
should be fixed too :).

I hope to have helped even a little.
Thanks for optparse, and keep up the good work!

Cheers,
Antoine
msg90377 - (view) Author: Alexey Shamrin (ash) Date: 2009-07-10 06:44
There's nothing to fix here, I think... There's no point in allowing
arbitrary byte strings in help strings. Especially because Python 3
features unicode strings by default.

IMHO, this issue should be closed. And see also #2931 for remaining i18n
problems with optparse.
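
For comparison, a small sketch of the unicode-only approach under Python 3, where string literals are unicode by default. It assumes optparse is still importable (it remains in the Python 3 standard library), and the prog name "test.py" is only chosen to match the script in the report.

```python
import optparse

# Help strings are unicode (str) by default in Python 3.
parser = optparse.OptionParser(prog="test.py")
parser.add_option("--test", help="This works: é")

# format_help() returns text; encoding it is then an explicit,
# separate step with no implicit ASCII decode in between.
help_text = parser.format_help()
encoded = help_text.encode("ascii", "replace")
```

In Python 2 the equivalent would be to pass a unicode literal (u"...") as the help string rather than a byte string.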
msg115140 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-27 23:01
While I think there is indeed an expectations bug here, I also think it is no longer worth fixing.  argparse is the New Way, as is 3.x and its Strings.  Closing as wont fix.
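
A rough argparse equivalent of the script from the report, as a sketch for Python 3. The prog name "test.py" is an assumption made to match the original script.

```python
import argparse

# Same option as in the report, declared with argparse instead of optparse.
parser = argparse.ArgumentParser(prog="test.py")
parser.add_argument("--test", help="This works: é")

# format_help() returns the help text as a unicode string.
help_text = parser.format_help()
```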
History
Date                 User            Action  Args
2022-04-11 14:56:41  admin           set     github: 48569
2010-08-27 23:01:48  r.david.murray  set     status: open -> closed
                                             nosy: + r.david.murray
                                             messages: + msg115140
                                             resolution: wont fix
2009-07-10 06:44:18  ash             set     nosy: + ash
                                             messages: + msg90377
2008-11-14 00:39:21  akuchling       create