classification
Title: add_option in optparse no longer accepts unicode string
Type: behavior
Stage: resolved
Components: Library (Lib), Unicode
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: cmcqueen1975, eric.araujo, ezio.melotti, l0nwlf, mjjohnson, python-dev, r.david.murray
Priority: normal
Keywords: patch

Created on 2010-07-05 05:36 by cmcqueen1975, last changed 2012-08-14 13:14 by python-dev. This issue is now closed.

Files
File name: issue9161_test.diff
Uploaded: mjjohnson, 2012-08-14 01:56
Description: test for python 2.7 r82581
Messages (11)
msg109300 - Author: Craig McQueen (cmcqueen1975) Date: 2010-07-05 05:36
Working in Japan, I find it very helpful to be able to read full Unicode arguments in Python 2.x under Windows 2000/XP. So I am using the following:

http://stackoverflow.com/questions/846850/how-to-read-unicode-characters-from-command-line-arguments-in-python-on-windows/846931#846931

Brilliantly, the optparse module in Python 2.6 has worked fine with Unicode arguments. Sadly, it seems Python 2.7 is preventing this. When I try to run my program with Python 2.7, I get the following:

  ...
  File "c:\python27\lib\optparse.py", line 1018, in add_option
    raise TypeError, "invalid arguments"
TypeError: invalid arguments

It seems that the type check in optparse.py line 1018 has changed from this in 2.6:
    if type(args[0]) in types.StringTypes:

to this in 2.7:
    if type(args[0]) is types.StringType:

This makes it more difficult to support Unicode in 2.7, compared to 2.6. Any chance this could be reverted?
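
For illustration, here is a minimal sketch of how the two checks quoted above treat a Unicode option string. The helper names `accepted_by_27_check` and `accepted_by_26_check` are hypothetical, and the Python 3 branch uses `bytes` and `str` as stand-ins because `types.StringType` and `types.StringTypes` exist only in Python 2.

```python
import sys

if sys.version_info[0] == 2:
    import types
    STRING_TYPE = types.StringType      # the byte-string type, str
    STRING_TYPES = types.StringTypes    # (str, unicode)
else:
    # Assumption: stand-ins for illustration only; Python 3 has no
    # types.StringType or types.StringTypes.
    STRING_TYPE = bytes
    STRING_TYPES = (bytes, str)


def accepted_by_27_check(arg):
    """Mirror the 2.7 form of the check: exact identity with one type."""
    return type(arg) is STRING_TYPE


def accepted_by_26_check(arg):
    """Mirror the 2.6 form of the check: membership in a tuple of types."""
    return type(arg) in STRING_TYPES


flag = u"-s"
print(accepted_by_27_check(flag))  # the Unicode string is rejected
print(accepted_by_26_check(flag))  # the Unicode string is accepted
```

With the identity check, a Unicode first argument is not recognised as an option string, and `add_option` ends up raising the `TypeError` shown in the traceback.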
msg109303 - Author: Craig McQueen (cmcqueen1975) Date: 2010-07-05 05:56
My program currently uses ASCII options, so I can change the Unicode string parameter to a byte string. The optparse module still seems to match the option against the incoming Unicode argv, presumably through implicit string conversion.
msg109310 - Author: Craig McQueen (cmcqueen1975) Date: 2010-07-05 07:12
To explain further, I had code such as:

    parser.add_option(u'-s', u'--seqfile', dest='seq_file_name', help=u'Write sequence file output to FILE', metavar=u'FILE')

I had to remove the unicode designator for the first parameter:
    parser.add_option('-s', u'--seqfile', dest='seq_file_name', help=u'Write sequence file output to FILE', metavar=u'FILE')

On further investigation, it looks as though the optparse module has other problems with Unicode. For example, if I pass a non-ASCII option on the command line:
    myprog.py -本
then optparse can't handle it: it gets an encoding error on line 1396.

What _does_ work is that an option's argument can be Unicode:
    myprog.py -s 本.txt

So I guess there are broader problems than the specific 2.6 to 2.7 change that I originally reported.
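
A minimal sketch of the workaround described above, assuming only the first argument to `add_option` has to be a native string. Wrapping it in `str()` gives a byte string on Python 2 and an ordinary string on Python 3, so the same source runs on both. The argument list passed to `parse_args` is made up for illustration, and `\u672c` is the escaped form of 本.

```python
from optparse import OptionParser

parser = OptionParser()

# Only the first option string goes through str(); the remaining
# arguments stay as Unicode literals, as in the original code.
parser.add_option(str('-s'), u'--seqfile', dest='seq_file_name',
                  help=u'Write sequence file output to FILE',
                  metavar=u'FILE')

# Hypothetical command line: myprog.py -s 本.txt
options, args = parser.parse_args([u'-s', u'\u672c.txt'])
print(repr(options.seq_file_name))
```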
msg109331 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2010-07-05 15:52
Too bad you didn't find this on one of the RCs.  The fix will have to wait for 2.7.1 now.

The line you originally quoted as changing was, as far as I can tell, the original code (it entered our repository on 2006-04-22 in r45654, when optparse was upgraded to version 1.5.1 of optik).  The change to the plural form in that line was made for 2.6rc1 in r71345, possibly inadvertently since that changeset was the bump to rc1.

It would seem reasonable to change this to match 2.6 for 2.7.1, since as far as I know no bugs have been reported against it in 2.6, and it is unlikely to break working code in 2.7.

As for the more extensive problems, now that argparse is part of the standard library it would seem better to devote any development resources to it.  Can you test your issues against argparse, and open a new issue if it also has problems?
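
As a starting point for that test, the two command lines from the report could be tried against argparse roughly as follows. This is only a sketch: the `dest` name `hon_flag` and the `store_true` action for the non-ASCII option are assumptions made for illustration, and the expectation that it parses cleanly applies to Python 3, where every string is Unicode. Behaviour under 2.7 is not verified here.

```python
import argparse

parser = argparse.ArgumentParser(prog="myprog.py")
parser.add_argument(u"-s", u"--seqfile", dest="seq_file_name",
                    metavar=u"FILE",
                    help=u"Write sequence file output to FILE")

# Hypothetical non-ASCII flag, standing in for "myprog.py -本".
# u"\u672c" is the escaped form of that character.
parser.add_argument(u"-\u672c", dest="hon_flag", action="store_true")

namespace = parser.parse_args([u"-s", u"\u672c.txt", u"-\u672c"])
print(repr(namespace.seq_file_name))
print(namespace.hon_flag)
```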
msg109334 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2010-07-05 16:07
Regression fixed in r82581.  It would be nice to have a unit test, so I'm leaving this open to see if anyone wants to contribute one (it could probably be reused for argparse if argparse doesn't already have such a test).
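
A regression test along these lines might look like the sketch below. It is not the test that was later attached to this issue. The class name, the option names, and the standalone `unittest.TestCase` layout are all hypothetical, and the real optparse test suite has its own base classes and helpers.

```python
import unittest
from optparse import OptionParser


class TestUnicodeOptionStrings(unittest.TestCase):
    """Hypothetical regression test for Unicode option strings."""

    def test_add_option_accepts_unicode(self):
        parser = OptionParser()
        # On 2.7.0 this call raised TypeError("invalid arguments").
        parser.add_option(u"-u", u"--unicode", action="store_true")
        options, args = parser.parse_args([u"-u"])
        # dest defaults to the long option name, "unicode".
        self.assertTrue(options.unicode)
        self.assertEqual(args, [])
```

Saved to a file, it can be run with `python -m unittest` followed by the module name.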
msg166623 - Author: Michael Johnson (mjjohnson) Date: 2012-07-28 01:16
Created a unit test for the patch that was committed in r82581. I can easily add the test to argparse too, if needed; I didn't see any tests related to Unicode in there.
msg168163 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2012-08-14 01:35
@Michael: Thanks for working on this.  I don't see a patch attached to the issue, though.
msg168164 - Author: Michael Johnson (mjjohnson) Date: 2012-08-14 01:56
Huh, let me try that again! I'm not sure how the attachment got dropped.
msg168166 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-08-14 02:04
New changeset 4c86a860e3d2 by R David Murray in branch '2.7':
#9161: add test for the bug fixed by r82581.
http://hg.python.org/cpython/rev/4c86a860e3d2
msg168167 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2012-08-14 02:05
Thanks!
msg168199 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-08-14 13:14
New changeset ffd70c371fee by R David Murray in branch '2.7':
#9161: Fix test to use standard optparse test pattern (what was I thinking?)
http://hg.python.org/cpython/rev/ffd70c371fee
History
Date | User | Action | Args
2012-08-14 13:14:53 | python-dev | set | messages: + msg168199
2012-08-14 02:05:21 | r.david.murray | set | status: open -> closed; resolution: fixed; messages: + msg168167; stage: test needed -> resolved
2012-08-14 02:04:51 | python-dev | set | nosy: + python-dev; messages: + msg168166
2012-08-14 01:56:04 | mjjohnson | set | files: + issue9161_test.diff; keywords: + patch; messages: + msg168164
2012-08-14 01:35:11 | r.david.murray | set | messages: + msg168163
2012-07-28 01:16:11 | mjjohnson | set | nosy: + mjjohnson; messages: + msg166623
2010-07-05 18:19:33 | l0nwlf | set | nosy: + l0nwlf
2010-07-05 16:07:55 | r.david.murray | set | messages: + msg109334
2010-07-05 15:52:49 | r.david.murray | set | nosy: + r.david.murray; messages: + msg109331
2010-07-05 11:45:58 | eric.araujo | set | type: behavior; stage: test needed
2010-07-05 11:45:39 | eric.araujo | set | nosy: + eric.araujo; type: behavior -> (no value); stage: test needed -> (no value)
2010-07-05 07:12:34 | cmcqueen1975 | set | messages: + msg109310
2010-07-05 05:56:59 | cmcqueen1975 | set | messages: + msg109303
2010-07-05 05:52:17 | ezio.melotti | set | nosy: + ezio.melotti; type: behavior; components: + Library (Lib), Unicode; stage: test needed
2010-07-05 05:36:30 | cmcqueen1975 | create |