
Author Santiago.Romero
Recipients Santiago.Romero, belopolsky, benjamin.peterson, cgwalters, cvrebert, dexen, doughellmann, eric.araujo, fperez, loewis, mark.dickinson, mcepl, nwerneck, r.david.murray, rhettinger, vstinner
Date 2011-07-13.07:51:47
Message-id <1310543508.39.0.0494174695906.issue1170@psf.upfronthosting.co.za>
Content
I think I'm hitting the same problem in some small programs that use shlex:


>>> import shlex

>>> text = "python and shlex"
>>> shlex.split(text)
['python', 'and', 'shlex']

>>> text = u"python and shlex"
>>> shlex.split(text)
['p\x00\x00\x00y\x00\x00\x00t\x00\x00\x00h\x00\x00\x00o\x00\x00\x00n\x00\x00\x00', '\x00\x00\x00a\x00\x00\x00n\x00\x00\x00d\x00\x00\x00', '\x00\x00\x00s\x00\x00\x00h\x00\x00\x00l\x00\x00\x00e\x00\x00\x00x\x00\x00\x00']


I'm currently using the following "basic" workaround (it assumes that my strings contain only ASCII characters):

>>> [ x.replace("\0", "") for x in shlex.split(text) ]
['python', 'and', 'shlex']
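
For strings that may contain non-ASCII characters, stripping NUL bytes is not enough. A commonly used alternative is to encode the text to bytes before calling shlex.split and decode each token afterwards. The following is only a minimal sketch of that idea: the helper name `split_unicode` and the choice of UTF-8 are assumptions, not part of shlex, and on Python 3 (where shlex accepts str directly) the helper simply passes the text through.

```python
import shlex
import sys


def split_unicode(text, encoding="utf-8"):
    """Split a unicode string with shlex (hypothetical helper, not a shlex API)."""
    if sys.version_info[0] >= 3:
        # Python 3: shlex works on str, so no round trip is needed.
        return shlex.split(text)
    # Python 2: hand shlex a byte string, then decode every token back.
    tokens = shlex.split(text.encode(encoding))
    return [token.decode(encoding) for token in tokens]


print(split_unicode(u"python and shlex"))
```

Unlike the NUL-stripping workaround, this does not depend on the strings being ASCII-only, as long as the chosen encoding can represent the text.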

It would be very nice if shlex could work with Unicode strings.

 Thanks.
History
Date User Action Args
2011-07-13 07:51:48 Santiago.Romero set recipients: + Santiago.Romero, loewis, rhettinger, mark.dickinson, belopolsky, vstinner, dexen, benjamin.peterson, cgwalters, mcepl, eric.araujo, doughellmann, r.david.murray, nwerneck, cvrebert, fperez
2011-07-13 07:51:48 Santiago.Romero set messageid: <1310543508.39.0.0494174695906.issue1170@psf.upfronthosting.co.za>
2011-07-13 07:51:47 Santiago.Romero link issue1170 messages
2011-07-13 07:51:47 Santiago.Romero create