Message 29709 - Python tracker

Author	drylock
Date	2006-08-29.21:16:22

Content
Python 2.5c1 (r25c1:51305, Aug 19 2006, 18:23:29)
[GCC 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)] on linux2

(Also seen in 2.4)

shlex.split does not handle unicode strings:

>>> shlex.split(u"foo")
['f\x00\x00\x00o\x00\x00\x00o\x00\x00\x00']

The shlex code, IMO, suggests that it should accept
unicode (it checks whether the argument is an instance
of basestring).
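
A possible workaround on the caller's side, as a minimal sketch: encode the unicode string to ASCII before shlex sees it. The helper name `split_ascii` is hypothetical and not part of shlex, and the version check is only there so the same snippet also runs on interpreters where text strings are accepted directly.

```python
import shlex
import sys


def split_ascii(command):
    """Split a command line, encoding unicode input to ASCII first on Python 2.

    Hypothetical helper for illustration; not part of the shlex module.
    """
    # On Python 2, `str` is the byte-string type, so a unicode object fails
    # this isinstance check and is encoded explicitly. Non-ASCII input raises
    # UnicodeEncodeError instead of being silently mangled.
    if sys.version_info[0] == 2 and not isinstance(command, str):
        command = command.encode("ascii")
    return shlex.split(command)


print(split_ascii(u"foo"))
```

This only sidesteps the problem for callers that know their input is ASCII; it does not change what cStringIO does with a unicode argument.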

Digging slightly into this, it seems to come down to a
difference between StringIO and cStringIO. While
cStringIO claims to accept unicode as long as it
encodes to ASCII, it gives invalid results:

>>> sys.getdefaultencoding()
'ascii'


>>> cStringIO.StringIO(u'foo').getvalue()
'f\x00\x00\x00o\x00\x00\x00o\x00\x00\x00'

Perhaps cStringIO should encode unicode input to ASCII
before consuming it, as I can't imagine anyone
cares about the above result (which I guess is the raw
UCS-2 or UCS-4 character data).
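
The UCS-2 versus UCS-4 guess can be checked by comparing the reported bytes against explicit encodings of the same string. This is only a sketch: the variable names are illustrative, and it assumes little-endian byte order, which is what the `\x00` padding after each character suggests.

```python
# Bytes reported by cStringIO.StringIO(u'foo').getvalue() in the session above.
reported = b'f\x00\x00\x00o\x00\x00\x00o\x00\x00\x00'

# Four bytes per character, little-endian (UCS-4 / UTF-32-LE).
as_ucs4 = u"foo".encode("utf-32-le")

# Two bytes per character, little-endian (UCS-2 / UTF-16-LE for these characters).
as_ucs2 = u"foo".encode("utf-16-le")

print(as_ucs4 == reported)  # True
print(as_ucs2 == reported)  # False
```

So the output matches a four-byte-per-character layout rather than a two-byte one.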

History
Date	User	Action	Args
2007-08-23 14:42:26	admin	link	issue1548891 messages
2007-08-23 14:42:26	admin	create