Author Daniel.U..Thibault
Recipients Daniel.U..Thibault, docs@python, georg.brandl, r.david.murray
Date 2014-03-20.20:00:38
Message-id <1395345638.14.0.0710363520699.issue20686@psf.upfronthosting.co.za>
In-reply-to
Content
>>> mystring="äöü"
>>> myustring=u"äöü"

>>> mystring
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> myustring
u'\xe4\xf6\xfc'

>>> str(mystring)
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> str(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

>>> f = open('workfile', 'w')
>>> f.write(mystring)
>>> f.close()
>>> f = open('workufile', 'w')
>>> f.write(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> f.close()

workfile contains the bytes C3 A4 C3 B6 C3 BC (the UTF-8 encoding of äöü).
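
For comparison, here is a minimal sketch (not part of the session above) of what the two conversions amount to. Encoding the same three characters explicitly as UTF-8 should give exactly the bytes found in workfile, while the ASCII codec rejects them. The names text, hex_dump and ascii_failed are illustrative only, and the snippet is written so that it should run on both Python 2.7 and Python 3.

```python
import binascii

# The three characters from the session, written with escapes so the
# snippet does not depend on the source or terminal encoding.
text = u"\xe4\xf6\xfc"

# Explicit UTF-8 encoding gives the byte sequence stored in workfile.
utf8_bytes = text.encode("utf-8")
hex_dump = binascii.hexlify(utf8_bytes).decode("ascii").upper()
print(hex_dump)  # C3A4C3B6C3BC

# The ASCII codec cannot represent these characters, which is the same
# failure that str(myustring) and f.write(myustring) run into.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)  # True
```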

So writing the Unicode string (myustring) to a file does indeed attempt an implicit conversion to ASCII, whereas simply printing it does not.

It seems really strange that non-Unicode strings (mystring) should actually be more flexible than Unicode strings...
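
One possible workaround, sketched here on the assumption that UTF-8 is the desired file encoding, is to open the file through io.open with an explicit encoding, so that no implicit ASCII conversion is attempted. io.open exists in Python 2.6 and later and is the same function as the built-in open in Python 3. The file name workufile comes from the session. Writing it into a temporary directory is an assumption made here so the sketch leaves existing files alone.

```python
import io
import os
import tempfile

myustring = u"\xe4\xf6\xfc"  # same text as u"äöü" in the session

# Write into a temporary directory (an assumption made here so the
# sketch does not touch files in the current directory).
path = os.path.join(tempfile.mkdtemp(), "workufile")

# io.open takes an explicit encoding, so the text is encoded as UTF-8
# rather than with the default ASCII codec.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(myustring)

# Read the raw bytes back to check what ended up on disk.
with io.open(path, "rb") as f:
    raw = f.read()

print(raw == b"\xc3\xa4\xc3\xb6\xc3\xbc")  # True
```

With this approach the file should end up holding the same six bytes as workfile, so the Unicode string can be written without first converting it by hand.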
History
Date                 User                Action  Args
2014-03-20 20:00:38  Daniel.U..Thibault  set     recipients: + Daniel.U..Thibault, georg.brandl, r.david.murray, docs@python
2014-03-20 20:00:38  Daniel.U..Thibault  set     messageid: <1395345638.14.0.0710363520699.issue20686@psf.upfronthosting.co.za>
2014-03-20 20:00:38  Daniel.U..Thibault  link    issue20686 messages
2014-03-20 20:00:38  Daniel.U..Thibault  create