
Author sgala
Date 2006-08-17.14:53:10
Content

Python's behaviour in this area is confusing. Here is a
session typed on my Spanish keyboard:

>>> print "á"
á
>>> print len("á")
2
>>> print "á".upper()
á
>>> str("á")
'\xc3\xa1'
>>> print u"á"
á
>>> print len(u"á")
1
>>> print u"á".upper()
Á
>>> str(u"á")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
__builtin__.UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 0: ordinal not in range(128)
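
For anyone reading the session above, here is a minimal sketch of what seems to be going on. It is written for Python 3, where bytes and text are separate types, while the session itself is Python 2. It assumes the terminal sends UTF-8, which the '\xc3\xa1' output suggests; the variable names are only illustrative.

```python
# Assumption: the terminal encoding is UTF-8, as '\xc3\xa1' in the session suggests.
text = "\u00e1"             # one code point: LATIN SMALL LETTER A WITH ACUTE
raw = text.encode("utf-8")  # the two bytes a Python 2 plain string literal would hold

# Byte string: length counts bytes, and upper() only touches ASCII letters.
print(len(raw))                  # 2
print(raw.upper() == raw)        # True

# Text string: length counts code points, and upper() knows about accents.
print(len(text))                 # 1
print(text.upper() == "\u00c1")  # True

# Python 2's str(u"...") encodes implicitly with the ASCII codec, which cannot
# represent this character; the explicit equivalent fails the same way.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)                  # False
```

In the Python 2 session, "á" plays the role of raw and u"á" plays the role of text; str(u"á") fails because the implicit conversion uses the ASCII codec rather than the terminal's encoding.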


I guess this is what is happening to the reporter.

This violates the principle of least surprise in so many
different ways that it hurts. Can anybody make sense of it?

History
Date                 User   Action  Args
2007-08-23 14:41:34  admin  link    issue1528802 messages
2007-08-23 14:41:34  admin  create