
Author cito
Recipients cito
Date 2008-03-25.14:33:54
Message-id <1206455635.63.0.674389192114.issue2481@psf.upfronthosting.co.za>
Content
While locale.strcoll seems to work with Unicode strings, locale.strxfrm
raises a UnicodeError. Example:

###

import locale

try:
    locale.setlocale(locale.LC_ALL, 'de')
except locale.Error:  # Windows uses a different locale name
    locale.setlocale(locale.LC_ALL, 'german')

s = ['Ägypten', 'Zypern']

print sorted(s, cmp=locale.strcoll) # works
print sorted(s, key=locale.strxfrm) # works

s = [u'Ägypten', u'Zypern']

print sorted(s, cmp=locale.strcoll) # works
print sorted(s, key=locale.strxfrm) # UnicodeError

###

Therefore, it is not possible to sort lists of Unicode strings
efficiently, since only the cmp=locale.strcoll form works. If possible,
this should be fixed. If not, the problem should at least be mentioned
in the documentation. Currently, the docs do not indicate that strcoll
and strxfrm behave differently with Unicode strings.
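
For comparison, here is a minimal Python 3 sketch of the same two sorting approaches. In Python 3 every str is Unicode and locale.strxfrm accepts it directly, so this does not reproduce the error above; it only shows the key-based and comparison-based forms side by side. The helper names and the extra locale names ('de_DE.UTF-8', 'de_DE') are assumptions made for illustration, not part of the original report.

```python
import locale
from functools import cmp_to_key


def set_first_available_locale(candidates=("de_DE.UTF-8", "de_DE", "de", "german")):
    """Try each candidate name for LC_COLLATE; return the one that worked, or None.

    The candidate names are assumptions; which ones exist depends on the platform.
    If none is available, the current collation locale is left unchanged.
    """
    for name in candidates:
        try:
            locale.setlocale(locale.LC_COLLATE, name)
            return name
        except locale.Error:
            continue
    return None


def sort_with_strxfrm(words):
    """Sort using a key function: each string is transformed once."""
    return sorted(words, key=locale.strxfrm)


def sort_with_strcoll(words):
    """Sort using pairwise comparison, wrapped for Python 3's key-only sorted()."""
    return sorted(words, key=cmp_to_key(locale.strcoll))


chosen = set_first_available_locale()
words = ["Zypern", "Ägypten"]
by_key = sort_with_strxfrm(words)
by_cmp = sort_with_strcoll(words)
print(chosen, by_key, by_cmp)
```

The resulting order depends on which locale could be set, so no particular output is claimed here.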
History
Date                 User  Action  Args
2008-03-25 14:33:56  cito  set     spambayes_score: 0.106347 -> 0.10634738
                                   recipients: + cito
2008-03-25 14:33:55  cito  set     spambayes_score: 0.106347 -> 0.106347
                                   messageid: <1206455635.63.0.674389192114.issue2481@psf.upfronthosting.co.za>
2008-03-25 14:33:54  cito  link    issue2481 messages
2008-03-25 14:33:54  cito  create