Author pitrou
Recipients jcea, loewis, pitrou, trent
Date 2012-10-17.13:03:19
Message-id <1350479000.11.0.87429331219.issue16258@psf.upfronthosting.co.za>
In-reply-to
Content
With the system Python on s10:

Python 2.6.8 (unknown, Apr 13 2012, 17:08:12) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.strxfrm('a')
'a'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'
>>> locale.strxfrm('a').decode('utf-8')
u'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'

The difference between Python 2 and Python 3 is that Python 3 uses wcsxfrm rather than strxfrm. Apparently Solaris' wcsxfrm is broken: it returns the same bytes as strxfrm, merely cast to a wchar_t *, hence the character U+101010e (corresponding to the leading '\x01\x01\x01\x0e' bytes of the bytestring above).
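
To illustrate the reinterpretation described above, here is a minimal sketch that reads the first four bytes of the strxfrm output as a single 32-bit value. It assumes a 4-byte, big-endian wchar_t (as would be the case on SPARC); that assumption and the variable names are mine and are not taken from the Solaris sources.

```python
import struct

# First four bytes of the strxfrm('a') result shown in the session above.
raw = b'\x01\x01\x01\x0e'

# Assumption: wchar_t is 4 bytes wide and big-endian, so the four bytes
# are read as one unsigned 32-bit integer.
(code_point,) = struct.unpack('>I', raw)

# Highest valid Unicode code point.
MAX_UNICODE = 0x10FFFF

print(hex(code_point))
print(code_point > MAX_UNICODE)
```

Under that assumption the value is 0x101010e, which is above the highest valid code point U+10FFFF, so it cannot be a genuine character.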
History
Date User Action Args
2012-10-17 13:03:20 pitrou set recipients: + pitrou, loewis, jcea, trent
2012-10-17 13:03:20 pitrou set messageid: <1350479000.11.0.87429331219.issue16258@psf.upfronthosting.co.za>
2012-10-17 13:03:20 pitrou link issue16258 messages
2012-10-17 13:03:19 pitrou create