Message217095
To understand why, note that a byte string has no inherent encoding. So when you call b'utf8string'.decode('unicode_escape'), Python has no way to know that the non-ASCII bytes in that byte string are UTF-8; the codec interprets them as Latin-1. If you want the unicode_escape representation of something, you want to do 'string'.encode('unicode_escape'). If you then want that as a Python string, you can do:
'mystring'.encode('unicode_escape').decode('ascii')
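A small sketch of the difference between the two directions, for what it's worth. The sample text 'ä' and the variable names are only illustrative:

```python
text = 'ä'

# Wrong direction: decoding UTF-8 bytes with unicode_escape.
# The codec does not know the bytes are UTF-8 and reads each
# non-ASCII byte as a Latin-1 character, producing mojibake.
utf8_bytes = text.encode('utf-8')              # two bytes: 0xC3 0xA4
garbled = utf8_bytes.decode('unicode_escape')  # two characters, not 'ä'

# Right direction: encode the str with unicode_escape, which gives
# ASCII-only bytes, then decode those bytes as ASCII to get a str.
escaped = text.encode('unicode_escape').decode('ascii')

print(repr(garbled))
print(repr(escaped))
```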
In theory there ought to be a way to use the codecs module to go directly from unicode string to unicode-escaped string, but I don't know how to do it, since the proposal for the 'transform' method was rejected :)
Just to bend your brain a bit further, note that this does work:
>>> import codecs
>>> codecs.decode(codecs.encode('ä', 'unicode-escape').decode('ascii'), 'unicode-escape')
'ä'
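If you need the round trip more than once, it could be wrapped in a pair of helpers along these lines. The names escape and unescape are made up for this sketch, and unescape assumes its input is ASCII-only text such as the output of escape:

```python
def escape(s):
    # str -> unicode-escaped str (ASCII-only result)
    return s.encode('unicode_escape').decode('ascii')


def unescape(s):
    # unicode-escaped str -> original str.
    # Encoding as ASCII first raises UnicodeEncodeError if the input
    # contains non-ASCII characters, rather than silently garbling them.
    return s.encode('ascii').decode('unicode_escape')


original = 'ä€\n'
roundtrip = unescape(escape(original))
print(roundtrip == original)
```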
Date | User | Action | Args
2014-04-23 22:17:11 | r.david.murray | set | recipients: + r.david.murray, lemburg, ncoghlan, vstinner, ezio.melotti, Sworddragon
2014-04-23 22:17:11 | r.david.murray | set | messageid: <1398291431.63.0.582489831312.issue21331@psf.upfronthosting.co.za>
2014-04-23 22:17:11 | r.david.murray | link | issue21331 messages
2014-04-23 22:17:11 | r.david.murray | create |