Message243660
This solution no longer works. On a system configured to use the Japanese system locale and language pack, Python 3.4.3 returns codepage 932 mojibake for the "%Z" time zone name. Originally [this approach worked][1] because it called PyUnicode_Decode using the 'mbcs' encoding.
Currently it calls PyUnicode_DecodeLocaleAndSize, which just ends up calling mbstowcs. That's pretty much what wcsftime does. In the default C locale, mbstowcs simply casts each byte value to wchar_t, so the ANSI-encoded name comes back looking like Latin-1:
>>> time.strftime('%Z')
'\x91\xbe\x95\xbd\x97m\x89\xc4\x8e\x9e\x8a\xd4'
>>> time.strftime('%Z').encode('latin-1').decode('932')
'太平洋夏時間'
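The session above can be reproduced off Windows with a minimal sketch. It assumes that the C-locale mbstowcs behaves like a byte-for-byte widening (equivalent to a Latin-1 decode) and that Python's 'cp932' codec stands in for the Windows ANSI codepage; `c_locale_widen` is a hypothetical helper, not a CPython or CRT function.

```python
# Sketch: emulate mbstowcs in the "C" locale, where each byte is widened
# to a wchar_t with the same numeric value (i.e. a Latin-1 decode).
TZ_NAME = '太平洋夏時間'  # time zone name taken from the session above


def c_locale_widen(raw: bytes) -> str:
    """Hypothetical stand-in for the C-locale mbstowcs byte-to-wchar_t cast."""
    return ''.join(chr(b) for b in raw)


# The ANSI (codepage 932) bytes that the CRT holds for the time zone name.
ansi_bytes = TZ_NAME.encode('cp932')

# Widening those bytes one by one yields the mojibake string.
mojibake = c_locale_widen(ansi_bytes)

# As long as no information has been lost, the cast can be undone by
# re-encoding as Latin-1 and decoding with the real codepage.
recovered = mojibake.encode('latin-1').decode('cp932')

print(ascii(mojibake))
print(recovered == TZ_NAME)
```

This recovery only works because the widened string still carries every original byte value.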
The problem is worse for 3.5 built with VC++ 14. In the new CRT, strftime decodes the format string via MultiByteToWideChar, calls _Wcsftime_l, and encodes the result back via WideCharToMultiByte. The outer conversions use the default LC_TIME codepage, which is ANSI (ACP), so they're not the problem. The problem is the internal _mbstowcs_s_l conversion of the ANSI time zone name, which creates the mojibake 'unicode' string shown above. This is then compounded by calling WideCharToMultiByte on the result:
>>> time.strftime('%Z')
'?????m?A???O'
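There's no way to fix this by transcoding. The final WideCharToMultiByte call has already replaced most of the characters with '?', so the original byte values are lost. The result is just garbage.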
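Why the second conversion is unrecoverable can be illustrated with a rough sketch. It assumes Python's 'cp932' codec with `errors='replace'` as an approximation of WideCharToMultiByte; the real Windows call also applies best-fit mappings (which would explain the 'A' and 'O' in the output above), so the exact bytes differ. `lossy_ansi_encode` is a hypothetical helper name.

```python
# Sketch: encode the mojibake string back to the ANSI codepage the way the
# new CRT does after _Wcsftime_l, replacing anything that cannot be encoded.
mojibake = '\x91\xbe\x95\xbd\x97m\x89\xc4\x8e\x9e\x8a\xd4'


def lossy_ansi_encode(text: str, codepage: str = 'cp932') -> bytes:
    """Hypothetical approximation of WideCharToMultiByte with '?' fallback."""
    return text.encode(codepage, errors='replace')


# The bytes we would need in order to decode the name correctly.
original = mojibake.encode('latin-1')

# The bytes that actually come out of the lossy conversion.
lossy = lossy_ansi_encode(mojibake)

print(lossy)
print(lossy == original)  # the original byte values are gone
```

Once the unencodable characters have collapsed to '?', no choice of codec can map them back to the original bytes.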
[1]: https://hg.python.org/cpython/file/79e60977fc04/Modules/timemodule.c#l501

Date | User | Action | Args
2015-05-20 13:30:40 | eryksun | set | recipients: + eryksun, belopolsky, pitrou, vstinner, ocean-city, brian.curtin, python-dev
2015-05-20 13:30:40 | eryksun | set | messageid: <1432128640.8.0.377125320908.issue10653@psf.upfronthosting.co.za>
2015-05-20 13:30:40 | eryksun | link | issue10653 messages
2015-05-20 13:30:40 | eryksun | create |