
Author eryksun
Recipients BreamoreBoy, amaury.forgeotdarc, belopolsky, eryksun, jcea, msmhrt, ocean-city, prikryl, vstinner
Date 2015-09-19.09:34:48
Message-id <1442655288.94.0.331617908978.issue16322@psf.upfronthosting.co.za>
Content
To decode the tzname strings, Python calls mbstowcs, which on Windows uses Latin-1 in the "C" locale. However, in this locale the tzname strings are actually encoded using the system ANSI codepage (e.g. 1250 for Central/Eastern Europe). So Python ends up decoding ANSI strings as Latin-1, which produces mojibake. For example:

    >>> s
    'Střední Evropa (běžný čas) | Střední Evropa (letní čas)'
    >>> s.encode('1250').decode('latin-1')
    'Støední Evropa (bì\x9ený èas) | Støední Evropa (letní èas)'
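
Because Latin-1 maps every byte to the code point with the same value, the wrong decode is lossless and can be reversed after the fact. Here's a minimal sketch of that round trip, assuming you already know which codepage the bytes were really in. The helper name repair_latin1_mojibake is made up for illustration and is not part of Python:

```python
def repair_latin1_mojibake(text, codepage):
    """Undo a wrong Latin-1 decode of bytes that were really in `codepage`.

    Hypothetical helper for illustration only.
    """
    # Latin-1 maps code points 0-255 to the same byte values, so encoding
    # recovers the original bytes exactly.
    raw = text.encode('latin-1')
    # Decode the recovered bytes with the codepage they were written in.
    return raw.decode(codepage)


garbled = 'Støední Evropa (bì\x9ený èas)'
fixed = repair_latin1_mojibake(garbled, 'cp1250')
# fixed == 'Střední Evropa (běžný čas)'
```

This only works on strings that really came from a Latin-1 decode. If the text contains a character above U+00FF, the encode step raises UnicodeEncodeError. On Windows the 'mbcs' codec should stand in for the system ANSI codepage, but that's an assumption about your setup and not something this sketch checks.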

You can work around the inconsistency by calling setlocale(LC_ALL, "") before anything imports the time module. This should set a locale that's not "C", in which case the codepage should be consistent. Of course, this won't help if you can't control when the time module is first imported. 
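
As a rough sketch of that workaround, the import order would look something like the following. The try/except is my own addition, in case the environment's default locale isn't usable:

```python
import locale

# Switch from the "C" locale to the user's default locale *before* the
# time module is imported for the first time in this process.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    # The default locale isn't available here; stay in the "C" locale.
    pass

import time  # deliberately imported after setlocale

tzname = time.tzname
```

If some other module has already imported time by this point, tzname was computed earlier and this ordering has no effect on it.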

Not being able to control the import order wouldn't be an issue if time.tzset were implemented on Windows. As it stands, you can at least use ctypes to call the CRT's _tzset function, which solves the problem with time.strftime('%Z'). You can also get the CRT's tzname by calling the exported __tzname function. Here's a Python 3.5 example that sets the current thread to use Russian and creates a new tzname tuple:

    import ctypes
    import locale

    kernel32 = ctypes.WinDLL('kernel32')
    ucrtbase = ctypes.CDLL('ucrtbase')

    MUI_LANGUAGE_NAME = 8
    kernel32.SetThreadPreferredUILanguages(MUI_LANGUAGE_NAME, 
                                           'ru-RU\0', None)
    locale.setlocale(locale.LC_ALL, 'ru-RU')

    # reset tzname in current locale
    ucrtbase._tzset()
    ucrtbase.__tzname.restype = ctypes.POINTER(ctypes.c_char_p * 2)
    c_tzname = ucrtbase.__tzname()[0]
    tzname = tuple(tz.decode('1251') for tz in c_tzname)

    # print Cyrillic characters to the console
    kernel32.SetConsoleOutputCP(1251)
    stdout = open(1, 'w', buffering=1, encoding='1251', closefd=False)

    >>> print(tzname, file=stdout)
    ('Время в формате UTC', 'Время в формате UTC')