classification
Title: python 3.4 vs. 3.5 strftime same locale different output on Windows
Type: behavior
Stage: resolved
Components: Library (Lib), Windows
Versions: Python 3.5
process
Status: closed
Resolution: third party
Dependencies:
Superseder:
Assigned To:
Nosy List: David Perra, eryksun, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2016-06-08 05:25 by David Perra, last changed 2016-06-09 14:02 by eryksun. This issue is now closed.

Messages (2)
msg267786 - Author: David Perra (David Perra) Date: 2016-06-08 05:25
Executing these commands in Python 3.4.x (Windows 10 Home)

    import locale
    from datetime import datetime
    locale.setlocale(locale.LC_ALL, 'Spanish')
    datetime.strftime(datetime.now(), '%a %d %b %Y')

renders the output

    'Spanish_Spain.1252'
    'mar 07 jun 2016'

but with Python 3.5.x the output is

    'Spanish_Spain.1252'
    'ma. 07 jun. 2016'

msg267835 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-06-08 11:29
The universal CRT that's used by 3.5+ implements locales using Windows locale names [1], which were introduced in Vista. Examples for Spanish include 'es', 'es-ES', and 'es-ES_tradnl'. The last of these is the traditional sort for Spain, which has the 3-letter abbreviations for the days of the week. The default for Spain is the modern sort, which uses 2-letter abbreviations.

The old LCID system [2] defaults to the traditional sort (0x40A), to which the old CRT maps unqualified "spanish". The modern sort (0xC0A) is available as "spanish-modern" [3]. The universal CRT still honors "spanish-modern" [4], but just "spanish" by itself is mapped instead to neutral "es", which uses the modern sort.
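
The two LCIDs above differ only in their sublanguage bits, which a short decomposition makes visible. This is a sketch that assumes the documented LCID layout (primary language in bits 0-9, sublanguage in bits 10-15, sort ID in bits 16-19); `split_lcid` is a hypothetical helper name, not part of any library:

```python
def split_lcid(lcid):
    """Split an LCID into (primary language, sublanguage, sort ID)."""
    langid = lcid & 0xFFFF          # low word is the language identifier
    primary = langid & 0x3FF        # bits 0-9: primary language
    sublang = langid >> 10          # bits 10-15: sublanguage
    sort_id = (lcid >> 16) & 0xF    # bits 16-19: sort order
    return primary, sublang, sort_id

# 0x040A is the traditional sort, 0x0C0A the modern sort (see above).
for lcid in (0x040A, 0x0C0A):
    print(hex(lcid), split_lcid(lcid))
```

Both values share primary language 0x0A (Spanish) and sort ID 0; only the sublanguage field differs (1 for 0x40A, 3 for 0xC0A).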

If you need to use the traditional form in both versions, then in 3.4 it's just "spanish", but 3.5+ requires a locale name with the sort suffix. I actually couldn't find a table on MSDN that listed the "tradnl" sort name to append to "es-ES", so I wrote a quick script to find it, assuming at least "tra" would be in the name:

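    import re
    import ctypes

    kernel32 = ctypes.WinDLL('kernel32')

    # Callback prototype for EnumSystemLocalesEx:
    # BOOL callback(LPWSTR name, DWORD flags, LPARAM param)
    LOCALE_ENUMPROCEX = ctypes.WINFUNCTYPE(
        ctypes.c_int,
        ctypes.c_wchar_p,
        ctypes.c_uint,
        ctypes.c_void_p)

    def find_locale(pattern):
        """Return the sorted system locale names matching pattern."""
        result = []
        @LOCALE_ENUMPROCEX
        def cb(name, flags, param):
            # Case-insensitive match anchored at the start of the name.
            if re.match(pattern, name, re.I):
                result.append(name)
            return True  # keep enumerating
        # flags=0 enumerates all locales; no user data is passed.
        kernel32.EnumSystemLocalesEx(cb, 0, None, None)
        result.sort()
        return result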

    >>> find_locale('es-.*TRA.*')
    ['es-ES_tradnl']

    >>> import locale, time
    >>> locale.setlocale(locale.LC_TIME, 'es-ES_tradnl')
    'es-ES_tradnl'
    >>> time.strftime('%a')
    'mié'
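
For code that has to run on both 3.4 and 3.5+, one option is to try the locale names in order and keep the first one the CRT accepts. The following is a minimal sketch, assuming the older CRT rejects 'es-ES_tradnl' with locale.Error so that the fallback to 'spanish' is reached; `set_first_available` and `CANDIDATES` are hypothetical names, and the `setter` parameter exists only so the logic can be exercised without touching the real locale:

```python
import locale

# Most specific name first: 'es-ES_tradnl' for the universal CRT (3.5+),
# then 'spanish', which the old CRT maps to the traditional sort.
CANDIDATES = ('es-ES_tradnl', 'spanish')

def set_first_available(category, candidates=CANDIDATES, setter=locale.setlocale):
    """Set the first locale name that is accepted; return its result or None."""
    for name in candidates:
        try:
            return setter(category, name)
        except locale.Error:
            continue  # name not supported by this CRT, try the next one
    return None
```

The order matters: on 3.5+ plain 'spanish' is also accepted but selects the modern sort, so it must only be tried after the name with the sort suffix.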

Note that abbreviations in Spanish generally end with a period. It's present for every country other than Spain. For example, Mexico:

    >>> locale.setlocale(locale.LC_TIME, 'spanish_mexico')
    'Spanish_Mexico.1252'
    >>> time.strftime('%a')
    'mié.'

or using the locale name (3.5+):

    >>> locale.setlocale(locale.LC_TIME, 'es-MX')
    'es-MX'
    >>> time.strftime('%a')
    'mié.'

Note that Python still doesn't support parsing locale names:

    >>> locale.getlocale(locale.LC_TIME)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Program Files\Python35\lib\locale.py", line 578, in getlocale
        return _parse_localename(localename)
      File "C:\Program Files\Python35\lib\locale.py", line 487, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: es-MX
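
Until getlocale() can parse such names, the current setting can still be read back unparsed by calling setlocale() without a locale argument, which queries instead of setting. A small sketch (the wrapper name `current_locale_string` is hypothetical):

```python
import locale

def current_locale_string(category=locale.LC_TIME):
    """Return the raw locale string for a category without parsing it."""
    # With the locale argument omitted, setlocale() only queries the
    # current setting and returns it as a string.
    return locale.setlocale(category)

print(current_locale_string())
```

This returns whatever string the C runtime reports, so the caller has to interpret it.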

Locale names don't include the encoding, so parsing them for getlocale() will require an additional call to get the locale's ANSI codepage. In case anyone wants to update the locale module to support this:

    >>> LOCALE_IDEFAULTANSICODEPAGE = 0x1004
    >>> buf = (ctypes.c_wchar * 10)()
    >>> kernel32.GetLocaleInfoEx('es-MX', LOCALE_IDEFAULTANSICODEPAGE, buf, 10)
    5
    >>> buf.value
    '1252'
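
Wrapped up as functions, the lookup above might feed a getlocale()-style (name, encoding) pair. This is only a sketch of the idea, not the locale module's implementation: `ansi_codepage` and `parse_locale_name` are hypothetical names, the 'cp' prefix and leaving the locale name untranslated are assumptions, and treating a codepage of '0' as "no ANSI codepage" is likewise an assumption about Unicode-only locales:

```python
import ctypes

LOCALE_IDEFAULTANSICODEPAGE = 0x1004

def ansi_codepage(locale_name):
    """Return a Windows locale's ANSI codepage as a string (Windows only)."""
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    buf = ctypes.create_unicode_buffer(10)
    if not kernel32.GetLocaleInfoEx(locale_name, LOCALE_IDEFAULTANSICODEPAGE,
                                    buf, len(buf)):
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.value

def parse_locale_name(name, get_codepage=ansi_codepage):
    """Return (name, encoding) for a locale name such as 'es-MX'."""
    codepage = get_codepage(name)
    # Assumption: '0' means the locale has no ANSI codepage.
    encoding = None if codepage == '0' else 'cp' + codepage
    return name, encoding
```

On Windows, parse_locale_name('es-MX') would combine the name with the '1252' result shown above; the `get_codepage` parameter allows substituting another lookup elsewhere.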

If no one has additional concerns here, I'll close this issue as 3rd party in a day or so. Inconsistencies in the locale that "spanish" maps to in different versions of the CRT are completely Microsoft's problem.

[1]: https://msdn.microsoft.com/en-us/library/dd373814
[2]: https://msdn.microsoft.com/en-us/library/dd318693
[3]: https://msdn.microsoft.com/en-us/library/39cwe7zf%28v=vs.100%29.aspx
[4]: https://msdn.microsoft.com/en-us/library/39cwe7zf.aspx
History
Date                 User         Action  Args
2016-06-09 14:02:32  eryksun      set     status: open -> closed; resolution: third party; stage: resolved
2016-06-08 11:29:10  eryksun      set     nosy: + eryksun; messages: + msg267835
2016-06-08 05:25:57  David Perra  create