Author eryksun
Recipients eryksun, fomcl@yahoo.com, r.david.murray
Date 2015-02-14.01:06:22
Message-id <1423875983.03.0.00767022442293.issue23425@psf.upfronthosting.co.za>
Content
> -setlocale should return nothing. It's a setter
> -getlocale should return a platform-specific locale specification,
> probably what is currently returned by setlocale. The output 
> should be ready for consumption by setlocale.

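These functions are well documented, so there's little point in discussing major changes to the API. Per the docs, getlocale should return a (language code, encoding) tuple whose language code follows RFC 1766. If you want the raw platform result, call setlocale with only the category, which queries the current setting without changing it: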

    def getrawlocale(category=locale.LC_CTYPE):
        return locale.setlocale(category)

    >>> locale.setlocale(locale.LC_CTYPE, 'eng')   
    'English_United Kingdom.1252'
    >>> getrawlocale()                          
    'English_United Kingdom.1252'

    >>> # the new CRT supports RFC1766
    ... locale.setlocale(locale.LC_CTYPE, 'en-GB')
    'en-GB'
    >>> getrawlocale()                            
    'en-GB'

As I mentioned in issue 20088, the locale_alias dict is based on X11's locale.alias file. It doesn't handle most Windows locale strings of the form language_country.codepage. 
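
To make the gap concrete, here is a small check against the alias table. The helper name alias_lookup is made up for illustration, and it only mimics the language-name lookup that locale.normalize() performs, not the whole function:

```python
import locale

def alias_lookup(name):
    """Hypothetical helper: look up the language part of a locale
    string in locale.locale_alias (keys are lowercase)."""
    langname = name.partition('.')[0].lower()
    return locale.locale_alias.get(langname)

# An X11-style name is expected to be found; the Windows-style
# 'language_country.codepage' string is expected to miss.
for name in ('en_GB', 'English_United Kingdom.1252'):
    print(repr(name), '->', alias_lookup(name))
```

When the lookup misses, the string passes through unchanged and ends up being split on the dot as if it were already a language code and an encoding.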

On Windows, the _locale extension module could enumerate the system locales at startup to build a mapping. Here's a rough prototype using ctypes (requires Vista or later for the new locale functions):

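    import locale
    import ctypes
    from ctypes import wintypes

    # Flag for EnumSystemLocalesEx: enumerate the locales installed with Windows.
    LOCALE_WINDOWS = 1
    # LCTYPE constants for GetLocaleInfoEx.
    LOCALE_SENGLISHLANGUAGENAME = 0x1001
    LOCALE_SENGLISHCOUNTRYNAME = 0x1002
    LOCALE_IDEFAULTANSICODEPAGE = 0x1004
    LCTYPES = (LOCALE_SENGLISHLANGUAGENAME,
               LOCALE_SENGLISHCOUNTRYNAME,
               LOCALE_IDEFAULTANSICODEPAGE)

    kernel32 = ctypes.WinDLL('kernel32')
    EnumSystemLocalesEx = kernel32.EnumSystemLocalesEx
    GetLocaleInfoEx = kernel32.GetLocaleInfoEx

    EnumLocalesProcEx = ctypes.WINFUNCTYPE(wintypes.BOOL,
                                           wintypes.LPWSTR,
                                           wintypes.DWORD,
                                           wintypes.LPARAM)

    def enum_system_locales():
        alias = {}     # 'Language_Country' -> 'll_CC'
        codepage = {}  # 'll_CC' -> 'cpNNNN'
        errors = []
        info = (wintypes.WCHAR * 100)()

        # The parameter is called name so that it doesn't shadow the
        # locale module.
        @EnumLocalesProcEx
        def callback(name, flags, param):
            # Skip names without a '-', e.g. language-only locales.
            if '-' not in name:
                return True
            parts = []
            for lctype in LCTYPES:
                if not GetLocaleInfoEx(name,
                                       lctype,
                                       info, len(info)):
                    # ctypes doesn't propagate exceptions raised in a
                    # callback, so record the error and stop enumerating.
                    errors.append(ctypes.WinError())
                    return False
                parts.append(info.value)
            lang, ctry, code = parts
            # A code page of '0' means the locale has no ANSI code page.
            if lang and ctry and code != '0':
                name = name.replace('-', '_')
                full = '{}_{}'.format(lang, ctry)
                alias[full] = name
                codepage[name] = 'cp' + code
            return True

        ok = EnumSystemLocalesEx(callback,
                                 LOCALE_WINDOWS,
                                 None, None)
        if errors:
            raise errors[0]
        if not ok:
            raise ctypes.WinError()
        return alias, codepage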


    >>> alias, codepage = enum_system_locales()

    >>> alias["English_United Kingdom"]
    'en_GB'
    >>> codepage['en_GB']              
    'cp1252'
    >>> alias["Spanish_United States"] 
    'es_US'
    >>> codepage['es_US']             
    'cp1252'
    >>> alias["Russian_Russia"]
    'ru_RU'
    >>> codepage['ru_RU']
    'cp1251'
    >>> alias["Chinese (Simplified)_People's Republic of China"]
    'zh_CN'
    >>> codepage['zh_CN']
    'cp936'
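
As a rough sketch of how such a mapping could be consumed, the following turns a CRT-style locale string into a (language code, encoding) pair. The function name parse_windows_locale and the hard-coded sample tables are assumptions for illustration only (the entries are copied from the output above); nothing like this exists in the locale module:

```python
# Sample entries copied from the enum_system_locales() output above.
# On a real system the two dicts would come from that function.
ALIAS = {
    "English_United Kingdom": "en_GB",
    "Russian_Russia": "ru_RU",
}
CODEPAGE = {"en_GB": "cp1252", "ru_RU": "cp1251"}


def parse_windows_locale(name, alias=ALIAS, codepage=CODEPAGE):
    """Hypothetical helper: map 'Language_Country[.codepage]' to
    (language code, encoding), or (None, None) if the name is unknown."""
    full, sep, cp = name.partition('.')
    code = alias.get(full)
    if code is None:
        return None, None
    # Prefer an explicit numeric code page; otherwise fall back to the
    # locale's default ANSI code page from the enumerated table.
    if sep and cp.isdigit():
        encoding = 'cp' + cp
    else:
        encoding = codepage.get(code)
    return code, encoding


print(parse_windows_locale('English_United Kingdom.1252'))
print(parse_windows_locale('Russian_Russia'))
```

An explicit code page in the string takes precedence because setlocale accepts a code page other than the locale's default.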