Author eryksun
Recipients eryksun, ezio.melotti, paul.moore, serhiy.storchaka, steve.dower, tim.golden, vidartf, vstinner, zach.ware
Date 2016-01-06.15:54:55
Message-id <1452095695.69.0.114494700321.issue26024@psf.upfronthosting.co.za>
In-reply-to
Content
PyLocale_setlocale in Modules/_localemodule.c incorrectly passes the locale name to the CRT as a UTF-8 string (it parses the argument with the "z" format unit) instead of encoding it with the codepage of the current locale.

As you can see below, "å" is passed as the UTF-8 byte sequence "\xc3\xa5":

    >>> locale._setlocale(locale.LC_TIME, 'Norwegian Bokmål_Norway.1252')
    Breakpoint 0 hit
    MSVCR100!setlocale:
    00000000`56d23d14 48895c2408      mov     qword ptr [rsp+8],rbx
                                              ss:00000000`004af800=
                                              0000000002ad2a68
    0:000> db @rdx l0n29
    00000000`02808910  4e 6f 72 77 65 67 69 61-
                       6e 20 42 6f 6b 6d c3 a5  Norwegian Bokm..
    00000000`02808920  6c 5f 4e 6f 72 77 61 79-
                       2e 31 32 35 32           l_Norway.1252
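
To make the dump easier to read, here is a small sketch that decodes the 29 bytes shown above. The variable names are only illustrative. It shows that the buffer round-trips as UTF-8, and what a reader that assumes codepage 1252 would see instead:

```python
# Bytes copied from the debugger dump above (29 bytes at @rdx).
dumped = bytes.fromhex(
    "4e 6f 72 77 65 67 69 61 6e 20 42 6f 6b 6d c3 a5 "
    "6c 5f 4e 6f 72 77 61 79 2e 31 32 35 32"
)

# Decoding as UTF-8 recovers the name that was typed at the prompt.
as_utf8 = dumped.decode("utf-8")

# Decoding the same bytes as codepage 1252 turns the two-byte
# sequence c3 a5 into two separate characters instead of "å".
as_cp1252 = dumped.decode("cp1252")

print(len(dumped))
print(ascii(as_utf8))
print(ascii(as_cp1252))
```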

The CRT's setlocale works fine when passed the locale string encoded with codepage 1252:

    >>> import ctypes, locale
    >>> msvcr100 = ctypes.CDLL('msvcr100')
    >>> msvcr100.setlocale.restype = ctypes.c_char_p
    >>> loc_no = 'Norwegian Bokmål_Norway.1252'.encode('1252')
    >>> msvcr100.setlocale(locale.LC_TIME, loc_no)
    b'Norwegian Bokm\xe5l_Norway.1252'
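
As a rough illustration of the encoding step used in the session above, the following sketch picks the codec from the numeric codepage suffix of the locale name. The helper `encode_locale_name` is hypothetical and is not part of the locale module or a proposed patch. Falling back to `locale.getpreferredencoding(False)` when there is no numeric suffix is also just an assumption for the example:

```python
import locale


def encode_locale_name(name):
    """Hypothetical helper: encode a locale name for a bytes-based setlocale.

    Assumes that a trailing ".<digits>" names the Windows codepage.
    """
    _, dot, suffix = name.rpartition(".")
    if dot and suffix.isdigit():
        # e.g. ".1252" selects Python's "cp1252" codec
        encoding = "cp" + suffix
    else:
        # Assumption for this sketch only: use the preferred encoding.
        encoding = locale.getpreferredencoding(False)
    return name.encode(encoding)


loc_no = encode_locale_name("Norwegian Bokmål_Norway.1252")
print(loc_no)
```

The resulting bytes match the string passed to the CRT's setlocale in the ctypes session above.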