classification
Title: locale.nl_langinfo() can't decode value
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: nnja Nosy List: barry, lemburg, loewis, nnja, r.david.murray, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2015-12-05 21:19 by serhiy.storchaka, last changed 2019-01-09 11:53 by vstinner.

Messages (4)
msg255979 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-12-05 21:19
>>> import locale
>>> locale.setlocale(locale.LC_NUMERIC, 'uk_UA')
'uk_UA'
>>> locale.getlocale(locale.LC_NUMERIC)
('uk_UA', 'KOI8-U')
>>> locale.nl_langinfo(locale.THOUSEP)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'locale' codec can't decode byte 0x9a in position 0: Invalid or incomplete multibyte or wide character

It looks as if locale.nl_langinfo() always decodes its result as UTF-8 (or perhaps with locale.getpreferredencoding()) instead of using the encoding of the LC_NUMERIC locale.
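
To illustrate the suspected mismatch without needing the uk_UA locale installed, here is a minimal sketch that decodes the byte from the traceback by hand. It assumes the raw THOUSEP value is the single byte 0x9a (taken from the error message) and uses Python's built-in "koi8_u" and "utf-8" codecs; the variable names are only illustrative.

```python
# The byte reported in the traceback above.
raw = b"\x9a"

# Decoded with the encoding of the LC_NUMERIC locale (KOI8-U),
# the byte is a valid character: U+00A0 NO-BREAK SPACE.
as_koi8u = raw.decode("koi8_u")

# Decoded as UTF-8, 0x9a is a lone continuation byte, so decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_koi8u), utf8_ok)
```

So the value itself looks fine; only the encoding chosen for decoding appears to be wrong.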
msg267017 - (view) Author: Nina Zakharenko (nnja) * (Python triager) Date: 2016-06-03 01:14
Adding the test below to test__locale.py will reproduce the issue under the following conditions:
- The locale `uk_UA` is installed on your system.
- The entry 'uk_UA': (',', '\xa0') is added to the `known_numerics` dictionary in this test file.

    @unittest.skipUnless(nl_langinfo, "nl_langinfo is not available")
    def test_lc_numeric_not_char_nl_langinfo(self):
        # Test nl_langinfo against known values.
        # It should still work if there's a mismatch between
        # the LC_CTYPE (string) and LC_NUMERIC locales.
        tested = False
        for loc in candidate_locales:
            try:
                setlocale(LC_NUMERIC, loc)
            except Error:
                continue
            for li, lc in ((RADIXCHAR, "decimal_point"),
                            (THOUSEP, "thousands_sep")):
                if self.numeric_tester('nl_langinfo', nl_langinfo(li), lc, loc):
                    tested = True
        if not tested:
            self.skipTest('no suitable locales')
msg267192 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-06-04 00:13
Thanks, Nina.  We do have support in test.support for running a test with a specific locale (run_with_locale), so this could be turned into a unit test patch if you or someone else is willing to do that.
msg333308 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-09 11:53
Since this bug was reported, locale.localeconv() has been fixed in bpo-31900: it now temporarily sets LC_CTYPE to the LC_NUMERIC locale so that the numeric fields of localeconv() are decoded from the proper encoding. I guess that a similar fix can be applied to locale.nl_langinfo(): set LC_CTYPE to the LC_NUMERIC locale if the parameter is a numeric field.
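
Until something like that lands, the same idea can be approximated from Python code. The following is only a rough sketch of the approach, not the actual fix: the helper names `ctype_matching_numeric` and `numeric_langinfo` are made up for illustration, it assumes a platform where locale.nl_langinfo() exists (Unix), and it is not thread-safe because setlocale() changes process-wide state.

```python
import locale
from contextlib import contextmanager


@contextmanager
def ctype_matching_numeric():
    """Temporarily set LC_CTYPE to the current LC_NUMERIC locale.

    Hypothetical helper: mirrors the idea described for localeconv(),
    but done at the Python level rather than in C.
    """
    # setlocale() with no second argument only queries the current setting.
    old_ctype = locale.setlocale(locale.LC_CTYPE)
    numeric = locale.setlocale(locale.LC_NUMERIC)
    try:
        if numeric != old_ctype:
            locale.setlocale(locale.LC_CTYPE, numeric)
        yield
    finally:
        # Always restore the original LC_CTYPE, even on error.
        locale.setlocale(locale.LC_CTYPE, old_ctype)


def numeric_langinfo(item):
    """Call nl_langinfo() for a numeric item (RADIXCHAR or THOUSEP)
    while LC_CTYPE matches LC_NUMERIC, so the bytes are decoded
    with the LC_NUMERIC locale's encoding."""
    with ctype_matching_numeric():
        return locale.nl_langinfo(item)


if __name__ == "__main__":
    # With the default "C" numeric locale this prints the plain values;
    # after setlocale(LC_NUMERIC, 'uk_UA') it should decode as KOI8-U.
    print(repr(numeric_langinfo(locale.RADIXCHAR)))
    print(repr(numeric_langinfo(locale.THOUSEP)))
```

A real fix would presumably do the switch inside nl_langinfo() itself and only for the numeric items, so that callers don't have to know about it.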

I only knew about locale.nl_langinfo(locale.CODESET); I didn't know that this function accepted other arguments :-)

I even wrote an article about these locale bugs :-)
https://github.com/python/cpython/pull/5191

See also bpo-35697: "decimal: formatter error if LC_NUMERIC uses a different encoding than LC_CTYPE".
History
Date                 User              Action  Args
2019-01-09 11:53:37  vstinner          set     nosy: + vstinner; messages: + msg333308
2018-09-28 20:19:26  barry             set     assignee: nnja
2018-09-28 18:18:12  barry             set     nosy: + barry
2018-09-28 18:16:44  barry             set     versions: + Python 3.8, - Python 3.5, Python 3.6
2016-06-04 00:13:38  r.david.murray    set     nosy: + r.david.murray; messages: + msg267192
2016-06-03 01:14:14  nnja              set     nosy: + nnja; messages: + msg267017
2015-12-24 13:31:30  serhiy.storchaka  set     versions: - Python 3.4
2015-12-05 21:19:05  serhiy.storchaka  create