Message336855
Ah, I can reproduce the bug on Fedora 29 using "LANG=en_IN ./python -m test -v test_re".
The problem is that locale.getlocale() is not reliable: it guesses the encoding from the locale name and reports ISO8859-1, whereas the real encoding is UTF-8:
$ LANG=en_IN ./python
Python 3.8.0a2+ (heads/master:4cbea518a0, Feb 28 2019, 18:19:44)
>>> chr(224).encode('ISO8859-1')
b'\xe0'
>>> import _testcapi
>>> _testcapi.DecodeLocaleEx(b'\xe0', 0, 'strict')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: decode error: pos=0, reason=decoding error
>>> import locale
# Wrong encoding
>>> locale.getlocale(locale.LC_CTYPE)
('en_IN', 'ISO8859-1')
>>> locale.setlocale(locale.LC_CTYPE, None)
'en_IN'
>>> locale._parse_localename('en_IN')
('en_IN', 'ISO8859-1')
# Real encoding
>>> locale.getpreferredencoding()
'UTF-8'
>>> locale.nl_langinfo(locale.CODESET)
'UTF-8'
Attached PR 12099 fixes the issue.
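For anyone who wants to check their own setup, here is a minimal sketch of the comparison done by hand in the session above. The helper names same_codec() and check_ctype_encoding() are made up for illustration and are not part of the locale module or of the PR. It assumes that locale.nl_langinfo() may be missing on some platforms and falls back to locale.getpreferredencoding(False) in that case.

```python
import codecs
import locale


def same_codec(name_a, name_b):
    """Return True if both encoding names resolve to the same Python codec."""
    try:
        return codecs.lookup(name_a).name == codecs.lookup(name_b).name
    except LookupError:
        # At least one of the names is not a codec Python knows about.
        return False


def check_ctype_encoding():
    """Compare the encoding guessed by getlocale() with the one the C library reports.

    Returns a (guessed, actual, consistent) tuple.
    """
    try:
        # getlocale() derives the encoding from the locale name.
        guessed = locale.getlocale(locale.LC_CTYPE)[1]
    except ValueError:
        # Raised when the locale name cannot be parsed.
        guessed = None
    if hasattr(locale, "nl_langinfo"):
        # Ask the C library directly, as in the session above.
        actual = locale.nl_langinfo(locale.CODESET)
    else:
        # Assumption: nl_langinfo() is unavailable here (e.g. on Windows).
        actual = locale.getpreferredencoding(False)
    consistent = guessed is not None and same_codec(guessed, actual)
    return guessed, actual, consistent


if __name__ == "__main__":
    print(check_ctype_encoding())
```

With the LANG=en_IN setup shown above, I would expect this to report ISO8859-1 as the guessed encoding, UTF-8 as the actual one, and False for the consistency flag, but I have only reasoned from the values printed in the session.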
Date | User | Action | Args
2019-02-28 17:34:31 | vstinner | set | recipients: + vstinner, barry, doko, paul.moore, ncoghlan, tim.golden, benjamin.peterson, ezio.melotti, mrabarnett, zach.ware, serhiy.storchaka, steve.dower, jaysinh.shukla, Naman-Bhalla, xtreak
2019-02-28 17:34:31 | vstinner | set | messageid: <1551375271.56.0.263764017823.issue29571@roundup.psfhosted.org>
2019-02-28 17:34:31 | vstinner | link | issue29571 messages
2019-02-28 17:34:31 | vstinner | create |