[Windows] test__locale fails on Windows local machines #82505
On Windows 10 version 1903, 3 locale tests fail:
vstinner@WIN C:\vstinner\python\3.8>python -m test -v test_locale test__locale
======================================================================
Traceback (most recent call last):
File "C:\vstinner\python\3.8\lib\test\test_locale.py", line 567, in test_getsetlocale_issue1813
locale.setlocale(locale.LC_CTYPE, loc)
File "C:\vstinner\python\3.8\lib\locale.py", line 608, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
======================================================================
Traceback (most recent call last):
File "C:\vstinner\python\3.8\lib\test\test__locale.py", line 184, in test_float_parsing
if localeconv()['decimal_point'] != '.':
UnicodeDecodeError: 'locale' codec can't decode byte 0xa0 in position 0: decoding error
======================================================================
Traceback (most recent call last):
File "C:\vstinner\python\3.8\lib\test\test__locale.py", line 130, in test_lc_numeric_localeconv
formatting = localeconv()
UnicodeDecodeError: 'locale' codec can't decode byte 0xa0 in position 0: decoding error
test_float_parsing() fails for locales:
test_lc_numeric_localeconv() fails for locales:
test_getsetlocale_issue1813() fails with: testing with ('tr_TR', 'ISO8859-9')
Example:
vstinner@WIN C:\vstinner\python\3.8>python
>>> import locale
>>> locale.setlocale(locale.LC_CTYPE, 'tr_TR')
'tr_TR'
>>> loc=locale.getlocale(locale.LC_CTYPE)
>>> loc
('tr_TR', 'ISO8859-9')
>>> locale.setlocale(locale.LC_CTYPE, loc)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\vstinner\python\3.8\lib\locale.py", line 608, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
# It works using the low-level _locale module
>>> import _locale
>>> _locale.setlocale(_locale.LC_CTYPE, None)
'tr_TR'
>>> locale.setlocale(locale.LC_CTYPE, "tr_TR")
'tr_TR' |
This is a known issue (I forget the number right now): Linux-style locales don't work on Windows, so there's a normalization function that has to be completely rewritten. Also, since these failures seem to be on your own machine, I'm guessing your locale is not en-US? Please make sure you provide enough information for others to reproduce these. If they were occurring all the time, we wouldn't have got this far, so you've obviously got something configured differently. |
This is the existing issue https://bugs.python.org/issue37945 which I haven't had time to progress. Please feel free to follow up |
Oh, the behavior of setlocale() depends on my system locale? Yeah, my system is configured in French, sorry I don't recall the locale name. |
I started a report on failures on my machine a few days ago, but never finished editing. I am pasting in what I wrote so far. The same 4 tests fail today with a rebuild from 12 hours ago. I ran the test suite on my 64-bit Win10 machine with fresh 32-bit debug builds in a freshly updated repository.
=====================================================================================
f:\dev\3x>python -m test -j0 -ugui
Running Debug|Win32 interpreter...
== CPython 3.9.0a0 (heads/master:c5a7e0ce19, Sep 28 2019, 13:58:57) [MSC v.1900 32 bit (Intel)]
== Windows-10-10.0.18362-SP0 little-endian
== cwd: F:\dev\3x\build\test_python_12736
== CPU count: 12
== encodings: locale=cp1252, FS=utf-8
Run tests in parallel using 14 child processes
____________________________________________________________________________________
Failure 1.
test test__locale failed -- Traceback (most recent call last):
File "F:\dev\3x\lib\test\test__locale.py", line 133, in test_lc_numeric_localeconv
if self.numeric_tester('localeconv', formatting[lc], lc, loc):
File "F:\dev\3x\lib\test\test__locale.py", line 97, in numeric_tester
self.assertEqual(calc_value, known_value,
AssertionError: ',' != '\u066b'
- ,
+ \u066b
: , != \u066b (localeconv for decimal_point; set to ps_AF, using ps_AF)
___________________________________________________________________________
Failure 2:
test_importlib failed -- running: test_concurrent_futures (1 min 6 sec)
Failed to import test module: test.test_importlib.import_.test_fromlist
Traceback (most recent call last):
File "F:\dev\3x\lib\unittest\loader.py", line 436, in _find_test_path
module = self._get_module_from_name(name)
File "F:\dev\3x\lib\unittest\loader.py", line 377, in _get_module_from_name
__import__(name)
ValueError: source code string cannot contain null bytes
test test_importlib crashed -- Traceback (most recent call last):
File "F:\dev\3x\lib\test\libregrtest\runtest.py", line 270, in _runtest_inner
refleak = _runtest_inner2(ns, test_name)
File "F:\dev\3x\lib\test\libregrtest\runtest.py", line 234, in _runtest_inner2
test_runner()
File "F:\dev\3x\lib\test\libregrtest\runtest.py", line 208, in _test_module
raise Exception("errors while loading tests")
Exception: errors while loading tests
_____________________________________________________________________________________
Failure 3.
test test_locale failed -- Traceback (most recent call last):
File "F:\dev\3x\lib\test\test_locale.py", line 567, in test_getsetlocale_issue1813
locale.setlocale(locale.LC_CTYPE, loc)
File "F:\dev\3x\lib\locale.py", line 608, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
_____________________________________________________________________________________
Failure 4.
test_winconsoleio
ERROR: test_input (test.test_winconsoleio.WindowsConsoleIOTests)
Traceback (most recent call last):
File "F:\dev\3x\lib\test\test_winconsoleio.py", line 148, in test_input
self.assertStdinRoundTrip('\U00100000\U0010ffff\U0010fffd')
File "F:\dev\3x\lib\test\test_winconsoleio.py", line 135, in assertStdinRoundTrip
actual = input()
OSError: [WinError 87] The parameter is incorrect
======================================================================
Traceback (most recent call last):
File "F:\dev\3x\lib\test\test_winconsoleio.py", line 161, in test_partial_reads
b = stdin.read(read_count)
OSError: [WinError 87] The parameter is incorrect
======================================================================
Traceback (most recent call last):
File "F:\dev\3x\lib\test\test_winconsoleio.py", line 178, in test_partial_surrogate_reads
b = stdin.read(read_count)
OSError: [WinError 87] The parameter is incorrect
______________________________________________________________________________________
Cancelling an overlapped future failed
future: <_OverlappedFuture pending overlapped=<pending, 0x432ab70> cb=[BaseProactorEventLoop._loop_self_reading()]>
Traceback (most recent call last):
File "F:\dev\3x\lib\asyncio\windows_events.py", line 66, in _cancel_overlapped
self._ov.cancel()
OSError: [WinError 6] The handle is invalid
Cancelling an overlapped future failed
future: <_OverlappedFuture pending overlapped=<pending, 0x432ab70> cb=[BaseProactorEventLoop._loop_self_reading()]>
Traceback (most recent call last):
File "F:\dev\3x\lib\asyncio\windows_events.py", line 66, in _cancel_overlapped
self._ov.cancel()
OSError: [WinError 6] The handle is invalid
Error on reading from the event loop self pipe
loop: <ProactorEventLoop running=True closed=False debug=False>
Traceback (most recent call last):
File "F:\dev\3x\lib\asyncio\windows_events.py", line 453, in finish_recv
return ov.getresult()
OSError: [WinError 995] The I/O operation has been aborted because of either a thread exit or an application request
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\dev\3x\lib\asyncio\proactor_events.py", line 768, in _loop_self_reading
f.result() # may raise
File "F:\dev\3x\lib\asyncio\windows_events.py", line 808, in _poll
value = callback(transferred, key, ov)
File "F:\dev\3x\lib\asyncio\windows_events.py", line 457, in finish_recv
raise ConnectionResetError(*exc.args)
ConnectionResetError: [WinError 995] The I/O operation has been aborted because of either a thread exit or an application request
F:\dev\3x\lib\asyncio\base_events.py:673: ResourceWarning: unclosed event loop <ProactorEventLoop running=False closed=False debug=False>
_warn(f"unclosed event loop {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
=========================================================================================
3.8 CPython 3.8.0b4+ (heads/3.8:36c6fa9680, Sep 28 2019, 14:00:28) [MSC v.1900 32 bit (Intel)]
test_locale, locale, and winconsoleio failed as before, test_import ok.
3.7 CPython 3.7.4+ (heads/3.7:80dd66ac27, Sep 28 2019, 14:01:04)
test_locale, locale, and winconsoleio failed as before
>>> locale.setlocale(locale.LC_CTYPE, 'tr_TR')
'tr_TR'
>>> loc = locale.getlocale(locale.LC_CTYPE)
>>> loc
('tr_TR', 'ISO8859-9')
>>> locale.setlocale(locale.LC_CTYPE, loc)
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
locale.setlocale(locale.LC_CTYPE, loc)
File "C:\Programs\Python38\lib\locale.py", line 608, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
Some change made this fail again after it had been fixed on my local machine, even though it continues to work on the buildbots. |
Terry, the test_winconsoleio problem is bpo-38325. Test cases with surrogate pairs that are known to fail in recent builds of Windows 10 have to be split out.
For the "ps_AF" locale failure that you noted, in my case with Windows 10 18362, I have to first modify the tests in Lib/test/test__locale.py to set LC_CTYPE before setting LC_NUMERIC. Otherwise the lconv result in C has the wrong encoding, and PyUnicode_DecodeLocale fails. After making this change, I can reproduce the noted failure.
The "ps_AF" (Pashto, Afghanistan) case will have to be skipped in Windows because the system NLS data does not agree with the assumed Arabic decimal and thousands separators, U+066B and U+066C, but instead uses "," and ".". This can be verified directly via WINAPI GetLocaleInfoEx:
>>> n = kernel32.GetLocaleInfoEx('ps-AF', LOCALE_SSCRIPTS, buf, len(buf))
>>> buf.value
'Arab;'
>>> n = kernel32.GetLocaleInfoEx('ps-AF', LOCALE_SDECIMAL, buf, len(buf))
>>> buf.value
','
>>> n = kernel32.GetLocaleInfoEx('ps-AF', LOCALE_STHOUSAND, buf, len(buf))
>>> buf.value
'.'
In case this was a quirk in the NLS data for languages that use a Perso-Arabic script, such as Pashto, I also checked Saudi Arabia ("ar-SA"), which uses a standard Arabic script, but the result was the same. |
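The GetLocaleInfoEx transcript above leaves out the setup for `kernel32`, `buf`, and the `LOCALE_*` constants. The following is only a minimal sketch of how that query might be wired up with ctypes. The helper name `get_locale_info` is hypothetical, the two constant values are assumed to match winnls.h from the Windows SDK, and the function returns None on any platform other than Windows.

```python
import ctypes
import sys

# Assumed LCTYPE values, as defined in the Windows SDK header winnls.h.
LOCALE_SDECIMAL = 0x0000000E
LOCALE_STHOUSAND = 0x0000000F


def get_locale_info(locale_name, lctype, size=80):
    """Return one NLS string for locale_name, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    buf = ctypes.create_unicode_buffer(size)
    # GetLocaleInfoEx returns the number of characters written, or 0 on failure.
    n = kernel32.GetLocaleInfoEx(locale_name, lctype, buf, len(buf))
    if n == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.value


# Hypothetical helper in use: print the separators for the two locales discussed.
for name in ("ps-AF", "ar-SA"):
    print(name,
          repr(get_locale_info(name, LOCALE_SDECIMAL)),
          repr(get_locale_info(name, LOCALE_STHOUSAND)))
```

The values printed on Windows depend on the NLS data shipped with that build, so they may differ from the ones reported in this comment.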
test_locale and test__locale are still failing on my Windows 10 VM. I proposed PR 18403 to skip failing tests on Windows. |
Details on this error:
ERROR: test_getsetlocale_issue1813 (test.test_locale.TestMiscellaneous)
Traceback (most recent call last):
File "C:\vstinner\python\3.8\lib\test\test_locale.py", line 567, in test_getsetlocale_issue1813
locale.setlocale(locale.LC_CTYPE, loc)
File "C:\vstinner\python\3.8\lib\locale.py", line 608, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
On Windows 10 (version 1903), ANSI code page 1252, OEM code page 437, LC_CTYPE locale "French_France.1252":
vstinner@WIN C:\vstinner\python\master>python
Running Debug|x64 interpreter...
Python 3.9.0a3+ (heads/master:d68e0a8a16, Feb 10 2020, 22:59:58) [MSC v.1916 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_CTYPE, "tr_TR")
'tr_TR'
>>> loc=locale.getlocale(locale.LC_CTYPE)
>>> loc
('tr_TR', 'ISO8859-9')
>>> locale.setlocale(locale.LC_CTYPE, loc)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\vstinner\python\master\lib\locale.py", line 610, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
>>> name=locale._build_localename(loc)
>>> name
'tr_TR.ISO8859-9'
>>> name2 = locale.normalize(name)
>>> name2 == name
True
>>> name2
'tr_TR.ISO8859-9'
>>> locale.setlocale(locale.LC_CTYPE, 'tr_TR.ISO8859-9')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\vstinner\python\master\lib\locale.py", line 610, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
Note: I changed the OEM code page (it is usually 850), but the OEM code page should have no effect on setlocale(). |
>>> loc=locale.getlocale(locale.LC_CTYPE)
>>> loc
('tr_TR', 'ISO8859-9')
getlocale() has issues on Unix, but worse issues on Windows. See:
I never use getlocale() and I never understood the purpose of this function. I use locale.setlocale(category) (same as locale.setlocale(category, None)) to *get* a locale: the result can be passed to locale.setlocale(category, result) with no problem. |
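The query-then-restore pattern described in the comment above can be sketched as follows. The helper name `roundtrip_ok` is hypothetical. It relies only on the documented behaviour that calling `locale.setlocale()` with no locale argument returns the current setting as a string.

```python
import locale


def roundtrip_ok(category=locale.LC_CTYPE):
    """Query the current locale name and pass it straight back to setlocale()."""
    # Query only: equivalent to locale.setlocale(category, None).
    current = locale.setlocale(category)
    # The string came from the C runtime, so no normalization step is involved.
    restored = locale.setlocale(category, current)
    return restored == current


print(roundtrip_ok())
```

Unlike the `getlocale()` tuple, the string returned by the query is never rebuilt through `locale.normalize()`, which is the step that produces 'tr_TR.ISO8859-9' in the transcripts above.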
The CRT default locale (i.e. the empty locale "") uses the user locale, which is the "Format" value on the Region->Formats tab. It does not use the system locale from the Region->Administrative tab. The default locale normally uses the user locale's ANSI codepage, as returned by GetLocaleInfoEx(LOCALE_NAME_USER_DEFAULT, LOCALE_IDEFAULTANSICODEPAGE, ...). But if the active codepage of the process is UTF-8, then GetACP(), GetOEMCP(), and setlocale(LC_CTYPE, "") all use UTF-8 (i.e. CP_UTF8, i.e. 65001). The active codepage can be set to UTF-8 either at the system-locale level or in the application-manifest. For example, with the active codepage setting in the manifest:
>>> import ctypes
>>> from locale import setlocale, LC_CTYPE
>>> setlocale(LC_CTYPE, "")
'English_Canada.utf8'
>>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
>>> kernel32.GetACP()
65001
>>> kernel32.GetOEMCP()
65001
A default locale name can also specify the codepage to use. It could be UTF-8, a particular codepage, ".ACP" (ANSI), or ".OCP" (OEM). "ACP" and "OCP" have to be in upper case. For example:
>>> setlocale(LC_CTYPE, '.utf8')
'English_Canada.utf8'
>>> setlocale(LC_CTYPE, '.437')
'English_Canada.437'
>>> setlocale(LC_CTYPE, ".ACP")
'English_Canada.1252'
>>> setlocale(LC_CTYPE, ".OCP")
'English_Canada.850'
Otherwise, if you provide a known locale (using full names, three-letter abbreviations, or one of the small set of locale aliases), then setlocale queries any missing values from the NLS database. One snag is the set of Unicode-only locales, such as "Hindi_India". Querying the ANSI and OEM codepages for a Unicode-only locale returns CP_ACP (0) and CP_OEMCP (1), respectively. It used to be that the CRT would end up using the system locale for these cases. But recently ucrt has switched to using UTF-8 for these cases. For example:
>>> setlocale(LC_CTYPE, "Hindi_India")
'Hindi_India.utf8'
That brings us to the case of modern Windows BCP-47 locale names, which usually lack an implicit encoding. For example:
>>> setlocale(LC_CTYPE, "hi_IN")
'hi_IN'
The current CRT codepage can be queried via __lc_codepage_func:
>>> import ctypes; ucrt = ctypes.CDLL('ucrtbase', use_errno=True)
>>> ucrt.___lc_codepage_func()
65001
With the exception of Unicode-only locales, using a modern name without an encoding defaults to the named locale's ANSI codepage. For example:
>>> setlocale(LC_CTYPE, "en_CA")
'en_CA'
>>> ucrt.___lc_codepage_func()
1252
The only encoding allowed in BCP-47 locale names is ".utf8" or ".utf-8" (case insensitive):
>>> setlocale(LC_CTYPE, "fr_FR.utf8")
'fr_FR.utf8'
>>> setlocale(LC_CTYPE, "fr_FR.UTF-8")
'fr_FR.UTF-8'
No other encoding is allowed with this form. For example:
>>> try: setlocale(LC_CTYPE, "fr_FR.ACP")
... except Exception as e: print(e)
...
unsupported locale setting
>>> try: setlocale(LC_CTYPE, "fr_FR.1252")
... except Exception as e: print(e)
...
unsupported locale setting
As to the "tr_TR" locale bug, the Windows implementation is broken because it assumes that POSIX locale names are directly supported. A significant redesign is required to connect the dots.
>>> from locale import getlocale
>>> setlocale(LC_CTYPE, 'tr_TR')
'tr_TR'
>>> ucrt.___lc_codepage_func()
1254
>>> getlocale(LC_CTYPE)
('tr_TR', 'ISO8859-9')
Codepage 1254 is similar to ISO8859-9, except, in typical fashion, Microsoft assigned most of the upper control range 0x80-0x9F to an assortment of characters it deemed useful, such as the Euro symbol "€". The exact codepage needs to be queried via __lc_codepage_func() and returned as ('tr_TR', 'cp1254').
Conversely, setlocale() needs to know that this BCP-47 name does not support an explicit encoding, unless it's "utf8". If the given codepage, or an associated alias, doesn't match the locale's ANSI codepage, then the locale name has to be expanded to the full name "Turkish_Turkey". The long name allows specifying an arbitrary codepage. For example, say we have ('tr_TR', 'ISO8859-7'), i.e. Greek with Turkish locale rules. This transforms to the closest approximation ('tr_TR', '1253'). When setlocale queries the OS, it will find that the ANSI codepage is actually 1254, so it cannot use "tr_TR" or "tr-TR". It needs to expand to the long form:
>>> setlocale(LC_CTYPE, 'Turkish_Turkey.1253')
'Turkish_Turkey.1253' |
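The name-selection rules in the comment above could be sketched roughly as below. This is not the locale module's code, only an illustration of the described redesign. Both helper names (`codec_for_codepage`, `windows_locale_name`) are hypothetical, and the ANSI codepage and long name are passed in by hand rather than queried from the OS.

```python
def codec_for_codepage(codepage):
    """Map a CRT codepage number to a Python codec name, e.g. 1254 -> 'cp1254'."""
    return "utf-8" if codepage == 65001 else f"cp{codepage}"


def windows_locale_name(short_name, long_name, ansi_codepage, codepage):
    """Pick a locale name that setlocale() could accept for the requested codepage."""
    if codepage == 65001:
        # ".utf8" is the only encoding the short (BCP-47 style) form allows.
        return f"{short_name}.utf8"
    if codepage == ansi_codepage:
        # The short form without an encoding implies the locale's ANSI codepage.
        return short_name
    # Any other codepage needs the long form, which accepts an arbitrary codepage.
    return f"{long_name}.{codepage}"


# Byte 0x80 shows why codepage 1254 should not be reported as ISO8859-9:
# cp1254 maps it to the Euro sign, ISO8859-9 to a C1 control character.
print(b"\x80".decode(codec_for_codepage(1254)) == "\u20ac")
print(b"\x80".decode("iso8859-9") == "\x80")
print(windows_locale_name("tr_TR", "Turkish_Turkey", 1254, 1253))
```

A real implementation would still have to look up the ANSI codepage and the long name for each locale, for instance through GetLocaleInfoEx, which this sketch deliberately leaves out.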
I am marking this issue as a duplicate of bpo-37945. |
It is not a duplicate of bpo-37945. The tests in test/test__locale.py need to be fixed to work with Windows.
In msg354021, I discussed the problem reported with test_lc_numeric_localeconv. The "ps_AF" (Pashto, Afghanistan) item in known_numerics has to be skipped in Windows because the system data does not agree with the test's assumed Arabic decimal and thousands separators, U+066B and U+066C, but instead uses "," and ".".
Also, all tests need to swap the order of setting LC_NUMERIC and LC_CTYPE in order to avoid a UnicodeDecodeError with locales that use UTF-8, such as "ka_GE". _locale.localeconv should be using the wide-character, _W_-prefixed string fields from the lconv structure in Windows, such as _W_decimal_point. Until that gets fixed, tests need to be mindful that ucrt in Windows uses the current LC_CTYPE to update the multibyte strings in the lconv structure when setting LC_NUMERIC. So they should be changed as a pair, with LC_CTYPE set first. |
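The "change them as a pair, with LC_CTYPE set first" advice above could be captured in a small context manager. This is a sketch only, not the code from the PR. The name `numeric_locale` is hypothetical, and the "C" locale is used in the usage line simply because it is available everywhere.

```python
import contextlib
import locale


@contextlib.contextmanager
def numeric_locale(name):
    """Temporarily set LC_CTYPE and LC_NUMERIC to name, with LC_CTYPE first."""
    saved_ctype = locale.setlocale(locale.LC_CTYPE)
    saved_numeric = locale.setlocale(locale.LC_NUMERIC)
    try:
        # LC_CTYPE goes first so the lconv strings are produced in the
        # encoding that localeconv() will later decode them with.
        locale.setlocale(locale.LC_CTYPE, name)
        locale.setlocale(locale.LC_NUMERIC, name)
        yield
    finally:
        # Restore LC_CTYPE first as well, for the same reason.
        locale.setlocale(locale.LC_CTYPE, saved_ctype)
        locale.setlocale(locale.LC_NUMERIC, saved_numeric)


with numeric_locale("C"):
    print(locale.localeconv()["decimal_point"])
```

A test loop could wrap each candidate locale in this manager and skip the locale when `setlocale()` raises `locale.Error`.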
I just took a look at the PR and I think it's good, but if anyone else wants to have a look before I merge it (probably not today), please do. |
PR 20529 looks good to me. Thank you, Tiago. |
This issue *is* a duplicate of bpo-37945 with respect to test_locale. So I remove that from the title. It is not a duplicate with respect to test__locale, which fails for a very different reason. The other failures I mentioned are noise here (and now fixed). So when PR-20529 is merged (it fixes test__locale for me also), we can close this. |
Thanks! |