classification
Title: test_c_locale_coercion fails when the default LC_CTYPE != "C"
Type: behavior Stage: patch review
Components: Tests Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: erik.bray, ncoghlan, xdegaye
Priority: normal Keywords: patch

Created on 2017-11-10 10:47 by erik.bray, last changed 2017-11-11 12:57 by xdegaye.

Pull Requests
URL Status Linked Edit
PR 4361 open erik.bray, 2017-11-10 10:48
PR 4369 open ncoghlan, 2017-11-11 06:35
Messages (8)
msg306019 - (view) Author: Erik Bray (erik.bray) * Date: 2017-11-10 10:47
Several of the tests in test_c_locale_coercion (particularly LocaleCoercionTests._check_c_locale_coercion) tend to assume that the system default locale used when setting setlocale(category, "") and when all the relevant environment variables are empty/blank will be the "C"/"POSIX" locale.

While this is often true, POSIX does not require this to be the case.  For example, Cygwin already defaults to "C.UTF-8", so these tests fail because they assume the legacy locale coercion will be used when it isn't (e.g. the LC_CTYPE environment variable does not get forced to "C.UTF-8").  In principle, however, this can affect any platform that chooses a different default.
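
As an illustration only (this is not the code from the attached PR), the platform default can be probed by running a child interpreter with the locale variables removed and asking what setlocale(LC_CTYPE, "") reports. The helper name default_ctype_locale is hypothetical, and PYTHONCOERCECLOCALE=0 is set on the assumption that coercion in the child would otherwise mask the real default:

```python
import os
import subprocess
import sys

# Variables that can influence LC_CTYPE selection.
_LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")

# Code run in the child: report what the empty locale resolves to.
_PROBE = "import locale; print(locale.setlocale(locale.LC_CTYPE, ''))"


def default_ctype_locale():
    """Return the name setlocale(LC_CTYPE, "") reports with no locale variables set.

    Hypothetical helper, for illustration only.
    """
    env = {k: v for k, v in os.environ.items() if k not in _LOCALE_VARS}
    # Assumption: disable PEP 538 coercion so the child reports the
    # platform default rather than a coerced target locale.
    env["PYTHONCOERCECLOCALE"] = "0"
    result = subprocess.run(
        [sys.executable, "-c", _PROBE],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()
```

A test could then skip or adjust the "empty locale" cases whenever the reported name is not "C".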
msg306022 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-11-10 10:59
Issue 30672 is potentially related here - some of the test cases are already disabled on Mac OS X and other *BSD systems since the tests assume that C & POSIX are aliases of each other.

I've also added Xavier to the nosy list, since the current implementation and tests aren't quite right for Android either and it would be good to come up with a unified solution to more robust platform feature detection: https://bugs.python.org/issue28180#msg305850
msg306023 - (view) Author: Erik Bray (erik.bray) * Date: 2017-11-10 11:16
Yes, I looked at some of the other issues pertaining to this, but it wasn't immediately apparent how to kill multiple birds with one stone, so here I just focused on this one assumption.
msg306027 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-11-10 12:49
OK, I'd been meaning to get back to refactoring those tests anyway, so assigning this to myself.

I'm thinking that the right way to go will be to give the test case a more explicit model of "expected platform behaviour" (initialised in setUpModule), rather than having that be implicit in a bunch of conditionals scattered throughout the individual test cases.

Then we'd have at least the following cases:

- default is C, POSIX is an alias for C (most Linux distros)
- default is C, POSIX is a separate locale (*BSD)
- default is C.UTF-8 (Cygwin, potentially Android depending on exactly how we resolve that)
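
A rough sketch of what such a model could look like follows. The names PlatformLocaleModel, KNOWN_MODELS and c_locale_requests are hypothetical and do not come from the test suite; the three entries simply mirror the cases listed above:

```python
import collections

# Hypothetical model; field names are illustrative only.
PlatformLocaleModel = collections.namedtuple(
    "PlatformLocaleModel", ["default_locale", "posix_is_c_alias"])

KNOWN_MODELS = {
    # default is C, POSIX is an alias for C
    "linux-like": PlatformLocaleModel("C", True),
    # default is C, POSIX is a separate locale
    "bsd-like": PlatformLocaleModel("C", False),
    # default is C.UTF-8; the alias status is not stated in this thread
    "utf8-default": PlatformLocaleModel("C.UTF-8", None),
}


def c_locale_requests(model):
    """Return the locale settings a test may treat as requesting "C"."""
    requests = ["C"]
    if model.default_locale == "C":
        requests.append("")  # the empty locale resolves to "C"
    if model.posix_is_c_alias:
        requests.append("POSIX")
    return requests
```

The individual test cases would then iterate over c_locale_requests(model) instead of branching on the platform themselves.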
msg306036 - (view) Author: Erik Bray (erik.bray) * Date: 2017-11-10 15:24
In my PR there's a behavior test for the default, so we don't have to hard-code that on a per-platform basis at least.  The C != POSIX thing I'm not sure you can easily test for.
msg306078 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-11-11 06:41
The essential problem in both this issue and issue 30672 is that the tests are currently incorporating some Linux-specific assumptions about ways to request the "C" locale.

In https://github.com/python/cpython/pull/4369, I've taken the approach of making the baseline tests only cover "C" and "invalid.ascii", and then explicitly *opt-in* to testing an empty locale and "POSIX" on Linux machines.
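
In outline, the opt-in approach looks something like the sketch below. This is a simplified illustration rather than the code in the PR; the names BASELINE_LOCALES, LINUX_ONLY_LOCALES and locales_to_test are made up for the example:

```python
import sys

# Locale settings every platform is expected to treat as the "C" locale.
BASELINE_LOCALES = ["C", "invalid.ascii"]

# Settings that are only assumed to mean "C" on Linux.
LINUX_ONLY_LOCALES = ["", "POSIX"]


def locales_to_test(platform=None):
    """Return the locale settings to exercise on the given platform.

    `platform` defaults to sys.platform; it is a parameter only so
    the selection logic can be checked without changing machines.
    """
    if platform is None:
        platform = sys.platform
    locales = list(BASELINE_LOCALES)
    if platform.startswith("linux"):
        locales.extend(LINUX_ONLY_LOCALES)
    return locales
```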

If that's enough to get the test passing on Cygwin, I'm inclined to leave it at that. Dynamically calculated test expectations always make me nervous, since it's all too easy to end up with bugs that impact both the test case and the expectation calculator in the same way, and hence end up with the test passing when it should really fail.
msg306079 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-11-11 06:44
Note: I'm not entirely sold on my own argument though, as I believe at least Alpine Linux already interprets the empty locale as C.UTF-8, so it may make more sense to use your dynamic check with both the empty string and "POSIX", and only test those locales if they get reported back as effectively configuring the "C" locale.
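
One possible shape for that filtering step is sketched below. The function names are hypothetical, and the in-process probe is an assumption made for brevity: the result for the empty string depends on the environment of the current process, so a real test would need to run it with the locale variables cleared.

```python
import locale


def reported_ctype_locale(requested):
    """Return what setlocale reports after requesting `requested`.

    Returns None if the locale cannot be set. The previous LC_CTYPE
    setting is restored before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query only
    try:
        return locale.setlocale(locale.LC_CTYPE, requested)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


def c_equivalent_locales(candidates=("", "POSIX"),
                         probe=reported_ctype_locale):
    """Keep only the candidates that are reported back as "C"."""
    return [name for name in candidates if probe(name) == "C"]
```

The probe is a parameter so the selection logic can be exercised with a stub, independently of the locale behaviour of the machine running the tests.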
msg306083 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-11-11 12:57
> Several of the tests in test_c_locale_coercion (particularly LocaleCoercionTests._check_c_locale_coercion) tend to assume that the system default locale used when setting setlocale(category, "") and when all the relevant environment variables are empty/blank will be the "C"/"POSIX" locale.
>
> While this is often true, POSIX does not require this to be the case.

I think you are right. The section starting with "The values of locale categories shall be determined by a precedence order;" in [1] states:

4. If the LANG environment variable is not set or is set to the empty string, the implementation-defined default locale shall be used.
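
For reference, the precedence order described in that section (LC_ALL first, then the category-specific variable, then LANG, then the implementation-defined default) can be sketched as follows. The function name is hypothetical, and None stands for "the implementation-defined default applies":

```python
import os


def effective_locale_setting(category="LC_CTYPE", environ=None):
    """Return the environment value that determines `category`.

    Returns None when no variable applies, which is the case where
    the implementation-defined default locale is used.
    """
    if environ is None:
        environ = os.environ
    for var in ("LC_ALL", category, "LANG"):
        value = environ.get(var)
        if value:  # unset and empty are treated alike
            return value
    return None
```

With every locale variable unset or empty the function returns None, which is exactly the situation where the tests cannot assume "C".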

In the current implementation of PR 4334 [2] only one change to test_c_locale_coercion is needed to fix the failures of some subtests of test_PYTHONCOERCECLOCALE_set_to_warn when all the locale environment variables are set to the empty string. All the other tests are unchanged and ok because the new _Py_SetLocaleFromEnv() function [3] causes Android to behave as a plain *nix platform except when the locale environment variables are unset or set to an empty string.

[1] http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html
[2] PR 4334: Fix the implementation of PEP 538 on Android
[3] And because after calling setlocale(category, "C"), setlocale(category) returns "C" on Android (this may not be the case on Cygwin).
History
Date User Action Args
2017-11-11 12:57:26 xdegaye set messages: + msg306083
2017-11-11 06:44:57 ncoghlan set messages: + msg306079
2017-11-11 06:41:20 ncoghlan set messages: + msg306078
2017-11-11 06:35:40 ncoghlan set pull_requests: + pull_request4322
2017-11-10 15:24:25 erik.bray set messages: + msg306036
2017-11-10 12:49:37 ncoghlan set assignee: ncoghlan; messages: + msg306027
2017-11-10 11:16:39 erik.bray set messages: + msg306023
2017-11-10 10:59:20 ncoghlan set nosy: + xdegaye; messages: + msg306022
2017-11-10 10:48:45 erik.bray set keywords: + patch; stage: patch review; pull_requests: + pull_request4316
2017-11-10 10:47:52 erik.bray create