classification
Title: PEP 538: Unexpected locale behaviour on *BSD (including Mac OS X)
Type: behavior
Stage: patch review
Components: Tests
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies: 30647 34589
Superseder:
Assigned To: ncoghlan
Nosy List: ncoghlan, ned.deily, ronaldoussoren, vstinner
Priority: normal
Keywords: patch

Created on 2017-06-15 09:23 by ncoghlan, last changed 2019-10-20 12:27 by ncoghlan.

Pull Requests
URL Status Linked Edit
PR 9257 closed ncoghlan, 2018-09-30 07:22
Messages (12)
msg296076 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-15 09:23
To get the new PEP 538 tests passing on Mac OS X (see [1,2]), I ended up having to skip the following test scenarios:

    LANG=UTF-8 (behaves like LANG=C, *not* LC_CTYPE=UTF-8)
    LANG=POSIX (behaves like a distinct locale is set, not LANG=C)
    LC_CTYPE=POSIX (behaves like a distinct locale is set, not LANG=C)
    LC_ALL=POSIX (behaves like a distinct locale is set, not LANG=C)

However, I'm not sure whether that should be diagnosed as a pure testing problem, where we change the test's expectations to match the current behaviour, or a bug in the PEP 538 implementation, where we should be updating it to produce the behaviour that the tests were originally expecting.
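
For anyone wanting to observe the scenarios above on their own platform, here is a minimal sketch (not the actual test suite code). It launches a child interpreter with only the locale variable under test set and reports the child's sys.stdout.encoding. The helper name child_stdout_encoding and the list of variables it clears are assumptions made for illustration.

```python
import os
import subprocess
import sys

# Variables cleared so that only the scenario under test influences the child.
# (Hypothetical selection for this sketch, not taken from the real tests.)
CLEARED_VARS = (
    "LC_ALL", "LC_CTYPE", "LANG",
    "PYTHONCOERCECLOCALE", "PYTHONUTF8", "PYTHONIOENCODING",
)


def child_stdout_encoding(**locale_env):
    """Return sys.stdout.encoding as reported by a child interpreter."""
    env = {k: v for k, v in os.environ.items() if k not in CLEARED_VARS}
    env.update(locale_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    scenarios = (
        {"LANG": "UTF-8"},
        {"LANG": "POSIX"},
        {"LC_CTYPE": "POSIX"},
        {"LC_ALL": "POSIX"},
    )
    for scenario in scenarios:
        print(scenario, "->", child_stdout_encoding(**scenario))
```

The reported encodings are platform dependent, which is the point of the question above, so no expected output is given here.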


[1] https://github.com/python/cpython/commit/4563099d28e832aed22b85ce7e2a92236df03847
[2] https://github.com/python/cpython/commit/7926516ff95ed9c8345ed4c4c4910f44ffbd5949
msg296239 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-17 14:12
For the case where POSIX is a distinct locale from the default C locale (rather than a simple alias), I'm leaning towards taking PEP 538 literally, and adjusting the affected test cases to assume that locale coercion *won't* happen for that case on Mac OS X.

I *think* we can check for the alias behaviourally by setting the "POSIX" locale and seeing if it reports itself back to us as "C", but if not, then I'll just add in a sys.platform check, and we can figure out a behavioural check later if the platform check turns out to cause problems.
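
A rough sketch of that behavioural check, using only the standard locale module. The function name posix_is_alias_for_c is hypothetical, and treating a failed setlocale() call as "not an alias" is an assumption of this sketch rather than anything decided in this issue.

```python
import locale


def posix_is_alias_for_c():
    """Return True if setting LC_CTYPE to "POSIX" reports back as "C"."""
    # Calling setlocale() without a locale argument only queries the setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        reported = locale.setlocale(locale.LC_CTYPE, "POSIX")
    except locale.Error:
        # Assumption: a platform that rejects "POSIX" is not aliasing it.
        return False
    finally:
        # Always restore the caller's LC_CTYPE setting.
        locale.setlocale(locale.LC_CTYPE, saved)
    return reported == "C"


if __name__ == "__main__":
    print("POSIX aliases C:", posix_is_alias_for_c())
```

The result depends on the libc in use, so it would need to be run on each affected platform to confirm whether the check distinguishes them.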
msg296248 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-18 01:15
It seems the "POSIX is not just an alias for the C locale" behaviour is inherited from *BSD, rather than being specific to Mac OS X: http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Debug%20custom/builds/12/steps/test/logs/stdio
msg296250 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-18 01:26
Flagging issue 30647 as a dependency, since the problem with breaking nl_langinfo will need to be fixed before these tests can be re-enabled.
msg307784 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-12-07 00:45
As discussed in https://bugs.python.org/issue32238 and https://mail.python.org/pipermail/python-dev/2017-December/151105.html, I now think the right answer for the POSIX case is to ensure the legacy locale detection logic always treats that the same way as it does the C locale.

LANG=UTF-8 I'm still not sure about - as I understand it, that's a partial locale that only defines LC_CTYPE, which may be why the libc implementation ignores it if you try to set it as a general category default.
msg314646 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-29 13:43
Ned, I'd forgotten about the part of this issue which amounts to "Check for 'POSIX' as a locale coercion trigger in addition to 'C', as not every platform aliases the former to the latter the way glibc does".

So while I don't think this is really a release blocker (as you have to explicitly request the POSIX locale to encounter the discrepancy, and I think fixing it in 3.7.1 would be OK), I *am* bumping it up to "critical", as I think it would be better to resolve this for 3.7.0b4 rather than letting it linger further.
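
To illustrate the intended rule, here is a small Python sketch. It is not the interpreter's C implementation, and the names LEGACY_LOCALE_NAMES, is_legacy_locale and legacy_locale_detected are made up for this example.

```python
import locale

# Locale names that should trigger coercion under the proposed rule.
LEGACY_LOCALE_NAMES = frozenset({"C", "POSIX"})


def is_legacy_locale(name):
    """Return True if the reported LC_CTYPE name counts as a legacy locale."""
    return name in LEGACY_LOCALE_NAMES


def legacy_locale_detected():
    """Apply the check to the current process's LC_CTYPE setting."""
    # Query only; this does not change the active locale.
    return is_legacy_locale(locale.setlocale(locale.LC_CTYPE))


if __name__ == "__main__":
    print("legacy locale detected:", legacy_locale_detected())
```

Comparing against both names means a platform that reports "POSIX" back verbatim is handled the same way as one that aliases it to "C".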
msg314648 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-03-29 13:48
Thanks for the heads-up, Nick.  BTW, some of the discussion in the languishing Issue18378 might be relevant here.
msg314652 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-29 14:43
The locale module has its own extra layer of oddities that I don't personally understand - #29571 and #20087 are another couple of issues along those lines.
msg326710 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-09-30 07:14
Putting back to normal, as the difference between the C locale and the POSIX locale is that you never get the latter by default - you have to explicitly request it.

The underlying fix for this is in the PR for bpo-34589.
msg354503 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-11 21:35
Since this issue was created, I have deeply reworked Python initialization with PEP 587, and I have made many changes related to locales and the UTF-8 Mode (PEP 540). What's the status of this issue?
msg354994 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-10-20 12:26
There are a couple of cases that the C locale coercion tests skip because I don't (or didn't) know what they *should* do:

* https://github.com/python/cpython/blob/24dc2f8c56697f9ee51a4887cf0814b6600c1815/Lib/test/test_c_locale_coercion.py#L262 (skips the "LANG=UTF-8" test)
* https://github.com/python/cpython/blob/24dc2f8c56697f9ee51a4887cf0814b6600c1815/Lib/test/test_c_locale_coercion.py#L37 (only adds "POSIX" to the expected C locale equivalents list on non-Android Linux systems)

With the interpreter explicitly checking for "POSIX" now, at least the latter special case could potentially be removed, with "POSIX" just always being part of the EXPECTED_C_LOCALE_EQUIVALENTS list. (It was only removed because it used to break on FreeBSD and Mac OS X)
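
As a hypothetical sketch of that simplification (the real test module builds a module-level list, and this helper with its parameters is invented purely to contrast the two behaviours):

```python
import sys


def expected_c_locale_equivalents(platform=sys.platform, is_android=False,
                                  always_include_posix=False):
    """Build the list of locale names expected to behave like "C".

    With always_include_posix=False this mirrors the special case described
    above ("POSIX" only on non-Android Linux). With True it shows the
    proposed simplification where "POSIX" is always included.
    """
    equivalents = ["C"]
    linux_non_android = platform.startswith("linux") and not is_android
    if always_include_posix or linux_non_android:
        equivalents.append("POSIX")
    return equivalents


if __name__ == "__main__":
    print(expected_c_locale_equivalents("darwin"))
    print(expected_c_locale_equivalents("darwin", always_include_posix=True))
```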

I don't think we ever figured out why the "LANG=UTF-8" case didn't work the way we expected, so I suspect we're going to need to keep that skip for now.
msg354995 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-10-20 12:27
(Removed the patch keyword, as the linked PR was for an old change that didn't cover the remaining test issues)
History
Date                 User       Action  Args
2019-10-20 12:27:26  ncoghlan   set     messages: + msg354995
2019-10-20 12:26:38  ncoghlan   set     messages: + msg354994
2019-10-11 21:35:15  vstinner   set     nosy: + vstinner
                                        messages: + msg354503
2018-10-01 10:23:12  vstinner   set     nosy: - vstinner
2018-09-30 07:22:56  ncoghlan   set     keywords: + patch
                                        stage: test needed -> patch review
                                        pull_requests: + pull_request9033
2018-09-30 07:14:00  ncoghlan   set     priority: critical -> normal
                                        dependencies: + Py_Initialize() and Py_Main() should not enable C locale coercion
                                        messages: + msg326710
2018-03-29 14:43:28  ncoghlan   set     messages: + msg314652
2018-03-29 13:48:49  ned.deily  set     messages: + msg314648
2018-03-29 13:43:08  ncoghlan   set     priority: normal -> critical
                                        messages: + msg314646
2018-03-29 01:23:38  ncoghlan   unlink  issue28180 dependencies
2017-12-07 00:45:21  ncoghlan   set     messages: + msg307784
2017-12-07 00:42:10  ncoghlan   link    issue32238 superseder
2017-06-18 04:09:06  ncoghlan   set     assignee: ncoghlan
2017-06-18 01:26:26  ncoghlan   set     dependencies: + CODESET error on AMD64 FreeBSD 10.x Shared 3.x caused by the PEP 538
                                        messages: + msg296250
2017-06-18 01:15:00  ncoghlan   set     messages: + msg296248
                                        title: PEP 538: Unexpected locale behaviour on Mac OS X -> PEP 538: Unexpected locale behaviour on *BSD (including Mac OS X)
2017-06-17 14:12:11  ncoghlan   set     messages: + msg296239
2017-06-15 09:32:09  ncoghlan   link    issue28180 dependencies
2017-06-15 09:23:53  ncoghlan   create