classification
Title: PEP 538: silence locale coercion and compatibility warnings by default?
Type: enhancement Stage: resolved
Components: Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: barry, inada.naoki, ncoghlan, vstinner
Priority: normal Keywords:

Created on 2017-06-04 09:57 by ncoghlan, last changed 2017-06-18 02:32 by ncoghlan. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 2186 closed vstinner, 2017-06-14 14:25
PR 2260 merged ncoghlan, 2017-06-17 06:42
Messages (13)
msg295120 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-04 09:57
This is a follow-up to PEP 538 that reflects the uncertainty over whether or not the warning on *successful* implicit locale coercion is a good idea.

The argument for this warning is that it alerts redistributors, system integrators, and application developers to the fact that LC_CTYPE may not be what they expect it to be.

The argument against it is that in many, and potentially even most, cases where the warning is emitted, there won't be any subsequent problems, so the warning qualifies as a false alarm (especially for end users of applications that just happen to be written in Python). The PEP 538 section in the 3.7 What's New, together with the fact that "LC_CTYPE=C.UTF-8" (or similar) appears in the process environment, can be considered sufficient notice of the change for debugging purposes.

The initial PEP 538 implementation at https://github.com/python/cpython/pull/659 includes the warning; this issue reflects the possibility that we may decide to reverse that decision prior to the release of Python 3.7.0.
msg295881 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-13 09:36
My latest proposal on python-dev:

- no warning by default on successful coercion
- set "PYTHONCOERCECLOCALE=warn" to enable the warning when it's considered a configuration error in the runtime environment
msg295882 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-13 09:39
Some relevant mailing list threads on the usability problems posed by issuing the warning by default:

python-dev (impact on developer experience): https://mail.python.org/pipermail/python-dev/2017-June/148323.html

Fedora's python-devel (build environment warnings due to F26 backport of PEP 538): https://lists.fedorahosted.org/archives/list/python-devel@lists.fedoraproject.org/thread/VSEGOF76XMBJOAO4C2MORNK3I2VIPOTU/
msg296006 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-14 14:15
The warnings caused around 27 test failures on the "AMD64 FreeBSD CURRENT Non-Debug 3.x" buildbot:

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/438/steps/test/logs/stdio

27 tests failed again:
    test_asyncio test_base64 test_c_locale_coercion test_capi
    test_cmd_line test_cmd_line_script test_compileall
    test_concurrent_futures test_doctest test_faulthandler
    test_file_eintr test_inspect test_io test_json test_keyword
    test_module test_multiprocessing_fork
    test_multiprocessing_forkserver test_multiprocessing_main_handling
    test_subprocess test_symbol test_sys test_threading test_tools
    test_traceback test_venv test_warnings
msg296007 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-14 14:27
I wrote a PR to remove PEP 538 warnings:
https://github.com/python/cpython/pull/2186

Reference (unpatched):

haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
Python runtime initialized with LC_CTYPE=C (a locale with default ASCII encoding), which may cause Unicode compatibility problems. Using C.UTF-8, C.utf8, or UTF-8 (if available) as alternative Unicode-compatible locales is recommended.
ANSI_X3.4-1968

haypo@selma$ env -i LC_CTYPE=C ./python -c "import locale; print(locale.getpreferredencoding())"
Python detected LC_CTYPE=C: LC_CTYPE coerced to C.UTF-8 (set another locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behavior).
UTF-8

haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
Python runtime initialized with LC_CTYPE=C (a locale with default ASCII encoding), which may cause Unicode compatibility problems. Using C.UTF-8, C.utf8, or UTF-8 (if available) as alternative Unicode-compatible locales is recommended.
ANSI_X3.4-1968


With my change:

haypo@selma$ env -i ./python -c "import locale; print(locale.getpreferredencoding())"
UTF-8

haypo@selma$ env -i LC_CTYPE=C ./python -c "import locale; print(locale.getpreferredencoding())"
UTF-8

haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
ANSI_X3.4-1968
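
The comparison above can also be scripted. The following is a minimal sketch, not part of the PR: the helper name `preferred_encoding` and the list of variables cleared from the environment are assumptions made for illustration. It starts a child interpreter with a chosen set of locale variables and reports what `locale.getpreferredencoding()` returns there. Results depend on the Python version, the platform, and which locales are installed, so no expected output is shown.

```python
import os
import subprocess
import sys

# Variables cleared before each run so the parent's settings do not leak in
# (an assumption for this sketch; "env -i" in the transcript clears everything).
LOCALE_VARS = ("LANG", "LANGUAGE", "LC_ALL", "LC_CTYPE",
               "PYTHONCOERCECLOCALE", "PYTHONUTF8")


def preferred_encoding(**overrides):
    """Return locale.getpreferredencoding() as seen by a child interpreter."""
    env = {k: v for k, v in os.environ.items() if k not in LOCALE_VARS}
    env.update(overrides)
    proc = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding())"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    # Same three cases as the "With my change" transcript.
    for overrides in ({}, {"LC_CTYPE": "C"}, {"LC_ALL": "C"}):
        label = overrides or "(no locale variables)"
        print(label, "->", preferred_encoding(**overrides))
```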
msg296057 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-15 02:28
As Victor notes above, for systems where no suitable coercion target locale is available, even the "unsupported locale" warning is an issue, since it's only full Unicode text handling that's unsupported in such locales - you can still process ASCII-only and non-text data just fine in such environments (and even Unicode data if you're sufficiently careful).

As a matter of future-proofing CPython, we also need to account for the fact that some C implementations (potentially including future versions of glibc) may start using UTF-8 as the default text encoding in the C locale itself (requiring the use of variants like C.ASCII or C.latin-1 to opt in to the use of legacy text encodings).

I still think the warnings are potentially valuable for system integrators and operating system developers trying to ensure that C.UTF-8 is genuinely pervasive as their default locale, so my counter-proposal to Victor's PR is:

* remove the build-time warning flag
* gate the warnings for both successful and unsuccessful locale coercion behind a runtime check for PYTHONCOERCECLOCALE=warn

That means:

* we provide the best possible developer experience we can by default (i.e. no visible warnings, we assume UTF-8 where possible)
* redistributors have a relatively simple patch to always emit the warnings if they want to do so (i.e. remove the conditional checks)
* system integrators and folks debugging application misbehaviour can readily enable the warnings at runtime as part of fault investigation activities
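
To make the proposed gating concrete, here is a small Python model of the decision logic. It is an illustrative sketch only: the real check lives in CPython's C startup code, and the function name `plan_startup`, its parameters, and the message strings are hypothetical. The candidate target list is taken from the warning text quoted earlier in this thread, and the LC_ALL bypass reflects the behaviour shown in the transcripts above.

```python
LEGACY_LOCALES = {"C", "POSIX"}
# Candidate targets as listed in the warning text quoted earlier in the thread.
COERCION_TARGETS = ("C.UTF-8", "C.utf8", "UTF-8")


def plan_startup(env, current_ctype, available_locales):
    """Model the proposal: return (resulting LC_CTYPE, warnings to emit).

    env               -- mapping of environment variables
    current_ctype     -- LC_CTYPE locale reported at startup
    available_locales -- locale names the platform can actually set
    """
    setting = env.get("PYTHONCOERCECLOCALE")
    warn = setting == "warn"          # warnings are opt-in at runtime
    messages = []
    new_ctype = current_ctype

    coercion_allowed = setting != "0" and not env.get("LC_ALL")
    if coercion_allowed and current_ctype in LEGACY_LOCALES:
        for target in COERCION_TARGETS:
            if target in available_locales:
                new_ctype = target
                if warn:
                    messages.append("LC_CTYPE coerced to " + target)
                break

    # Compatibility warning: a legacy locale is still active after the attempt.
    if warn and new_ctype in LEGACY_LOCALES:
        messages.append("legacy locale " + new_ctype + " still active")
    return new_ctype, messages
```

With this model, a redistributor who wants the warnings unconditionally would only need to drop the `warn` conditions, which is the "relatively simple patch" mentioned above.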
msg296078 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-15 09:33
Updated issue title to reflect the fact we're now considering just silencing *all* the warnings by default.
msg296238 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2017-06-17 12:56
If these warnings are disabled by default, who will enable them?
How about just removing them?

I'm OK with removing them all.
Even if that's not ideal, nothing gets worse than in Python 3.6.

Additionally, if PEP 540 is accepted, we can use UTF-8 for
stdio and the filesystem encoding even when there is no UTF-8 locale.
msg296240 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-17 14:52
By having the warnings always compiled in, but off by default, "PYTHONCOERCECLOCALE=warn" becomes a debugging tool for integration failures that we (or end users) suspect might be due to the locale coercion behaviour. It's essentially an even more esoteric variant of tools like "PYTHONINSPECT=1", "python -X faulthandler", "python -v", etc.

I already learned something myself based on updating the test cases to cover the opt-in warning model: I initially thought that when we set "LC_ALL=C" we'd get *both* warnings, but we don't (at least with glibc).

My working theory is that setting 'LC_ALL=C' causes 'setlocale(LC_CTYPE, "C.UTF-8")' to fail, even if there's otherwise a C.UTF-8 locale available. (I haven't conclusively proven that theory, but it's the most likely point for the locale coercion to be implicitly bypassed)
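
One way to probe that theory from Python is sketched below. This is an assumption-laden experiment rather than a proof: the helper name `probe_setlocale` is hypothetical, and an explicit `locale.setlocale()` call made from Python code may not behave the same way as the call the interpreter makes during startup. The child process is run with PYTHONCOERCECLOCALE=0 so that the interpreter's own coercion does not interfere with the result.

```python
import os
import subprocess
import sys

# Code run in the child: try to switch LC_CTYPE to C.UTF-8 and report the result.
PROBE = r'''
import locale
try:
    print("ok:", locale.setlocale(locale.LC_CTYPE, "C.UTF-8"))
except locale.Error as exc:
    print("failed:", exc)
'''

LOCALE_VARS = ("LANG", "LANGUAGE", "LC_ALL", "LC_CTYPE",
               "PYTHONCOERCECLOCALE", "PYTHONUTF8")


def probe_setlocale(lc_all=None):
    """Run the probe in a child interpreter, optionally with LC_ALL set."""
    env = {k: v for k, v in os.environ.items() if k not in LOCALE_VARS}
    env["PYTHONCOERCECLOCALE"] = "0"   # keep startup coercion out of the picture
    if lc_all is not None:
        env["LC_ALL"] = lc_all
    proc = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    print("LC_ALL unset:", probe_setlocale())
    print("LC_ALL=C:    ", probe_setlocale("C"))
```

If both runs report "ok:", the bypass is more likely to happen somewhere other than the `setlocale` call itself.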
msg296246 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-17 22:25
As I wrote on python-dev, I would prefer no warning and no option to enable
warnings. But it's not my PEP, I would prefer that Nick makes a choice
here. Right now, my main worry is that my little (freebsd) buildbots are
still sick
msg296249 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-18 01:24
OK, based on the latest round of custom buildbot results (e.g. http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Debug%20custom/builds/12/steps/test/logs/stdio ), it looks like the main remaining problems are those covered by issue 30672, where *BSD platforms handle some locales differently from the way Linux handles them (and Mac OS X then inherits those differences), and issue 30647 (where attempting to use the UTF-8 locale can cause nl_langinfo to fail).

So for the latest iteration on *this* change, I'm going to do the following:

1. Disable "UTF-8" as a candidate target locale
2. Adjust the test suite's expectations accordingly
msg296252 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-18 02:29
New changeset eb81795d7d3a8c898fa89a376d63fc3bbfb9a081 by Nick Coghlan in branch 'master':
bpo-30565: Add PYTHONCOERCECLOCALE=warn runtime flag (GH-2260)
https://github.com/python/cpython/commit/eb81795d7d3a8c898fa89a376d63fc3bbfb9a081
msg296253 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-18 02:32
OK, PEP 538 is effectively disabled on FreeBSD and Mac OS X now, and the locale coercion and compatibility warnings are off by default everywhere.

PYTHONCOERCECLOCALE=warn is explicitly documented as a debugging tool for use when folks already suspect that locale coercion or legacy locale incompatibility might be to blame for particular problems that they're seeing.
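
As a rough illustration of that debugging workflow, the sketch below starts a child interpreter with LC_CTYPE=C and PYTHONCOERCECLOCALE=warn and collects whatever it writes to stderr. The helper name `coercion_warnings` is hypothetical, and whether any message appears depends on the Python version, the platform, and the locales installed, so the output is not shown here.

```python
import os
import subprocess
import sys

LOCALE_VARS = ("LANG", "LANGUAGE", "LC_ALL", "LC_CTYPE",
               "PYTHONCOERCECLOCALE", "PYTHONUTF8")


def coercion_warnings(ctype="C"):
    """Return the stderr text of a child interpreter run with the warn flag set."""
    env = {k: v for k, v in os.environ.items() if k not in LOCALE_VARS}
    env["LC_CTYPE"] = ctype
    env["PYTHONCOERCECLOCALE"] = "warn"   # opt in to the diagnostic messages
    proc = subprocess.run(
        [sys.executable, "-c", "pass"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.stderr.strip()


if __name__ == "__main__":
    text = coercion_warnings()
    print(text if text else "(no warnings emitted)")
```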
History
Date User Action Args
2017-06-18 02:32:29 ncoghlan set status: open -> closed
resolution: fixed
messages: + msg296253
stage: resolved
2017-06-18 02:29:44 ncoghlan set messages: + msg296252
2017-06-18 01:24:57 ncoghlan set messages: + msg296249
2017-06-17 22:25:04 vstinner set messages: + msg296246
2017-06-17 14:52:59 ncoghlan set messages: + msg296240
2017-06-17 12:56:30 inada.naoki set messages: + msg296238
2017-06-17 06:42:40 ncoghlan set pull_requests: + pull_request2310
2017-06-17 05:29:10 ncoghlan set nosy: + inada.naoki
2017-06-15 09:33:02 ncoghlan set messages: + msg296078
title: PEP 538: default to skipping warning for implicit locale coercion? -> PEP 538: silence locale coercion and compatibility warnings by default?
2017-06-15 09:32:09 ncoghlan link issue28180 dependencies
2017-06-15 02:28:21 ncoghlan set messages: + msg296057
2017-06-14 14:27:14 vstinner set messages: + msg296007
2017-06-14 14:25:57 vstinner set pull_requests: + pull_request2235
2017-06-14 14:15:13 vstinner set nosy: + vstinner
messages: + msg296006
2017-06-13 09:39:46 ncoghlan set messages: + msg295882
2017-06-13 09:36:59 ncoghlan set messages: + msg295881
2017-06-04 11:58:51 barry set nosy: + barry
2017-06-04 09:57:45 ncoghlan create