classification
Title: PYTHONCOERCECLOCALE is ignored when using -E or -I option
Type:
Stage: resolved
Components: Interpreter Core
Versions: Python 3.8, Python 3.7
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Py_Initialize() and Py_Main() should not enable C locale coercion (bpo-34589)
Assigned To:
Nosy List: ncoghlan, vstinner
Priority: normal
Keywords:

Created on 2018-09-11 23:47 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg325096 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-11 23:47
I modified Py_Main() to ignore the PYTHONCOERCECLOCALE environment variable if the -E or -I command line option is used, but Nick asks that PYTHONCOERCECLOCALE always be read.

We should either update the PEP or change the code.

I am not sure why PYTHONCOERCECLOCALE should be handled differently from other PYTHON* variables such as PYTHONWARNINGS or PYTHONUTF8. Is it because it impacts the encodings? Is it because there was a chicken-and-egg issue before I reworked the Py_Main() code? (The PYTHONCOERCECLOCALE env var was read before the command line arguments were parsed.)
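As a rough illustration of how -E and -I treat an ordinary PYTHON* variable, the sketch below starts child interpreters with PYTHONWARNINGS set and compares sys.warnoptions with and without those flags. The helper name child_warnoptions and the filter value are made up for this example, and it does not exercise PYTHONCOERCECLOCALE itself.

```python
import json
import os
import subprocess
import sys

# Arbitrary filter chosen for illustration; any valid -W style value would do.
FILTER = "ignore::ResourceWarning"


def child_warnoptions(*flags):
    """Return sys.warnoptions of a child interpreter started with `flags`.

    The child always receives PYTHONWARNINGS=FILTER in its environment.
    (Hypothetical helper, not part of CPython.)
    """
    env = dict(os.environ, PYTHONWARNINGS=FILTER)
    code = "import json, sys; print(json.dumps(sys.warnoptions))"
    result = subprocess.run(
        [sys.executable, *flags, "-c", code],
        env=env, check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Compare a plain run against runs with -E and -I.
    for flags in ((), ("-E",), ("-I",)):
        print(flags or "(no flag)", child_warnoptions(*flags))
```

With no flag the filter should show up in the child's sys.warnoptions, while -E and -I should drop it. That is the behaviour this issue proposes to apply to PYTHONCOERCECLOCALE as well.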

--

Copy of Nick Coghlan's msg325009:

(The one exception to "nothing gets decoded incorrectly" is that PYTHONCOERCECLOCALE itself is always interpreted as an ASCII field: the values that it checks for are actually ASCII byte sequences, not Unicode code points.

The documentation could definitely be much clearer on that point though, as even in the PEP it's only implied by the final paragraph in https://www.python.org/dev/peps/pep-0538/#legacy-c-locale-coercion-in-the-standalone-python-interpreter-binary which is mostly talking about the fact that this particular environment variable is still checked, even if you pass the -I or -E command line options.)
msg325240 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2018-09-13 11:51
Respecting -E and -I isn't a problem per se - the problem is moving the _Py_CoerceLegacyLocale call to a point that's incredibly late in the startup process *just* to get it to respect those flags.

I don't actually mind if you reinstate the extra pass through the command line arguments just to check for -E and -I early enough for it to affect a properly located call to _Py_CoerceLegacyLocale, I just don't think it's necessary to do so.
msg325244 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2018-09-13 13:42
For the PYTHONCOERCECLOCALE=warn case, it turns out that my preferred approach to implementing bpo-34589 also naturally ends up respecting -I and -E for that (i.e. supplying -I or -E will suppress the warning).

However, my upcoming PR for that also reinstates and expands on my original comment that explained why it was valuable to support "PYTHONCOERCECLOCALE=0 python3 -E ..." and "PYTHONCOERCECLOCALE=0 python3 -I ...": so you can readily reproduce the way that locale coercion behaves on a platform *without* a suitable target locale (e.g. CentOS 7), even if your current platform actually does have such a locale available (e.g. Fedora).
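A minimal sketch of that kind of reproduction, assuming a Python 3.7+ interpreter: it starts a child with LANG=C (and LC_ALL/LC_CTYPE removed), optionally sets PYTHONCOERCECLOCALE, and reports what the child ends up with. The helper name locale_report and the report keys are invented for illustration.

```python
import json
import os
import subprocess
import sys

# Code run in the child: report the LC_CTYPE env var and filesystem encoding.
CHILD_CODE = (
    "import json, os, sys; "
    "print(json.dumps({"
    "'LC_CTYPE': os.environ.get('LC_CTYPE'), "
    "'fs_encoding': sys.getfilesystemencoding()"
    "}))"
)

# Variables stripped from the inherited environment so the child starts
# from a plain "C" locale request.
STRIPPED = ("LC_ALL", "LC_CTYPE", "LANG", "PYTHONCOERCECLOCALE", "PYTHONUTF8")


def locale_report(coerce=None, flags=()):
    """Run a child interpreter under LANG=C and describe its locale setup.

    `coerce` is the value for PYTHONCOERCECLOCALE (None leaves it unset);
    `flags` are extra interpreter options such as ("-E",) or ("-I",).
    (Hypothetical helper, not part of CPython.)
    """
    env = {k: v for k, v in os.environ.items() if k not in STRIPPED}
    env["LANG"] = "C"
    if coerce is not None:
        env["PYTHONCOERCECLOCALE"] = coerce
    result = subprocess.run(
        [sys.executable, *flags, "-c", CHILD_CODE],
        env=env, check=True, capture_output=True, text=True,
    )
    report = json.loads(result.stdout)
    # Any coercion warning (PYTHONCOERCECLOCALE=warn) is written to stderr.
    report["stderr"] = result.stderr.strip()
    return report


if __name__ == "__main__":
    for coerce in (None, "0", "warn"):
        for flags in ((), ("-E",), ("-I",)):
            print(coerce, flags, locale_report(coerce, flags))
```

No expected output is shown because the results depend on the platform's available locales and on the Python version. In particular, whether PYTHONCOERCECLOCALE=0 still takes effect under -E or -I is exactly the point under discussion here.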
msg325276 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-13 18:27
> Respecting -E and -I isn't a problem per se - the problem is moving the _Py_CoerceLegacyLocale call to a point that's incredibly late in the startup process *just* to get it to respect those flags.

Would you mind elaborating on why that is an issue? The LC_CTYPE locale is coerced *before* the configuration is read a second time. If there is an issue, could you show an example where something is decoded from the wrong encoding?
msg325816 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-19 23:05
The discussion moved back to bpo-34589.
History
Date                 User      Action  Args
2022-04-11 14:59:05  admin     set     github: 78820
2018-09-19 23:05:10  vstinner  set     status: open -> closed; superseder: Py_Initialize() and Py_Main() should not enable C locale coercion; messages: + msg325816; resolution: duplicate; stage: resolved
2018-09-13 18:27:12  vstinner  set     messages: + msg325276
2018-09-13 13:42:09  ncoghlan  set     messages: + msg325244
2018-09-13 11:51:17  ncoghlan  set     messages: + msg325240
2018-09-12 08:31:46  vstinner  set     nosy: + ncoghlan
2018-09-11 23:47:42  vstinner  create