classification
Title: Check for systemd locale on startup if current locale is set to POSIX
Type: enhancement
Stage:
Components: Interpreter Core
Versions: Python 3.4, Python 3.5
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.araujo, martin.panter, ncoghlan, vstinner
Priority: normal
Keywords:

Created on 2014-04-27 19:30 by ncoghlan, last changed 2017-12-18 14:36 by vstinner. This issue is now closed.

Messages (6)
msg217313 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-04-27 19:30
Issue 19977 added "surrogateescape" to the fallback settings for the standard streams when Python 3 appears to be running under the POSIX locale. Python 3 currently reads that locale as setting a default encoding of ASCII, which is almost certainly wrong on any modern Linux system.
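
For readers unfamiliar with the "surrogateescape" error handler mentioned here, the following is a minimal sketch of what it does when non-ASCII bytes are decoded as ASCII. The sample byte string is an illustrative assumption, not taken from Issue 19977.

```python
# UTF-8 bytes for "café"; the last two bytes are not valid ASCII.
raw = b"caf\xc3\xa9"

# With surrogateescape, each undecodable byte 0xNN is mapped to the
# lone surrogate U+DCNN instead of raising UnicodeDecodeError.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into the
# original bytes, so the data survives a round trip unchanged.
restored = text.encode("ascii", errors="surrogateescape")

print(ascii(text))
print(restored == raw)
```

The round trip is the point: the text may display incorrectly, but the original bytes are not lost.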

If a modern Linux system is using systemd as the process manager, there will likely be an "/etc/locale.conf" file providing settings like LANG. Due to problematic requirements in the POSIX specification, this file (when available) is likely to be a better "source of truth" regarding the system encoding than the environment where the interpreter process is started, at least when the latter is claiming ASCII as the default encoding.

See http://www.freedesktop.org/software/systemd/man/locale.conf.html for more details.
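
To make the proposal concrete, here is a rough sketch of what reading that file could look like. This is not an implementation from CPython; `parse_locale_conf` and `read_system_lang` are hypothetical helper names, and the parsing assumes only simple `KEY=VALUE` lines with optional quotes and `#` comments.

```python
def parse_locale_conf(text):
    """Parse locale.conf-style text into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE assignment
        # Drop surrounding whitespace and simple quoting.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def read_system_lang(path="/etc/locale.conf"):
    """Return the LANG setting from the file, or None if unavailable."""
    try:
        with open(path, encoding="utf-8", errors="surrogateescape") as f:
            return parse_locale_conf(f.read()).get("LANG")
    except OSError:
        return None


sample = 'LANG="en_US.UTF-8"\n# a comment\n\nLC_MESSAGES=C\n'
print(parse_locale_conf(sample))
print(read_system_lang())
```

On a system without the file, `read_system_lang()` simply returns None, so a caller would fall back to the process environment.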
msg217328 - Author: STINNER Victor (vstinner) (Python committer) Date: 2014-04-27 23:26
I don't think that Python should read such a configuration file. If you consider that something is wrong here, please report the issue to the C library.
msg217776 - Author: Stefan Krah (skrah) (Python committer) Date: 2014-05-02 19:11
Why is the default encoding of POSIX wrong on a modern Linux system?
Today I installed Debian testing, and the first question of the
installer is to choose between "C" and "English" locales for the
install. This comes with the remark that the chosen locale will be
the default system locale.
msg217813 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2014-05-03 04:29
That's the problem - even on UTF-8 systems, Linux programs will often be
started in a misconfigured environment: the POSIX locale. Python 2 doesn't
try to interpret the binary data as text, so it doesn't care. Python 3
*does* care, since it automatically converts several pieces of OS-provided
data to text using the locale encoding.
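
As a small illustration of that point, the sketch below prints the encodings the interpreter has derived from its environment. The helper name `describe_encodings` is hypothetical, the output depends entirely on how the process was started, and `sys.getfilesystemencodeerrors()` assumes Python 3.6 or later.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python derived from the current environment."""
    return {
        # Encoding implied by the locale, without changing the locale.
        "locale": locale.getpreferredencoding(False),
        # Encoding and error handler used for file names, argv, environ.
        "filesystem": sys.getfilesystemencoding(),
        "fs_errors": sys.getfilesystemencodeerrors(),
        # Standard stream settings (the stream may have been replaced).
        "stdout": getattr(sys.stdout, "encoding", None),
        "stdout_errors": getattr(sys.stdout, "errors", None),
    }


for name, value in describe_encodings().items():
    print(name, "=", value)
```

Running this once normally and once with `LANG=C` in the environment is a quick way to see how much the reported values depend on the locale.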

systemd tackles that by adding the extra config file to ensure *all* the
environments it creates get the right config, and to provide a more
reliable "source of truth" as to the actual likely encoding of system
interfaces.

However, a fair bit of groundwork is needed to avoid any innate reliance on
the locale encoding before we can reliably override it by default, so this
issue is unlikely to go anywhere before PEP 432 is implemented (and even
then, actually changing the behaviour would be a separate discussion).
msg282985 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2016-12-12 11:51
While this is still a problem I'm interested in solving, I no longer think reading the systemd locale config file would be a good way to address it.

See issue 28180 for a more recent discussion of some other alternatives.
msg308566 - Author: STINNER Victor (vstinner) (Python committer) Date: 2017-12-18 14:36
Follow-up: PEP 538 (bpo-28180) and PEP 540 (bpo-29240) have been accepted and implemented in Python 3.7!
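
For anyone landing here later, a quick way to see the PEP 540 behaviour is to start a child interpreter with `-X utf8` and inspect `sys.flags.utf8_mode`. This is only a sketch that assumes Python 3.7 or later; the helper name `utf8_mode_in_child` is hypothetical.

```python
import subprocess
import sys


def utf8_mode_in_child(extra_args=()):
    """Run a child interpreter and report its sys.flags.utf8_mode value."""
    cmd = [
        sys.executable,
        *extra_args,
        "-c",
        "import sys; print(sys.flags.utf8_mode)",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Explicitly enabling UTF-8 mode on the command line.
print(utf8_mode_in_child(["-X", "utf8"]))
# Default behaviour, which depends on the environment and locale.
print(utf8_mode_in_child())
```

The same mode can also be requested with the `PYTHONUTF8=1` environment variable.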
History
Date | User | Action | Args
2017-12-18 14:36:59 | vstinner | set | messages: + msg308566
2016-12-12 11:51:23 | ncoghlan | set | status: open -> closed; type: enhancement; resolution: rejected; messages: + msg282985
2014-10-14 16:29:50 | skrah | set | nosy: - skrah
2014-05-03 04:29:15 | ncoghlan | set | messages: + msg217813
2014-05-02 19:11:02 | skrah | set | nosy: + skrah; messages: + msg217776
2014-05-02 18:01:54 | eric.araujo | set | nosy: + eric.araujo
2014-05-02 03:01:06 | martin.panter | set | nosy: + martin.panter
2014-04-27 23:26:39 | vstinner | set | nosy: + vstinner; messages: + msg217328
2014-04-27 19:30:25 | ncoghlan | create |