Author ncoghlan
Recipients Jan Niklas Hasse, abarry, ezio.melotti, inada.naoki, lemburg, ncoghlan, r.david.murray, vstinner
Date 2016-12-12.11:49:23
Message-id <1481543363.76.0.276824936753.issue28180@psf.upfronthosting.co.za>
In-reply-to
Content
The challenge that arises in being selective about this is that "sys.getfilesystemencoding()" is actually a misnomer, and some of the things we use it for (like decoding command line arguments and environment variables) necessarily happen *really* early in the interpreter bootstrapping process. The bugs that arise from being internally inconsistent are then even harder to debug than those that arise from believing the OS when it says the right encoding to use is ASCII - the latter at least don't tend to be subtle, and are amenable to being resolved via "LC_ALL=C.UTF-8" and "LANG=C.UTF-8".
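As a rough illustration of how the reported locale feeds into that setting, the sketch below starts a child interpreter under a chosen locale and prints what "sys.getfilesystemencoding()" reports there. The helper name filesystem_encoding_under is hypothetical, and the result depends on the Python version and platform: interpreters that already coerce the C locale may report UTF-8 in both cases, and a system without a C.UTF-8 locale may fall back to the C locale.

```python
import os
import subprocess
import sys


def filesystem_encoding_under(locale_name):
    """Return sys.getfilesystemencoding() as seen by a child interpreter.

    Hypothetical helper for illustration: the child is started with
    LC_ALL set to locale_name and the other locale variables removed.
    """
    env = dict(os.environ)
    # Drop inherited locale settings so only LC_ALL decides the outcome.
    for name in list(env):
        if name == "LANG" or name.startswith("LC_"):
            del env[name]
    env["LC_ALL"] = locale_name
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for name in ("C", "C.UTF-8"):
        print(name, "->", filesystem_encoding_under(name))
```

Because the encoding is fixed during interpreter startup, the comparison has to be made across separate processes rather than by changing the locale inside a running one.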

I believe Victor put quite a bit of time into trying to get more selective approaches to work reliably and eventually gave up.

For Fedora 26, I'm going to explore the feasibility of patching our system 3.6 installation such that the python3 command itself (rather than the shared library) checks for "LC_CTYPE=C" as almost the first thing it does, and forcibly sets LANG and LC_ALL to C.UTF-8 if it gets an answer it doesn't like. If we're able to do that successfully in the more constrained environment of a specific recent Fedora release, then I think it will bode well for doing something similar by default in CPython 3.7.
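A minimal sketch of the decision such a check might make, written in Python purely for illustration (the real change would live in the C startup code of the python3 command, and this is not the actual patch). The function names effective_ctype and coerce_c_locale are hypothetical. The sketch assumes the usual POSIX precedence of LC_ALL over LC_CTYPE over LANG, and treats an unset or empty result as the C locale.

```python
def effective_ctype(env):
    """Return the locale name that would govern LC_CTYPE for this environment.

    Assumes POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Empty
    values are treated as unset; with nothing set the locale is "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def coerce_c_locale(env):
    """Return a copy of env with LANG and LC_ALL forced to C.UTF-8
    when the effective LC_CTYPE is the plain C (or POSIX) locale."""
    new_env = dict(env)
    if effective_ctype(env) in ("C", "POSIX"):
        new_env["LC_ALL"] = "C.UTF-8"
        new_env["LANG"] = "C.UTF-8"
    return new_env


if __name__ == "__main__":
    print(coerce_c_locale({"LANG": "C"}))
    print(coerce_c_locale({"LANG": "en_US.UTF-8"}))
```

An environment that already selects a non-C locale is passed through unchanged, so only the "answer it doesn't like" case is rewritten.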