Author ncoghlan
Recipients Jan Niklas Hasse, abarry, ezio.melotti, inada.naoki, ncoghlan, r.david.murray, vstinner
Date 2016-12-12.05:26:54
Message-id <1481520415.25.0.991854549409.issue28180@psf.upfronthosting.co.za>
In-reply-to
Content
I think we're genuinely getting to the point now where the majority of "LANG=C" cases are misconfigurations rather than intended behaviour. We're also at the point where:

- on Mac OS X, binary system interfaces have been handled as UTF-8 by default since 3.0
- on Windows, as of 3.6, the OS native binary system interfaces are now bypassed entirely in favour of transcoding from UTF-8 to UTF-16-LE 

So I think for Python 3.7 it makes sense to do the following on other *nix systems:

- very early in CPython startup (even before argument processing), if the detected locale is "C", force it to "C.UTF-8" if possible, and print a warning either way
- add a PYTHONKEEPASCIILOCALE environment variable to turn that behaviour off
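
As a rough illustration of the two bullets above, here is a Python-level sketch of the coercion logic. It is only an approximation: the proposal is for C code that runs before argument processing, whereas a Python function can only act after the interpreter has started. The function name, the list of candidate locale names, and the warning text are all assumptions made for the example; only the PYTHONKEEPASCIILOCALE variable comes from the proposal itself.

```python
import locale
import os
import sys

# Hypothetical candidate names; which UTF-8 locales exist varies by platform.
UTF8_CANDIDATES = ("C.UTF-8", "C.utf8", "UTF-8")


def coerce_c_locale(environ=None, warn=None):
    """Try to replace a "C" LC_CTYPE locale with a UTF-8 one.

    Returns the name of the LC_CTYPE locale in effect afterwards.
    """
    if environ is None:
        environ = os.environ
    if warn is None:
        warn = lambda msg: print(msg, file=sys.stderr)

    # Opt-out: leave the locale untouched when the variable is set.
    if environ.get("PYTHONKEEPASCIILOCALE"):
        return locale.setlocale(locale.LC_CTYPE, None)

    # Adopt whatever the process environment (LC_ALL, LC_CTYPE, LANG) asks for.
    try:
        detected = locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported locale requested; fall back to the current setting.
        detected = locale.setlocale(locale.LC_CTYPE, None)

    if detected not in ("C", "POSIX"):
        return detected  # an explicit non-C locale is respected

    for name in UTF8_CANDIDATES:
        try:
            coerced = locale.setlocale(locale.LC_CTYPE, name)
        except locale.Error:
            continue  # this candidate is not installed; try the next one
        warn(f"Python detected LC_CTYPE={detected}; coerced to {coerced}")
        return coerced

    # Warn "either way": no UTF-8 locale was available.
    warn(f"Python detected LC_CTYPE={detected}; no UTF-8 locale available")
    return detected
```

Calling `coerce_c_locale()` with no arguments reads the opt-out variable from `os.environ` and writes any warning to stderr; the `environ` and `warn` parameters exist only so the behaviour can be exercised without touching the real environment.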

I do think we actually want to *change* the C level locale in the process though, as otherwise we can expect to see weird interactions where CPython and extension modules disagree about the default text encoding.
History
Date User Action Args
2016-12-12 05:26:55 ncoghlan set recipients: + ncoghlan, vstinner, ezio.melotti, r.david.murray, inada.naoki, abarry, Jan Niklas Hasse
2016-12-12 05:26:55 ncoghlan set messageid: <1481520415.25.0.991854549409.issue28180@psf.upfronthosting.co.za>
2016-12-12 05:26:55 ncoghlan link issue28180 messages
2016-12-12 05:26:54 ncoghlan create