
Author ncoghlan
Recipients Sworddragon, larry, lemburg, loewis, ncoghlan, pitrou, r.david.murray, terry.reedy, vstinner
Date 2013-12-08.11:16:01
Yes, that's the point. *Every* case I've seen where the locale encoding has been reported as ASCII on a modern Linux system has been because the environment has been configured to use the C locale, and that locale has a silly, antiquated encoding setting.

This is particularly problematic when people remotely access a system with ssh and get given the C locale instead of something sensible, and then can't properly read the filesystem on that server.
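
As a rough illustration of what "can't properly read the filesystem" looks like, here is a minimal sketch. It assumes a file name stored as UTF-8 bytes on disk and decodes it with the surrogateescape error handler that Python 3 applies to file names. The names `raw_name`, `decode_filename` and `is_displayable` are made up for this example.

```python
# Bytes of a file name as a UTF-8 filesystem would store them (assumed example).
raw_name = "caf\u00e9.txt".encode("utf-8")


def decode_filename(raw, encoding):
    """Decode file name bytes using the surrogateescape error handler."""
    return raw.decode(encoding, "surrogateescape")


def is_displayable(name):
    """Return True if the decoded name can be strictly encoded for output."""
    try:
        name.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


as_utf8 = decode_filename(raw_name, "utf-8")    # decodes to the real name
as_ascii = decode_filename(raw_name, "ascii")   # non-ASCII bytes become lone surrogates

print(is_displayable(as_utf8), is_displayable(as_ascii))
```

The ASCII-decoded name still round-trips back to the original bytes, so the file can be opened, but the name itself cannot be printed or written out as ordinary text.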

The idea of using UTF-8 instead in that case is to *change* (and hopefully reduce) the number of cases where things go wrong.

- if no non-ASCII data is encountered, the choice of ASCII vs UTF-8 doesn't matter
- if it's a modern Linux distro, then the real filesystem encoding is UTF-8, and the setting it provides for LANG=C is just plain *wrong*
- there may be other cases where ASCII actually *is* the filesystem encoding (in which case they're going to have trouble anyway), or the real filesystem encoding is something other than UTF-8
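
A sketch of the idea, purely for illustration: `effective_fs_encoding` is a hypothetical helper, not an existing or proposed API, and it only shows the "treat reported ASCII as UTF-8" rule together with the first bullet's point that pure-ASCII data is unaffected.

```python
import codecs


def effective_fs_encoding(reported):
    """Hypothetical helper: map a reported ASCII encoding to UTF-8.

    Any other reported encoding is returned under its canonical codec name.
    """
    canonical = codecs.lookup(reported).name
    return "utf-8" if canonical == "ascii" else canonical


def same_bytes_either_way(name):
    """True if the name encodes identically as ASCII and as UTF-8."""
    try:
        return name.encode("ascii") == name.encode("utf-8")
    except UnicodeEncodeError:
        # Non-ASCII data: ASCII cannot encode it at all.
        return False
```

For example, `effective_fs_encoding("ANSI_X3.4-1968")` (the name glibc reports for the C locale) gives `"utf-8"`, while `same_bytes_either_way("README.txt")` is true and `same_bytes_either_way("caf\u00e9.txt")` is false.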

We're already approximating things on Linux by assuming every filesystem is using the *same* encoding, when that's not necessarily the case. Glib applications also assume UTF-8, regardless of the locale (

At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
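
To check what a given environment actually reports, something like the following can be run under the locale in question. This is only a sketch: `report_encodings` is a name invented here, and the result depends on the platform, the environment and the Python version.

```python
import codecs
import locale
import sys


def report_encodings():
    """Return the encodings Python has picked up from the environment."""
    return {
        # Encoding implied by the current locale settings.
        "locale": locale.getpreferredencoding(False),
        # Encoding Python uses to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for kind, name in report_encodings().items():
        # codecs.lookup() gives the canonical codec name, e.g. "ascii".
        print(kind, name, codecs.lookup(name).name)
```
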