Author eryksun
Recipients alexis, docs@python, eric.araujo, eryksun, feth, iritkatriel, terry.reedy
Date 2020-11-17.00:14:36
Message-id <1605572077.53.0.640080368875.issue12726@roundup.psfhosted.org>
In-reply-to
Content
> I tried "import locale; locale.getlocale()" on macOS and 
> windows (3.10) and linux (3.7) and in all cases I got 
> non-None values.  

On Windows, starting with Python 3.8, Python sets the LC_CTYPE locale to the user (not system) default locale instead of leaving it at the CRT's initial "C" locale. (This is possibly an unintended consequence of redesigning the interpreter startup code, but what's done is done.) The same has been done on POSIX going back to Python 3.1. It's not a significant change for the core interpreter and standard library, which do not use the LC_CTYPE encoding for much on Windows, but it might affect third-party code. Embedding applications can use an isolated configuration that doesn't modify LC_CTYPE.
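To see what the startup code left in place, a quick check along these lines should work. This is only a sketch: the variable names are mine, and the values printed depend on the platform, the Python version, and the user's settings.

```python
import locale

# Calling setlocale() without a second argument only queries the
# current LC_CTYPE setting; it does not change it.
ctype_setting = locale.setlocale(locale.LC_CTYPE)

# getlocale() parses that setting into (language code, encoding).
# It can raise ValueError for names it cannot parse, so guard it.
try:
    parsed = locale.getlocale(locale.LC_CTYPE)
except ValueError as exc:
    parsed = None
    print("getlocale() could not parse the setting:", exc)

print("LC_CTYPE setting:", ctype_setting)
print("getlocale(LC_CTYPE):", parsed)
```

If the setting is still "C", getlocale() should report (None, None); anything else suggests the locale was changed at startup or by the application.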

locale.getdefaultlocale() is not based on C setlocale() in Windows. It returns the language and region of the user locale from WinAPI GetLocaleInfo() paired with the process code page from WinAPI GetACP(). The latter is generally the same as the system code page, but possibly not in Windows 10 if the application manifest sets the process "activeCodePage" to UTF-8. (python.exe as distributed doesn't use the "activeCodePage" setting in its manifest, but an embedding application might.)
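A rough way to compare the two sources of information is sketched below. The helper name describe_default_locale is hypothetical, the GetACP() call goes through ctypes and only runs on Windows, and getdefaultlocale() is guarded because it is deprecated in recent Python versions and may not always be available.

```python
import locale
import sys
import warnings


def describe_default_locale():
    """Collect what getdefaultlocale() and, on Windows, GetACP() report."""
    info = {"platform": sys.platform}

    # getdefaultlocale() is deprecated in newer Python versions, so look
    # it up defensively and silence the deprecation warning if it exists.
    getdefault = getattr(locale, "getdefaultlocale", None)
    if getdefault is not None:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", DeprecationWarning)
            try:
                info["getdefaultlocale"] = getdefault()
            except ValueError as exc:
                info["getdefaultlocale_error"] = str(exc)

    if sys.platform == "win32":
        import ctypes

        # Process (ANSI) code page, as returned by WinAPI GetACP().
        info["GetACP"] = ctypes.windll.kernel32.GetACP()

    return info


result = describe_default_locale()
for key, value in result.items():
    print(f"{key}: {value}")
```

On Windows, the encoding in the getdefaultlocale() result would be expected to match the GetACP() value (as "cp" plus the number), independent of whatever setlocale() currently reports.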

> Given the next paragraph describing 'C' as a non-standard language 
> code, I would have expected ('C',None), but it is as it is.

The documentation is unclear. Locale normalization handles the common cases, for better or worse. "C.ASCII" maps to "C", which is parsed as (None, None). "C.UTF8" maps to "en_US.UTF-8", and "C.ISO88591" maps to "en_US.ISO8859-1". Other encodings combined with the "C" locale have no alias, in which case "C" is returned as the language code, even though it's not a valid RFC 1766 code.
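The alias handling can be inspected with locale.normalize(). The sketch below prints the mapping for a few "C" variants; "C.CP1252" is just my example of an encoding with no alias. No expected output is listed because the alias table has changed between Python versions.

```python
import locale

# Locale names to run through the alias table. The last one is an
# arbitrary example of a "C" locale paired with an unaliased encoding.
names = ["C", "C.ASCII", "C.UTF8", "C.ISO88591", "C.CP1252"]

normalized = {name: locale.normalize(name) for name in names}

for name, result in normalized.items():
    print(f"{name!r} -> {result!r}")
```

getlocale() then splits the normalized name at the "." into a (language code, encoding) pair, which is how "C" can end up as the language code.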
History
Date                 User     Action  Args
2020-11-17 00:14:37  eryksun  set     recipients: + eryksun, terry.reedy, eric.araujo, docs@python, alexis, feth, iritkatriel
2020-11-17 00:14:37  eryksun  set     messageid: <1605572077.53.0.640080368875.issue12726@roundup.psfhosted.org>
2020-11-17 00:14:37  eryksun  link    issue12726 messages
2020-11-17 00:14:36  eryksun  create