classification
Title: explain that locale.getlocale() does not read system's locales
Type:
Stage: resolved
Components: Documentation
Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To: alexis
Nosy List: alexis, docs@python, eric.araujo, eryksun, feth, iritkatriel, terry.reedy
Priority: normal
Keywords:

Created on 2011-08-11 09:18 by alexis, last changed 2020-11-17 06:08 by terry.reedy. This issue is now closed.

Messages (5)
msg141897 - Author: Alexis Metaireau (alexis) * (Python triager) Date: 2011-08-11 09:18
The documentation for locale.getlocale() doesn't mention that the returned locale isn't read from the system locale. As a result, it seemed strange to get (None, None) back from locale.getlocale().

Since this appears to be the expected behaviour, it would be useful to state it explicitly in the documentation.

I'm happy to write a patch and apply it.

This issue is related to #6203, but does not supersede it (the two conversations are discussing two different things).
msg141986 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-08-12 18:58
Our docs generally explain behavior without explaining why. Hence the title change.

'Returns the current setting for the given locale category' seems pretty clear that it returns the current program setting rather than the default system setting. However, 'program' could be added to be clearer.

The previous discussion for locale.getdefaultlocale makes it clear that the starting program locale is (should be) the "portable 'C' locale". I presume you are saying that in this locale, the setting for the default LC_CTYPE category is (None,None). However, this appears to currently only be true for 2.7. So I suppose we could add for 2.7 "In the starting 'C' locale, the LC_CTYPE setting is (None,None)." (Given the next paragraph describing 'C' as a non-standard language code, I would have expected ('C',None), but it is as it is.)
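The (None, None) result discussed above can be reproduced on purpose. The following is a minimal sketch, not part of the original thread: the helper names are hypothetical, and LC_NUMERIC is used only because Python's startup code does not normally touch it. It sets one category to "C", reads it back with getlocale(), then does the same after adopting the user's environment with an empty locale string.

```python
import locale


def getlocale_in_c_locale(category=locale.LC_NUMERIC):
    """Return getlocale(category) while that category is set to "C"."""
    saved = locale.setlocale(category)      # query only: current raw setting
    try:
        locale.setlocale(category, "C")     # the portable "C" locale
        return locale.getlocale(category)
    finally:
        locale.setlocale(category, saved)   # restore the previous setting


def getlocale_after_adopting_environment(category=locale.LC_NUMERIC):
    """Return getlocale(category) after setlocale(category, ""), or None on failure."""
    saved = locale.setlocale(category)
    try:
        locale.setlocale(category, "")      # adopt the user's environment
        return locale.getlocale(category)
    except (locale.Error, ValueError):
        # The environment may name an unsupported or unparsable locale.
        return None
    finally:
        locale.setlocale(category, saved)


print(getlocale_in_c_locale())                  # (None, None)
print(getlocale_after_adopting_environment())   # depends on the environment
```

The second result depends entirely on the machine's configuration, so no expected value is given for it.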

Reading #6203, something different is needed for 3.2 and something else again might be needed for 3.3 depending on what is or is not done.
msg381135 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-11-16 18:08
I tried "import locale; locale.getlocale()" on macOS and Windows (3.10) and Linux (3.7), and in all cases I got non-None values. Can we close this as out of date?
msg381197 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-11-17 00:14
> I tried "import locale; locale.getlocale()" on macOS and 
> windows (3.10) and linux (3.7) and in all cases I got 
> non-None values.  

In Windows, starting with Python 3.8, Python sets the LC_CTYPE locale to the user (not system) default locale instead of the CRT's initial "C" locale. (This is possibly an unintended consequence of redesigning the interpreter startup code, but what's done is done.) The same has been implemented in POSIX going back to Python 3.1. It's not a significant change for the core interpreter and standard library, which do not use the LC_CTYPE encoding for much in Windows, but it might affect third-party code. Embedding applications can use an isolated configuration that doesn't modify LC_CTYPE.
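One way to observe what the paragraph above describes is to query the raw setting of each category without changing anything. This is only a sketch: the CATEGORIES mapping and the helper name are illustrative, and the output varies by platform, Python version, and environment. Calling locale.setlocale() with no locale argument only queries the current value.

```python
import locale

# Categories available on all supported platforms (LC_MESSAGES is absent on Windows).
CATEGORIES = {
    "LC_CTYPE": locale.LC_CTYPE,
    "LC_NUMERIC": locale.LC_NUMERIC,
    "LC_TIME": locale.LC_TIME,
    "LC_COLLATE": locale.LC_COLLATE,
    "LC_MONETARY": locale.LC_MONETARY,
}


def raw_locale_settings():
    """Return the raw C-level locale name for each category (query only)."""
    return {name: locale.setlocale(cat) for name, cat in CATEGORIES.items()}


for name, value in raw_locale_settings().items():
    print(f"{name}: {value!r}")
```

In a freshly started interpreter that has not called setlocale() itself, one would expect LC_CTYPE to differ from the other categories if the startup behaviour described above applies.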

locale.getdefaultlocale() is not based on C setlocale() in Windows. It returns the language and region of the user locale from WinAPI GetLocaleInfo() paired with the process code page from WinAPI GetACP(). The latter is generally the same as the system code page, but possibly not in Windows 10 if the application manifest sets the process "activeCodePage" to UTF-8. (python.exe as distributed doesn't use the "activeCodePage" setting in its manifest, but an embedding application might.)
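To compare the two functions side by side, a guarded call along these lines may help. It is a sketch with a hypothetical helper name. It assumes only that locale.getdefaultlocale() has been deprecated since Python 3.11 and may be missing in later versions, so the lookup uses getattr() and the deprecation warning is suppressed.

```python
import locale
import warnings


def default_locale_or_none():
    """Return locale.getdefaultlocale(), or None if unavailable or unparsable."""
    func = getattr(locale, "getdefaultlocale", None)
    if func is None:
        return None  # removed in this Python version
    with warnings.catch_warnings():
        # Deprecated since Python 3.11; silence the warning for this demo.
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            return func()
        except ValueError:
            return None  # the locale name could not be parsed


def current_locale_or_none(category=locale.LC_CTYPE):
    """Return locale.getlocale(category), or None if the name cannot be parsed."""
    try:
        return locale.getlocale(category)
    except ValueError:
        return None


print("getdefaultlocale():", default_locale_or_none())
print("getlocale():       ", current_locale_or_none())
```

The two results need not agree, since only getlocale() reflects the program's current C locale setting.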

> Given the next paragraph describing 'C' as a non-standard language 
> code, I would have expected ('C',None), but it is as it is.

The documentation is unclear. Locale normalization handles the common cases, for better or worse. "C.ASCII" maps to "C", which is parsed as (None, None). "C.UTF8" maps to "en_US.UTF-8", and "C.ISO88591" maps to "en_US.ISO8859-1". Other encodings combined with the "C" locale have no alias, in which case "C" is returned as the language code, even though it's not a valid RFC 1766 code.
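The alias mappings mentioned above can be checked with locale.normalize(), which is the public function behind this normalization. This is a small sketch. The alias table has changed between Python versions, so the results for the "C.<encoding>" names may not match the mappings quoted in this message, and no expected output is given for them.

```python
import locale

# Locale names taken from the message above.
NAMES = ["C", "C.ASCII", "C.UTF8", "C.ISO88591"]


def normalized(names=NAMES):
    """Map each locale name to the result of locale.normalize()."""
    return {name: locale.normalize(name) for name in names}


for name, result in normalized().items():
    print(f"{name!r} -> {result!r}")
```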
msg381206 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-11-17 06:08
locale.getlocale(category=LC_CTYPE)

    Returns the current setting for the given locale category as sequence containing language code, encoding. category may be one of the LC_* values except LC_ALL. It defaults to LC_CTYPE.

    Except for the code 'C', the language code corresponds to RFC 1766. language code and encoding may be None if their values cannot be determined.
---
The non-standard 'C' language code is documented. I am not sure that 'current program setting' is an improvement, especially if details have changed. So closing as 'good enough'.
History
Date | User | Action | Args
2020-11-17 06:08:03 | terry.reedy | set | status: open -> closed; resolution: works for me; messages: + msg381206; stage: needs patch -> resolved
2020-11-17 00:14:37 | eryksun | set | status: pending -> open; nosy: + eryksun; messages: + msg381197
2020-11-16 18:08:42 | iritkatriel | set | status: open -> pending; nosy: + iritkatriel; messages: + msg381135
2011-08-12 18:58:24 | terry.reedy | set | nosy: + terry.reedy; messages: + msg141986; title: explain why locale.getlocale() does not read system's locales -> explain that locale.getlocale() does not read system's locales
2011-08-12 17:56:16 | eric.araujo | set | nosy: + eric.araujo, docs@python; stage: needs patch; versions: + Python 2.7, Python 3.2, Python 3.3
2011-08-11 09:18:55 | alexis | create |