classification
Title: Deprecate locale.getdefaultlocale() function
Type:
Stage: resolved
Components: Library (Lib)
Versions: Python 3.11
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, lemburg, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2022-02-06 17:33 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description
cal_locale.py vstinner, 2022-02-06 18:24
Pull Requests
URL Status Linked
PR 31166 merged vstinner, 2022-02-06 18:14
PR 31167 merged vstinner, 2022-02-06 18:30
PR 31168 closed vstinner, 2022-02-06 18:41
PR 31206 merged vstinner, 2022-02-08 00:15
PR 31214 merged vstinner, 2022-02-08 14:39
PR 31218 merged vstinner, 2022-02-08 17:08
Messages (19)
msg412647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 17:33
The locale.getdefaultlocale() function only relies on environment variables. At Python startup, Python calls setlocale() to set the LC_CTYPE locale to the user preferred encoding.

Since Python 3.7, if the LC_CTYPE locale is "C" or "POSIX", PEP 538 sets the LC_CTYPE locale to a UTF-8 variant if available, and PEP 540 ignores the locale and forces the usage of the UTF-8 encoding. The *effective* encoding used by Python is inconsistent with environment variables.

Moreover, if setlocale() is called to set the LC_CTYPE locale to a locale different than the user locale, again, environment variables are inconsistent with the effective locale.

In these cases, the result of locale.getdefaultlocale() is not the expected locale, which can lead to mojibake and other issues.

For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.
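
As an illustration of the point above, here is a minimal sketch (not part of the original report) that describes the locale Python is effectively using, as opposed to what the environment variables say. The helper name effective_ctype_info() is hypothetical; it only relies on locale.setlocale() in query mode, locale.getpreferredencoding(False) and sys.flags.utf8_mode.

```python
import locale
import sys


def effective_ctype_info():
    """Describe the locale Python is actually using (hypothetical helper)."""
    return {
        # setlocale(category, None) only queries; it does not change the locale.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE, None),
        # False: do not call setlocale() as a side effect.
        "preferred_encoding": locale.getpreferredencoding(False),
        # True when the Python UTF-8 Mode (PEP 540) is enabled.
        "utf8_mode": bool(sys.flags.utf8_mode),
    }


if __name__ == "__main__":
    print(effective_ctype_info())
```

On common platforms none of these calls read the environment variables again, so changing os.environ["LANG"] after startup should leave the result unchanged. That gap is where a lookup based only on environment variables can disagree with the effective locale.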

For background on these issues, see these recent issues:

* bpo-43552
* bpo-43557
msg412652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 18:21
cal_locale.py: Test calendar.LocaleTextCalendar() default locale, manual test for GH-31166.
msg412664 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 20:50
New changeset 04dd60e50cd3da48fd19cdab4c0e4cc600d6af30 by Victor Stinner in branch 'main':
bpo-46659: Update the test on the mbcs codec alias (GH-31168)
https://github.com/python/cpython/commit/04dd60e50cd3da48fd19cdab4c0e4cc600d6af30
msg412666 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 20:52
New changeset 06b8f1615b09099fae5c5393334b8716a4144d20 by Victor Stinner in branch 'main':
bpo-46659: test.support avoids locale.getdefaultlocale() (GH-31167)
https://github.com/python/cpython/commit/06b8f1615b09099fae5c5393334b8716a4144d20
msg412667 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2022-02-06 21:13
> For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.

Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.
msg412668 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 21:15
> Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.

Yeah, I read this issue. But these things are too complicated :-) I prefer to move step by step.

Once locale.getencoding() (or a similar function) is added, we can update the deprecation message.

I hope to be able to deprecate getdefaultlocale() and to add such a new function in Python 3.11.
msg412687 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-06 23:18
> New changeset 04dd60e50cd3da48fd19cdab4c0e4cc600d6af30 by Victor Stinner in branch 'main':
> bpo-46659: Update the test on the mbcs codec alias (GH-31168)

This change is not correct, I created bpo-46668 to fix it.
msg412800 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-07 23:24
New changeset 7a0486eaa98083e0407ff491872db6d7a0da2635 by Victor Stinner in branch 'main':
bpo-46659: calendar uses locale.getlocale() (GH-31166)
https://github.com/python/cpython/commit/7a0486eaa98083e0407ff491872db6d7a0da2635
msg412819 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-02-08 08:36
getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying a list of environment variables to look up. How could this use case be covered with getlocale()?
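
To make the use case concrete, here is a rough sketch of an environment-variable lookup with a caller-supplied search list. The helper first_locale_from_env() is hypothetical and simplified: it returns the raw variable value without normalizing or parsing it, and it is not an API proposed in this issue. The default search order follows the GNU gettext order documented for getdefaultlocale().

```python
import os

# Search order documented for locale.getdefaultlocale() (GNU gettext order).
DEFAULT_ENVVARS = ("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")


def first_locale_from_env(envvars=DEFAULT_ENVVARS):
    """Return the first non-empty locale name found in envvars, or None.

    Hypothetical helper: the value is returned as-is, with no validation
    against the locales actually installed on the system.
    """
    for name in envvars:
        value = os.environ.get(name)
        if not value:
            continue
        if name == "LANGUAGE":
            # LANGUAGE may hold a colon-separated priority list; keep the first.
            value = value.split(":")[0]
        return value
    return None
```

Such a lookup only reports what the variables contain; it says nothing about the locale currently set with setlocale().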
msg412825 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-02-08 10:10
> getdefaultlocale() falls back to LANG and LANGUAGE.

_Py_SetLocaleFromEnv(LC_CTYPE) (e.g. setlocale(LC_CTYPE, "")) gets called at startup, except for the isolated configuration [1].

I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8":

3.8:

    >>> locale.getlocale(locale.LC_TIME)
    (None, None)
    >>> locale.getlocale(locale.LC_CTYPE)
    ('ru_RU', 'UTF-8')
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    '  Понедельник  '

3.11.0a5+:

    >>> locale.getlocale(locale.LC_TIME)
    (None, None)
    >>> locale.getlocale(locale.LC_CTYPE)
    ('ru_RU', 'UTF-8')
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    '     Monday    '
    >>> locale.setlocale(locale.LC_TIME, '')
    'ru_RU.utf8'
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    '  Понедельник  '

---

[1] https://docs.python.org/3/c-api/init_config.html?#isolated-configuration
msg412826 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-08 10:20
Serhiy: "getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying a list of environment variables to look up. How could this use case be covered with getlocale()?"

What's your use case to use env vars rather than the current LC_CTYPE locale?

My concern is that once setlocale() has been called, the current LC_CTYPE locale can be inconsistent with the environment variables, and you can get mojibake and other issues.

See for example:
https://bugs.python.org/issue43552#msg389069

Marc-Andre Lemburg wants to deprecate it:
https://bugs.python.org/issue43552#msg389076
msg412827 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-08 10:26
> I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8": (...)

Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.

See also my example in the PR:
https://github.com/python/cpython/pull/31166#issuecomment-1030887394
msg412829 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-02-08 10:58
> Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.

Since Locale*Calendar is documented as not being thread safe, __init__() could get the real default via setlocale(LC_TIME, "") when locale=None and the current LC_TIME is "C". Restore it back to "C" after getting the default. That should usually match the behavior from previous versions that called getdefaultlocale(). In cases where it differs, it's fixing a bug because the default LC_TIME is the correct default.
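
The idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: the helper name default_time_locale() is hypothetical, and it treats both "C" and "POSIX" as "not configured".

```python
import locale


def default_time_locale():
    """Return the LC_TIME locale name to use as a default (hypothetical helper).

    If LC_TIME is still "C" or "POSIX", temporarily switch to the user's
    preferred locale to read its name, then restore the previous locale.
    """
    # setlocale(category, None) only queries the current locale.
    current = locale.setlocale(locale.LC_TIME, None)
    if current not in ("C", "POSIX"):
        return current
    try:
        # "" selects the user's preferred locale; the new name is returned.
        return locale.setlocale(locale.LC_TIME, "")
    except locale.Error:
        # The preferred locale is not available: keep the current one.
        return current
    finally:
        # Always restore the locale that was active on entry.
        locale.setlocale(locale.LC_TIME, current)
```

Because setlocale() changes process-wide state, this approach is not thread safe, which matches the caveat already documented for the Locale*Calendar classes.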
msg412842 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-08 14:40
Eryk: I created GH-31214 which uses the user preferred locale if the current LC_TIME locale is "C" or "POSIX".

Moreover, it no longer gets the current locale when the class is created. If locale=locale is passed, just use the current LC_TIME (or the user preferred locale if the locale is "C" or "POSIX").
msg413744 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-22 21:04
New changeset ccbe8045faf6e63d36229ea4e1b9298572cda126 by Victor Stinner in branch 'main':
bpo-46659: Fix the MBCS codec alias on Windows (GH-31218)
https://github.com/python/cpython/commit/ccbe8045faf6e63d36229ea4e1b9298572cda126
msg413745 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-22 21:06
New changeset b899126094731bc49fecb61f2c1b7557d74ca839 by Victor Stinner in branch 'main':
bpo-46659: Deprecate locale.getdefaultlocale() (GH-31206)
https://github.com/python/cpython/commit/b899126094731bc49fecb61f2c1b7557d74ca839
msg413907 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-24 13:29
New changeset 4fccf910738d1442852cb900747e6dccb8fe03ef by Victor Stinner in branch 'main':
bpo-46659: Enhance LocaleTextCalendar for C locale (GH-31214)
https://github.com/python/cpython/commit/4fccf910738d1442852cb900747e6dccb8fe03ef
msg413910 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-24 13:41
locale.getdefaultlocale() is now deprecated.

calendar now uses locale.setlocale() instead of locale.getdefaultlocale().

The ANSI code page alias to MBCS now has better tests and better comments.

Thanks Eryk Sun for your very useful feedback!
msg413915 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2022-02-24 14:53
Thanks, Victor.
History
Date User Action Args
2022-04-11 14:59:55 admin set github: 90817
2022-02-24 14:53:20 lemburg set messages: + msg413915
2022-02-24 13:41:35 vstinner set status: open -> closed; resolution: fixed; messages: + msg413910; stage: patch review -> resolved
2022-02-24 13:29:36 vstinner set messages: + msg413907
2022-02-22 21:06:49 vstinner set messages: + msg413745
2022-02-22 21:04:20 vstinner set messages: + msg413744
2022-02-08 17:08:16 vstinner set pull_requests: + pull_request29388
2022-02-08 14:40:54 vstinner set messages: + msg412842
2022-02-08 14:39:02 vstinner set pull_requests: + pull_request29384
2022-02-08 10:58:48 eryksun set messages: + msg412829
2022-02-08 10:26:06 vstinner set messages: + msg412827
2022-02-08 10:20:19 vstinner set messages: + msg412826
2022-02-08 10:10:09 eryksun set nosy: + eryksun; messages: + msg412825
2022-02-08 08:36:30 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg412819
2022-02-08 00:15:41 vstinner set pull_requests: + pull_request29377
2022-02-07 23:24:43 vstinner set messages: + msg412800
2022-02-06 23:18:59 vstinner set messages: + msg412687
2022-02-06 21:15:22 vstinner set messages: + msg412668
2022-02-06 21:13:32 lemburg set nosy: + lemburg; messages: + msg412667
2022-02-06 20:52:04 vstinner set messages: + msg412666
2022-02-06 20:50:17 vstinner set messages: + msg412664
2022-02-06 18:41:46 vstinner set pull_requests: + pull_request29341
2022-02-06 18:30:52 vstinner set pull_requests: + pull_request29340
2022-02-06 18:24:05 vstinner set files: + cal_locale.py
2022-02-06 18:24:00 vstinner set files: - cal_locale.py
2022-02-06 18:21:24 vstinner set files: + cal_locale.py; messages: + msg412652
2022-02-06 18:14:46 vstinner set keywords: + patch; stage: patch review; pull_requests: + pull_request29339
2022-02-06 17:33:14 vstinner create