msg412647 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 17:33 |
The locale.getdefaultlocale() function only relies on environment variables. At Python startup, Python calls setlocale() to set the LC_CTYPE locale to the user preferred encoding.
Since Python 3.7, if the LC_CTYPE locale is "C" or "POSIX", PEP 538 sets the LC_CTYPE locale to a UTF-8 variant if available, and PEP 540 ignores the locale and forces the usage of the UTF-8 encoding. The *effective* encoding used by Python is inconsistent with environment variables.
Moreover, if setlocale() is called to set the LC_CTYPE locale to a locale different than the user locale, again, environment variables are inconsistent with the effective locale.
In these cases, the result of locale.getdefaultlocale() is not the expected locale, which can lead to mojibake and other issues.
For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.
For the background on these issues, see these recent issues:
* bpo-43552
* bpo-43557
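The mismatch described above can be sketched as follows. This is a minimal illustration, not the code of locale.getdefaultlocale() itself: the helper names env_locale_string() and effective_ctype_locale() are hypothetical, and the environment lookup is simplified (it does not parse or normalize the value it finds).

```python
import locale
import os


def env_locale_string(envvars=("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")):
    """Hypothetical helper: return the first non-empty locale setting
    found in the environment, or None. Simplified: no parsing."""
    for name in envvars:
        value = os.environ.get(name)
        if value:
            return value
    return None


def effective_ctype_locale():
    """Hypothetical helper: return the LC_CTYPE locale currently in effect.

    Calling setlocale() without a locale argument only queries the
    current setting; it does not change it.
    """
    return locale.setlocale(locale.LC_CTYPE)


if __name__ == "__main__":
    print("environment says:", env_locale_string())
    print("effective LC_CTYPE:", effective_ctype_locale())
    # After an explicit setlocale() call, the environment is unchanged
    # but the effective locale is different.
    locale.setlocale(locale.LC_CTYPE, "C")
    print("environment says:", env_locale_string())
    print("effective LC_CTYPE:", effective_ctype_locale())
```

Once setlocale() has been called with an explicit locale, the two helpers can disagree, which is the inconsistency this issue is about.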
|
msg412652 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 18:21 |
cal_locale.py: Test calendar.LocaleTextCalendar() default locale, manual test for GH-31166.
|
msg412664 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 20:50 |
New changeset 04dd60e50cd3da48fd19cdab4c0e4cc600d6af30 by Victor Stinner in branch 'main':
bpo-46659: Update the test on the mbcs codec alias (GH-31168)
https://github.com/python/cpython/commit/04dd60e50cd3da48fd19cdab4c0e4cc600d6af30
|
msg412666 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 20:52 |
New changeset 06b8f1615b09099fae5c5393334b8716a4144d20 by Victor Stinner in branch 'main':
bpo-46659: test.support avoids locale.getdefaultlocale() (GH-31167)
https://github.com/python/cpython/commit/06b8f1615b09099fae5c5393334b8716a4144d20
|
msg412667 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2022-02-06 21:13 |
> For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.
Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.
|
msg412668 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 21:15 |
> Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.
Yeah, I read this issue. But these things are too complicated :-) I prefer to move step by step.
Once locale.getencoding() (or a similar function) is added, we can update the deprecation message.
I hope to be able to deprecate getdefaultlocale() and to add such a new function in Python 3.11.
|
msg412687 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-06 23:18 |
> New changeset 04dd60e50cd3da48fd19cdab4c0e4cc600d6af30 by Victor Stinner in branch 'main':
> bpo-46659: Update the test on the mbcs codec alias (GH-31168)
This change is not correct, I created bpo-46668 to fix it.
|
msg412800 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-07 23:24 |
New changeset 7a0486eaa98083e0407ff491872db6d7a0da2635 by Victor Stinner in branch 'main':
bpo-46659: calendar uses locale.getlocale() (GH-31166)
https://github.com/python/cpython/commit/7a0486eaa98083e0407ff491872db6d7a0da2635
|
msg412819 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2022-02-08 08:36 |
getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying the list of environment variables to look up. How could this use case be covered with getlocale()?
|
msg412825 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2022-02-08 10:10 |
> getdefaultlocale() falls back to LANG and LANGUAGE.
_Py_SetLocaleFromEnv(LC_CTYPE) (e.g. setlocale(LC_CTYPE, "")) gets called at startup, except for the isolated configuration [1].
I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8":
3.8:
    >>> locale.getlocale(locale.LC_TIME)
    (None, None)
    >>> locale.getlocale(locale.LC_CTYPE)
    ('ru_RU', 'UTF-8')
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    ' Понедельник '
3.11.0a5+:
    >>> locale.getlocale(locale.LC_TIME)
    (None, None)
    >>> locale.getlocale(locale.LC_CTYPE)
    ('ru_RU', 'UTF-8')
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    ' Monday '
    >>> locale.setlocale(locale.LC_TIME, '')
    'ru_RU.utf8'
    >>> cal = calendar.LocaleTextCalendar()
    >>> cal.formatweekday(0, 15)
    ' Понедельник '
---
[1] https://docs.python.org/3/c-api/init_config.html?#isolated-configuration
|
msg412826 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-08 10:20 |
Serhiy: "getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying the list of environment variables to look up. How could this use case be covered with getlocale()?"
What's your use case for using env vars rather than the current LC_CTYPE locale?
My concern is that once setlocale() has been called, the current LC_CTYPE locale is inconsistent with the environment variables, and you can get mojibake and other issues.
See for example:
https://bugs.python.org/issue43552#msg389069
Marc-Andre Lemburg wants to deprecate it:
https://bugs.python.org/issue43552#msg389076
|
msg412827 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-08 10:26 |
> I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8": (...)
Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.
See also my example in the PR:
https://github.com/python/cpython/pull/31166#issuecomment-1030887394
|
msg412829 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2022-02-08 10:58 |
> Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.
Since Locale*Calendar is documented as not being thread safe, __init__() could get the real default via setlocale(LC_TIME, "") when locale=None and the current LC_TIME is "C". Restore it back to "C" after getting the default. That should usually match the behavior from previous versions that called getdefaultlocale(). In cases where it differs, it's fixing a bug because the default LC_TIME is the correct default.
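The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was merged into calendar: default_time_locale() is a hypothetical helper name, and, like Locale*Calendar itself, it is not thread safe because it temporarily changes the process-wide LC_TIME locale.

```python
import locale


def default_time_locale():
    """Hypothetical helper: return the LC_TIME locale name to use
    when the caller did not pass an explicit locale."""
    # Query the current LC_TIME locale without changing it.
    current = locale.setlocale(locale.LC_TIME)
    if current not in ("C", "POSIX"):
        return current
    try:
        # Temporarily switch to the user preferred locale to learn its name.
        return locale.setlocale(locale.LC_TIME, "")
    except locale.Error:
        # The environment names a locale that is not available.
        return current
    finally:
        # Always restore the locale that was in effect on entry.
        locale.setlocale(locale.LC_TIME, current)


if __name__ == "__main__":
    print(default_time_locale())
```

The finally clause restores the original LC_TIME setting whether or not the lookup succeeds, so callers observe no lasting change to the locale.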
|
msg412842 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-08 14:40 |
Eryk: I created GH-31214 which uses the user preferred locale if the current LC_TIME locale is "C" or "POSIX".
Moreover, it no longer gets the current locale when the class is created. If locale=locale is passed, just use the current LC_TIME (or the user preferred locale if the current locale is "C" or "POSIX").
|
msg413744 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-22 21:04 |
New changeset ccbe8045faf6e63d36229ea4e1b9298572cda126 by Victor Stinner in branch 'main':
bpo-46659: Fix the MBCS codec alias on Windows (GH-31218)
https://github.com/python/cpython/commit/ccbe8045faf6e63d36229ea4e1b9298572cda126
|
msg413745 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-22 21:06 |
New changeset b899126094731bc49fecb61f2c1b7557d74ca839 by Victor Stinner in branch 'main':
bpo-46659: Deprecate locale.getdefaultlocale() (GH-31206)
https://github.com/python/cpython/commit/b899126094731bc49fecb61f2c1b7557d74ca839
|
msg413907 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-24 13:29 |
New changeset 4fccf910738d1442852cb900747e6dccb8fe03ef by Victor Stinner in branch 'main':
bpo-46659: Enhance LocaleTextCalendar for C locale (GH-31214)
https://github.com/python/cpython/commit/4fccf910738d1442852cb900747e6dccb8fe03ef
|
msg413910 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-24 13:41 |
locale.getdefaultlocale() is now deprecated.
calendar now uses locale.setlocale() instead of locale.getdefaultlocale().
The ANSI code page alias to MBCS now has better tests and better comments.
Thanks Eryk Sun for your very useful feedback!
|
msg413915 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2022-02-24 14:53 |
Thanks, Victor.
|
|
Date | User | Action | Args |
2022-04-11 14:59:55 | admin | set | github: 90817 |
2022-02-24 14:53:20 | lemburg | set | messages: + msg413915 |
2022-02-24 13:41:35 | vstinner | set | status: open -> closed; resolution: fixed; messages: + msg413910; stage: patch review -> resolved |
2022-02-24 13:29:36 | vstinner | set | messages: + msg413907 |
2022-02-22 21:06:49 | vstinner | set | messages: + msg413745 |
2022-02-22 21:04:20 | vstinner | set | messages: + msg413744 |
2022-02-08 17:08:16 | vstinner | set | pull_requests: + pull_request29388 |
2022-02-08 14:40:54 | vstinner | set | messages: + msg412842 |
2022-02-08 14:39:02 | vstinner | set | pull_requests: + pull_request29384 |
2022-02-08 10:58:48 | eryksun | set | messages: + msg412829 |
2022-02-08 10:26:06 | vstinner | set | messages: + msg412827 |
2022-02-08 10:20:19 | vstinner | set | messages: + msg412826 |
2022-02-08 10:10:09 | eryksun | set | nosy: + eryksun; messages: + msg412825 |
2022-02-08 08:36:30 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg412819 |
2022-02-08 00:15:41 | vstinner | set | pull_requests: + pull_request29377 |
2022-02-07 23:24:43 | vstinner | set | messages: + msg412800 |
2022-02-06 23:18:59 | vstinner | set | messages: + msg412687 |
2022-02-06 21:15:22 | vstinner | set | messages: + msg412668 |
2022-02-06 21:13:32 | lemburg | set | nosy: + lemburg; messages: + msg412667 |
2022-02-06 20:52:04 | vstinner | set | messages: + msg412666 |
2022-02-06 20:50:17 | vstinner | set | messages: + msg412664 |
2022-02-06 18:41:46 | vstinner | set | pull_requests: + pull_request29341 |
2022-02-06 18:30:52 | vstinner | set | pull_requests: + pull_request29340 |
2022-02-06 18:24:05 | vstinner | set | files: + cal_locale.py |
2022-02-06 18:24:00 | vstinner | set | files: - cal_locale.py |
2022-02-06 18:21:24 | vstinner | set | files: + cal_locale.py; messages: + msg412652 |
2022-02-06 18:14:46 | vstinner | set | keywords: + patch; stage: patch review; pull_requests: + pull_request29339 |
2022-02-06 17:33:14 | vstinner | create | |