Message387205
> Is this the same as the other issue where get_locale is normalising
> the result according to some particular glibc logic and isn't at
> all portable?
If I understand Anders' immediate problem correctly, I think it can be addressed by using setlocale() to save and restore the current locale in the calendar and _strptime modules. This requires no changes to the locale module, let alone the complete rewrite that's required to make getlocale(), normalize(), _parse_localename(), and _build_localename() work reliably in Windows.
It's not just a Windows problem. For example, getlocale() returns only a 2-tuple (language, encoding); it doesn't return the POSIX locale modifier as part of a 3-tuple (language, encoding, modifier). So it can't be used to restore a locale for which the modifier is mandatory.
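To make the 3-tuple idea concrete, here is a minimal sketch that splits a POSIX locale name of the form language[_territory][.codeset][@modifier] into its parts and joins them back. The helpers split_locale_name() and join_locale_name() are hypothetical names used only for illustration. They are not part of the locale module, and they do none of the alias normalization that locale.normalize() attempts.

```python
def split_locale_name(name):
    """Split 'language[_territory][.codeset][@modifier]' into a 3-tuple.

    Hypothetical helper, not part of the locale module.
    Missing parts come back as None.
    """
    # The modifier, if any, follows the last structural '@'.
    base, _, modifier = name.partition('@')
    # The codeset (encoding), if any, follows the '.'.
    language, _, encoding = base.partition('.')
    return (language or None, encoding or None, modifier or None)


def join_locale_name(language, encoding=None, modifier=None):
    """Rebuild a locale name from the 3-tuple produced above."""
    name = language
    if encoding:
        name += '.' + encoding
    if modifier:
        name += '@' + modifier
    return name


parts = split_locale_name('sr_RS.UTF-8@latin')
print(parts)
print(join_locale_name(*parts))
```

With a 3-tuple like this, the "latin" modifier survives a round trip, which is exactly what a 2-tuple cannot guarantee.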
The following example in Linux uses Serbian, a language that's customarily written with both the Cyrillic and Latin alphabets (i.e. BCP 47 / RFC 5646 language tags "sr-Cyrl-RS" and "sr-Latn-RS"). The Latin-based Unix locale name uses a "latin" modifier.
Say the process is currently using the Latin-based locale, but I need the name of a weekday in Cyrillic:
>>> locale.setlocale(locale.LC_TIME, 'sr_RS.UTF-8@latin')
'sr_RS.UTF-8@latin'
>>> c = calendar.LocaleTextCalendar(locale='sr_RS.UTF-8')
>>> c.formatweekday(1, 10)
'  уторак  '
LocaleTextCalendar() temporarily sets LC_TIME to the given locale and then restores the previous locale. But the restore step uses the getlocale() result, which omits the "latin" modifier. So my current LC_TIME locale has now changed to Cyrillic:
>>> locale.setlocale(locale.LC_TIME)
'sr_RS.UTF-8'
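The save-and-restore approach suggested at the top of this message can be sketched as a context manager. This is only an illustration under stated assumptions: temporary_locale() is a hypothetical name, not the code in the calendar or _strptime modules. It relies on the fact that setlocale(category) with no second argument queries the current setting and returns the C runtime's own locale string, modifier included, so that string can be passed back unchanged.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Temporarily switch one locale category (hypothetical helper)."""
    # Query only: returns the runtime's locale string, e.g.
    # 'sr_RS.UTF-8@latin', without parsing or normalizing it.
    saved = locale.setlocale(category)
    try:
        yield locale.setlocale(category, name)
    finally:
        # Restore the exact string that was saved, not a rebuilt
        # (language, encoding) tuple from getlocale().
        locale.setlocale(category, saved)


before = locale.setlocale(locale.LC_TIME)
with temporary_locale(locale.LC_TIME, 'C'):
    print(locale.setlocale(locale.LC_TIME))
print(locale.setlocale(locale.LC_TIME) == before)
```

Note that setlocale() changes process-wide state, so a helper like this is no safer in multithreaded code than the existing behavior.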
A related problem with modifiers affects getdefaultlocale(). For example, in Linux:
$ LC_ALL=sr_RS.UTF-8@latin python -q
>>> import locale, calendar
In this case, the LC_ALL environment variable specifies the Latin-based locale, but getdefaultlocale() omits this important detail:
>>> locale.getdefaultlocale()
('sr_RS', 'UTF-8')
Based on the default locale set in the LC_ALL environment variable, the following is supposed to return the Latin name "utorak", not the Cyrillic name "уторак":
>>> c = calendar.LocaleTextCalendar()
>>> c.formatweekday(1, 10)
'  уторак  '
If I make it call setlocale(LC_TIME, '') instead of getdefaultlocale(), I get the right result:
>>> c = calendar.LocaleTextCalendar(locale='')
>>> c.formatweekday(1, 10)
'  utorak  '
Thus an empty string should be the default locale value in LocaleTextCalendar(), instead of getdefaultlocale().
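Until the default changes, the same effect can be approximated with a small subclass. This is a sketch, not a proposed patch: EnvLocaleTextCalendar is a hypothetical name, and it only changes the default value of the locale argument so that the C runtime resolves the locale from the environment.

```python
import calendar


class EnvLocaleTextCalendar(calendar.LocaleTextCalendar):
    """LocaleTextCalendar whose default locale is '' (hypothetical)."""

    def __init__(self, firstweekday=0, locale=''):
        # An empty string makes setlocale() use the environment
        # (LC_ALL, LC_TIME, LANG), so a modifier such as '@latin'
        # is not lost through getdefaultlocale().
        super().__init__(firstweekday, locale)


c = EnvLocaleTextCalendar()
print(repr(c.locale))
```

Calling c.formatweekday(1, 10) on such an instance then behaves like the locale='' example above.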
Date | User | Action | Args
2021-02-18 09:36:06 | eryksun | set | recipients: + eryksun, lemburg, paul.moore, tim.golden, zach.ware, steve.dower, swt2c, AndersMunch
2021-02-18 09:36:06 | eryksun | set | messageid: <1613640966.73.0.226496859182.issue43115@roundup.psfhosted.org>
2021-02-18 09:36:06 | eryksun | link | issue43115 messages
2021-02-18 09:36:06 | eryksun | create |