Issue20027
Created on 2013-12-19 19:42 by serhiy.storchaka, last changed 2013-12-26 19:24 by serhiy.storchaka. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| locale_devanagari_3.patch | serhiy.storchaka, 2013-12-20 17:07 | review | ||
| Messages (10) | |||
|---|---|---|---|
| msg206636 | Author: Serhiy Storchaka (serhiy.storchaka) |
Date: 2013-12-19 19:42 | |
The locale alias table contains invalid entries for devanagari modifiers (see issue5815):

    'ks_in@devanagari': 'ks_IN@devanagari.UTF-8',
    'sd': 'sd_IN@devanagari.UTF-8',

Here is a patch which fixes aliases for these locales. |
|||
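What the two entries quoted above have in common is that the `@devanagari` modifier comes before the encoding. A minimal sketch of a check for that pattern follows; `has_misplaced_modifier` is a hypothetical helper written for illustration and is not part of the `locale` module or of the patch.

```python
def has_misplaced_modifier(name):
    """Return True if an @-modifier is followed by a '.encoding' part.

    Hypothetical helper for illustration only; not part of the locale module.
    """
    at = name.find('@')
    if at == -1:
        return False
    # A '.' after the '@' means the encoding trails the modifier.
    return name.find('.', at + 1) != -1


# The two alias entries quoted in the report.
suspect = {
    'ks_in@devanagari': 'ks_IN@devanagari.UTF-8',
    'sd': 'sd_IN@devanagari.UTF-8',
}
bad = sorted(key for key, value in suspect.items()
             if has_misplaced_modifier(value))
print(bad)
```

The same check could be run over a full alias mapping to list any other entries with this shape.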
| msg206680 | Author: Marc-Andre Lemburg (lemburg) |
Date: 2013-12-20 13:26 | |
On 20.12.2013 12:19, Serhiy Storchaka wrote:
> Added file: http://bugs.python.org/file33231/locale_devanagari_2.patch

See my message on issue20034: There is some recent activity in glibc related to these. Here's a patch that adds the sd_IN@devanagari locale to glibc:

http://sourceware.org/cgi-bin/cvsweb.cgi/libc/localedata/locales/sd_IN@devanagari.diff?cvsroot=glibc&r1=NONE&r2=1.1

So they will start working once platforms adopt the new glibc versions.

The @-modifier is applied to the locale, not the encoding, because the locale uses a different script, as opposed to limiting itself to part of an encoding. This looks reasonable, even though I'm not sure it conforms to standards.

Since all this is still very much in flux, perhaps we ought to wait a bit more and let the dust settle ?! |
|||
| msg206684 | Author: Serhiy Storchaka (serhiy.storchaka) |
Date: 2013-12-20 14:55 | |
Ubuntu 12.04 supports Kashmiri and Sindhi locales (requires the language-pack-ks-base and language-pack-sd-base packages).

    $ locale -a
    ...
    ks_IN
    ks_IN@devanagari
    ks_IN.utf8
    ks_IN.utf8@devanagari
    ...
    sd_IN
    sd_IN@devanagari
    sd_IN.utf8
    sd_IN.utf8@devanagari
    ...

Current Python doesn't support all of these locales:

    $ LC_ALL=ks_IN ./python -c 'import locale; print(locale.getlocale())'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 556, in getlocale
        return _parse_localename(localename)
      File "/home/serhiy/py/cpython/Lib/locale.py", line 465, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: ks_IN
    $ LC_ALL=ks_IN@devanagari ./python -c 'import locale; print(locale.getlocale())'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 556, in getlocale
        return _parse_localename(localename)
      File "/home/serhiy/py/cpython/Lib/locale.py", line 465, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: ks_IN@devanagari
    $ LC_ALL=ks_IN.utf8 ./python -c 'import locale; print(locale.getlocale())'
    ('ks_IN', 'utf8')
    $ LC_ALL=ks_IN.utf8@devanagari ./python -c 'import locale; print(locale.getlocale())'
    ('ks_IN', 'UTF-8')
    $ LC_ALL=sd_IN ./python -c 'import locale; print(locale.getlocale())'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 556, in getlocale
        return _parse_localename(localename)
      File "/home/serhiy/py/cpython/Lib/locale.py", line 465, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: sd_IN
    $ LC_ALL=sd_IN@devanagari ./python -c 'import locale; print(locale.getlocale())'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 556, in getlocale
        return _parse_localename(localename)
      File "/home/serhiy/py/cpython/Lib/locale.py", line 465, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: sd_IN@devanagari
    $ LC_ALL=sd_IN.utf8 ./python -c 'import locale; print(locale.getlocale())'
    ('sd_IN', 'utf8')
    $ LC_ALL=sd_IN.utf8@devanagari ./python -c 'import locale; print(locale.getlocale())'
    ('sd_IN', 'utf8')

After applying the patch, Python supports all ks_IN and sd_IN locales. |
|||
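The names listed by `locale -a` above all follow the shape `language_TERRITORY[.encoding][@modifier]`. As a rough illustration of that shape only (this is not how `locale._parse_localename` is implemented), a hypothetical helper `split_locale_name` could separate the three parts like this:

```python
def split_locale_name(name):
    """Split 'lang_TERRITORY[.encoding][@modifier]' into three parts.

    Hypothetical helper for illustration; absent parts come back as None.
    It assumes the modifier, if any, is the last component of the name.
    """
    base, _, modifier = name.partition('@')
    language, _, encoding = base.partition('.')
    return language, encoding or None, modifier or None


for name in ['ks_IN', 'ks_IN@devanagari', 'ks_IN.utf8', 'ks_IN.utf8@devanagari']:
    print(name, '->', split_locale_name(name))
```

Under this reading, the four ks_IN names differ only in which optional parts are present.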
| msg206685 | Author: Marc-Andre Lemburg (lemburg) |
Date: 2013-12-20 15:06 | |
On 20.12.2013 15:55, Serhiy Storchaka wrote:
> After applying the patch Python supports all ks_IN and sd_IN locales.

Well, yes, but only because you are removing the @-modifiers. I don't think that's correct, since e.g. the string formatting used for numbers is different with the modifier.

If you keep the modifiers, but move them to the end of the locale string you should get the correct behavior, e.g.

    - 'sd': 'sd_IN@devanagari.UTF-8',
    + 'sd': 'sd_IN.UTF-8@devanagari',

(modulo perhaps the spelling of "UTF-8") |
|||
| msg206687 | Author: Serhiy Storchaka (serhiy.storchaka) |
Date: 2013-12-20 15:24 | |
> Well, yes, but only because you are removing the @-modifiers. I don't
> think that's correct, since e.g. the string formatting used for
> numbers is different with the modifier.

All the @-modifiers except euro are applied to the locale, not the encoding. And Python removes all the @-modifiers, e.g. latin and cyrillic which specify the script.

> If you keep the modifiers, but move them to the end of the locale
> string you should get the correct behavior, e.g.
>
> - 'sd': 'sd_IN@devanagari.UTF-8',
> + 'sd': 'sd_IN.UTF-8@devanagari',
>
> (modulo perhaps the spelling of "UTF-8")

Recent the locale.alias file changes these entities:

    sd: sd_IN.UTF-8
    sd_IN.utf8: sd_IN.UTF-8
    sd@devanagari: sd_IN@devanagari.UTF-8
    sd_IN@devanagari: sd_IN@devanagari.UTF-8
    sd_IN@devanagari.utf8: sd_IN@devanagari.UTF-8 |
|||
| msg206689 | Author: Marc-Andre Lemburg (lemburg) |
Date: 2013-12-20 16:03 | |
On 20.12.2013 16:24, Serhiy Storchaka wrote:
> Serhiy Storchaka added the comment:
>
>> Well, yes, but only because you are removing the @-modifiers. I don't
>> think that's correct, since e.g. the string formatting used for
>> numbers is different with the modifier.
>
> All the @-modifiers except euro are applied to the locale, not the encoding.
> And Python removes all the @-modifiers, e.g. latin and cyrillic which specify
> the script.

That's not quite correct. The modifiers are used to determine the correct mapping, so you'll often find them on the left side, but not necessarily on the right side. There are several cases where the modifiers are kept around, since they have implications on the way numbers or dates are formatted.

For the Indian "devanagari" locales we have to keep them, because the locale formatting of numbers and dates depends on them.

>> If you keep the modifiers, but move them to the end of the locale
>> string you should get the correct behavior, e.g.
>>
>> - 'sd': 'sd_IN@devanagari.UTF-8',
>> + 'sd': 'sd_IN.UTF-8@devanagari',
>>
>> (modulo perhaps the spelling of "UTF-8")
>
> Recent the locale.alias file changes these entities:
>
> sd: sd_IN.UTF-8
> sd_IN.utf8: sd_IN.UTF-8
> sd@devanagari: sd_IN@devanagari.UTF-8
> sd_IN@devanagari: sd_IN@devanagari.UTF-8
> sd_IN@devanagari.utf8: sd_IN@devanagari.UTF-8

I'm not sure I can parse this comment :-)

Looking at issue20034 I think we are saying that the new updated locale.alias file contains these entries:

    sd: sd_IN.UTF-8
    sd_IN.utf8: sd_IN.UTF-8
    sd@devanagari: sd_IN@devanagari.UTF-8
    sd_IN@devanagari: sd_IN@devanagari.UTF-8
    sd_IN@devanagari.utf8: sd_IN@devanagari.UTF-8

So my example is wrong with the new locale.alias file. Instead, sd will map directly to sd_IN.UTF-8.

Still, I think the makelocalealias.py script should correct the non-standard locale names from sd_IN@devanagari.UTF-8 to sd_IN.UTF-8@devanagari in order to match the output of "locale -a". |
|||
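The correction proposed above, rewriting `sd_IN@devanagari.UTF-8` as `sd_IN.UTF-8@devanagari`, can be sketched as a small string transformation. This is an illustrative sketch only, not the code from the attached patch, and `move_modifier_to_end` is a hypothetical name.

```python
def move_modifier_to_end(alias):
    """Rewrite 'lang@modifier.encoding' as 'lang.encoding@modifier'.

    Hypothetical helper for illustration; names that already have the
    modifier last, or that have no modifier, are returned unchanged.
    """
    lang, at, rest = alias.partition('@')
    if not at or '.' not in rest:
        return alias
    modifier, _, encoding = rest.partition('.')
    return lang + '.' + encoding + '@' + modifier


# Alias targets taken from the locale.alias entries quoted in this thread.
for alias in ['sd_IN.UTF-8', 'sd_IN@devanagari.UTF-8']:
    print(alias, '->', move_modifier_to_end(alias))
```

Applying the function twice gives the same result as applying it once, so it would be safe to run over entries that are already in the `locale -a` form.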
| msg206695 | Author: Serhiy Storchaka (serhiy.storchaka) |
Date: 2013-12-20 17:07 | |
Updated patch to tip. The makelocalealias.py script now corrects the non-standard locale names. |
|||
| msg206953 | Author: Serhiy Storchaka (serhiy.storchaka) |
Date: 2013-12-26 18:59 | |
Could you please make a decision about last patch, Marc-Andre? |
|||
| msg206954 | Author: Marc-Andre Lemburg (lemburg) |
Date: 2013-12-26 19:16 | |
On 26.12.2013 19:59, Serhiy Storchaka wrote:
> Could you please make a decision about last patch, Marc-Andre?

Looks good. Thanks, Serhiy. |
|||
| msg206955 | Author: Roundup Robot (python-dev) | Date: 2013-12-26 19:22 | |
New changeset aad582f717da by Serhiy Storchaka in branch '2.7':
Issue #20027: Fixed locale aliases for devanagari locales.
http://hg.python.org/cpython/rev/aad582f717da

New changeset 7615c009e925 by Serhiy Storchaka in branch '3.3':
Issue #20027: Fixed locale aliases for devanagari locales.
http://hg.python.org/cpython/rev/7615c009e925

New changeset fff3f28733b4 by Serhiy Storchaka in branch 'default':
Issue #20027: Fixed locale aliases for devanagari locales.
http://hg.python.org/cpython/rev/fff3f28733b4 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2013-12-26 19:24:04 | serhiy.storchaka | set | status: open -> closed assignee: serhiy.storchaka resolution: fixed stage: patch review -> resolved |
| 2013-12-26 19:22:16 | python-dev | set | nosy: + python-dev messages: + msg206955 |
| 2013-12-26 19:16:21 | lemburg | set | messages: + msg206954 |
| 2013-12-26 18:59:29 | serhiy.storchaka | set | messages: + msg206953 |
| 2013-12-21 19:22:08 | serhiy.storchaka | link | issue20046 dependencies |
| 2013-12-20 17:08:17 | serhiy.storchaka | set | files: - locale_devanagari_2.patch |
| 2013-12-20 17:07:58 | serhiy.storchaka | set | files: + locale_devanagari_3.patch messages: + msg206695 |
| 2013-12-20 16:03:47 | lemburg | set | messages: + msg206689 |
| 2013-12-20 15:24:10 | serhiy.storchaka | set | messages: + msg206687 |
| 2013-12-20 15:06:02 | lemburg | set | messages: + msg206685 |
| 2013-12-20 14:55:29 | serhiy.storchaka | set | messages: + msg206684 |
| 2013-12-20 13:26:32 | lemburg | set | messages: + msg206680 |
| 2013-12-20 11:22:50 | serhiy.storchaka | set | files: - locale_devanagari.patch |
| 2013-12-20 11:19:10 | serhiy.storchaka | set | files: + locale_devanagari_2.patch |
| 2013-12-19 19:49:41 | serhiy.storchaka | set | files: + locale_devanagari.patch |
| 2013-12-19 19:49:09 | serhiy.storchaka | set | files: - locale_aliases.patch |
| 2013-12-19 19:42:08 | serhiy.storchaka | create | |
