classification
Title: locale._parse_localename fails when localename does not contain encoding information
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: lemburg Nosy List: BreamoreBoy, ggenellina, haypo, lemburg, loewis, santhosh.thottingal, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2009-09-12 14:26 by santhosh.thottingal, last changed 2014-08-29 21:28 by terry.reedy. This issue is now closed.

Files
File name Uploaded Description
locale.py-parselocale-patch.diff santhosh.thottingal, 2009-09-12 14:26 Patch to fix _parse_localename failure when localename does not contain encoding information. Patch on python 3.1.1 Lib/locale.py
test_locale.py.diff santhosh.thottingal, 2009-09-17 17:20 Patch for adding testcases for this bug to Lib/test/test_locale.py
Messages (12)
msg92546 - Author: Santhosh Thottingal (santhosh.thottingal) Date: 2009-09-12 14:26
locale._parse_localename fails when the locale name is in xx_YY format.
For example, when the system locale is Malayalam (India), ml_IN, we get
the following result:
>>> locale._parse_localename("ml_IN")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python311/lib/python3.1/locale.py", line 424, in _parse_localename
    raise ValueError('unknown locale: %s' % localename)
ValueError: unknown locale: ml_IN
The expected result is ('ml_IN', None).
For Latin languages, locale.py assumes iso-8859-15 as the encoding
when the locale name does not specify one. For other locales, None
can be returned as the encoding.
The attached patch fixes this.
The result after applying the patch to locale.py:
>>> import locale
>>> locale._parse_localename("ml_IN")
('ml_IN', None)
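The requested behavior can be illustrated with a minimal, standalone sketch. This is not the attached patch and not the code in Lib/locale.py; the function name split_localename is hypothetical, and the sketch ignores the alias table and any @modifier suffix. It only shows returning None for the encoding when the name carries none.

```python
def split_localename(localename):
    """Return (language code, encoding); encoding is None when absent."""
    # Everything before the first '.' is the language code, e.g. 'ml_IN'.
    code, dot, encoding = localename.partition('.')
    if code == 'C':
        # The portable "C" locale has neither a language code nor an encoding.
        return None, None
    return code, (encoding if dot else None)


print(split_localename('ml_IN'))        # ('ml_IN', None)
print(split_localename('en_US.UTF-8'))  # ('en_US', 'UTF-8')
```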
msg92668 - Author: Gabriel Genellina (ggenellina) Date: 2009-09-16 08:42
If you provide a test case, the patch has a greater chance of being
accepted.
msg92783 - Author: Santhosh Thottingal (santhosh.thottingal) Date: 2009-09-17 17:20
Attached the test cases as a patch to Lib/test/test_locale.py.
msg95932 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-12-03 13:56
Assigning to MAL, as this is his code.
msg95937 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-12-03 17:27
The reason this call fails is that there's no locale alias defined for
"ml_IN" in the locale_alias dictionary.

While the patch is probably a good idea, it also hides the missing mapping.

I think a better approach would be to check the locale name for
standards compliance (i.e. xx_YY format) and only then use it as a
fallback solution.

I'll update the locale_alias dictionary to the X.org version 2009-12-08
(unless you know an even more recent version). This includes the missing
'ml_IN' mapping (among a few other additions and updates).
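The approach described above (alias lookup first, then a format check before falling back) might look roughly like the following sketch. The names SAMPLE_ALIASES and parse_with_checked_fallback are hypothetical, the one-entry alias table is only a stand-in for the real locale_alias dictionary, and the regular expression is one assumed reading of "xx_YY format"; none of this is the code that was committed.

```python
import re

# Hypothetical one-entry stand-in for the real locale_alias dictionary.
SAMPLE_ALIASES = {'en_us': 'en_US.ISO8859-1'}

# Assumed reading of "xx_YY": 2-3 lowercase letters, '_', 2 uppercase letters.
_LANG_TERRITORY = re.compile(r'[a-z]{2,3}_[A-Z]{2}')


def parse_with_checked_fallback(localename, aliases=SAMPLE_ALIASES):
    """Return (language code, encoding), preferring the alias table."""
    full = aliases.get(localename.lower())
    if full is not None:
        # Known alias: split the normalized name at the first '.'.
        code, _, encoding = full.partition('.')
        return code, encoding
    if _LANG_TERRITORY.fullmatch(localename):
        # Well-formed but unmapped name: accept it with an unknown encoding.
        return localename, None
    raise ValueError('unknown locale: %s' % localename)


print(parse_with_checked_fallback('en_US'))  # ('en_US', 'ISO8859-1')
print(parse_with_checked_fallback('ml_IN'))  # ('ml_IN', None)
```

With this ordering a malformed name still raises ValueError, so only well-formed names with no alias entry take the fallback path.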
msg99315 - Author: Santhosh Thottingal (santhosh.thottingal) Date: 2010-02-13 13:22
I see that ml_IN has been added to the locale.alias file of X.org.
lemburg, do you think that my patch is still required as a fallback solution in case an xx_YY mapping is not found in locale.alias?
If you can confirm that it is not required, we can close this bug.
msg132878 - Author: Stefan Krah (skrah) * (Python committer) Date: 2011-04-03 20:11
Is there another (authoritative) source for locale aliases apart
from X.org? On Ubuntu Lucid, many aliases for installed locales
are missing:


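import locale

# Read the locale names from Ubuntu's list of locally supported locales.
with open("/var/lib/locales/supported.d/local") as f:
    locale_list = [loc.split()[0] for loc in f
                   if loc.strip() and not loc.startswith('#')]

for loc in locale_list:
    locale.setlocale(locale.LC_ALL, loc)
    try:
        locale.getlocale()
    except ValueError:
        # getlocale() could not parse the name of the locale just set.
        print(loc)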

aa_DJ
aa_ER
aa_ER@saaho
aa_ET
an_ES
ar_IN
ast_ES
ber_DZ
ber_MA
bn_BD
bo_CN
bo_IN
byn_ER
ca_ES@valencia
crh_UA
csb_PL
dv_MV
dz_BT
el_CY
en_AG
en_DK
en_NG
eu_FR
fil_PH
fur_IT
fy_NL
fy_DE
gez_ER
gez_ER@abegede
gez_ET
gez_ET@abegede
ha_NG
hne_IN
hsb_DE
ht_HT
hy_AM
ia
ig_NG
ik_CA
kk_KZ
kk_KZ
ks_IN
ks_IN@devanagari
ku_TR
lg_UG
li_BE
li_NL
mai_IN
mg_MG
ml_IN
mn_MN
my_MM
nan_TW@latin
nds_DE
nds_NL
ne_NP
nl_AW
om_ET
om_KE
or_IN
pa_PK
pap_AN
ps_AF
sa_IN
sc_IT
sd_IN
sd_IN@devanagari
shs_CA
sid_ET
so_DJ
so_ET
so_KE
so_SO
te_IN
ti_ER
ti_ET
tig_ER
tk_TM
tr_CY
tt_RU@iqtelif.UTF-8
ug_CN
wal_ET
wo_SN
yo_NG
zh_SG
msg132928 - Author: Steffen Daode Nurpmeso (sdaoden) Date: 2011-04-04 10:37
Stefan, theoretically this is

        A valid locale description (as understood by S-SYS) is:

                language[_TERRITORY[.CODESET[@Modifier]]]

        where language is indeed a ISO 639 language code (see
        doc/iso639.txt) and _TERRITORY is indeed a ISO 3166 country code
        (see doc/iso3166.txt).
..
        The ISO3166 Maintenance Agency can be found at:
        #       http://www.iso.ch/iso/en/prods-services/iso3166ma/index.html
..
        http://www.loc.gov/standards/iso639-2/

A good UNIX has copies of the files in /usr/share/misc/{iso639,iso3166}.
I may be out-of-date a bit, though.
(And: this is not about Python, of course.)
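As an illustration of the quoted grammar, language[_TERRITORY[.CODESET[@Modifier]]], here is a small sketch that splits a locale description into its parts. The function name split_locale and the exact character classes are assumptions; the sketch follows the strictly nested form as quoted and does not check the codes against the ISO 639 or ISO 3166 lists.

```python
import re

# Strictly nested, as in the quoted grammar: a modifier is only accepted
# after a codeset, and a codeset only after a territory.
_LOCALE_RE = re.compile(
    r'(?P<language>[A-Za-z]{2,3})'
    r'(?:_(?P<territory>[A-Za-z]{2})'
    r'(?:\.(?P<codeset>[^@]+)'
    r'(?:@(?P<modifier>.+))?)?)?'
)


def split_locale(description):
    """Return a dict of the four parts; parts that are absent map to None."""
    match = _LOCALE_RE.fullmatch(description)
    if match is None:
        raise ValueError('not a valid locale description: %r' % description)
    return match.groupdict()


print(split_locale('ml_IN'))
print(split_locale('de_DE.ISO8859-15@euro'))
```

Under this strict reading, a name that carries a modifier but no codeset (for example aa_ER@saaho from the list above) is rejected with ValueError, so real-world locale names are evidently more permissive than the quoted form.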
msg132932 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2011-04-04 10:56
Stefan Krah wrote:
> 
> Stefan Krah <stefan-usenet@bytereef.org> added the comment:
> 
> Is there another (authoritative) source for locale aliases apart
> from X.org? On Ubuntu Lucid, many aliases for installed locales
> are missing:
> 
> f = open("/var/lib/locales/supported.d/local")
> locale_list = [loc.split()[0] for loc in f.readlines() \
>                if not loc.startswith('#')]
> 
> for loc in locale_list:
>     x = locale.setlocale(locale.LC_ALL, loc)
>     try:
>         y = locale.getlocale()
>     except ValueError:
>         print(loc)
> 
> aa_DJ

Hmm, I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/locale.py", line 513, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting

The "local" file you mention only contains "en_US.UTF-8 UTF-8" on
our Ubuntu 10.04.1 default installation.

Have you installed some other package to get support for all those
locales ?
msg132935 - Author: Stefan Krah (skrah) * (Python committer) Date: 2011-04-04 11:07
Marc-Andre Lemburg <report@bugs.python.org> wrote:
> The "local" file you mention only contains "en_US.UTF-8 UTF-8" on
> our Ubuntu 10.04.1 default installation.
> 
> Have you installed some other package to get support for all those
> locales ?

On Ubuntu it is a bit messy:

cp /usr/share/i18n/SUPPORTED /var/lib/locales/supported.d/local
locale-gen

On Debian it should be:

# Select 'all' in the dialog
dpkg-reconfigure locales

Stefan Krah
msg218513 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-05-14 09:32
The code in msg132878 now produces no output on Ubuntu. Maybe this issue is outdated.
msg222874 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-12 19:30
I agree with the sentiment expressed in msg218513 and would close this as "out of date".
History
Date User Action Args
2014-08-29 21:28:52 terry.reedy set status: open -> closed; stage: resolved; resolution: out of date; versions: - Python 2.6, Python 3.1, Python 3.2
2014-07-12 19:30:37 BreamoreBoy set status: pending -> open; nosy: + BreamoreBoy; messages: + msg222874
2014-05-14 09:32:29 serhiy.storchaka set status: open -> pending; nosy: + serhiy.storchaka; messages: + msg218513
2014-05-13 21:48:36 skrah set nosy: - skrah
2011-04-04 11:40:28 sdaoden set nosy: - sdaoden
2011-04-04 11:07:47 skrah set messages: + msg132935; title: locale._parse_localename fails when localename does not contain encoding information -> locale._parse_localename fails when localename does not contain encoding information
2011-04-04 10:56:28 lemburg set messages: + msg132932; title: locale._parse_localename fails when localename does not contain encoding information -> locale._parse_localename fails when localename does not contain encoding information
2011-04-04 10:39:21 haypo set nosy: + haypo
2011-04-04 10:37:26 sdaoden set nosy: + sdaoden; messages: + msg132928
2011-04-03 20:11:27 skrah set nosy: + skrah; messages: + msg132878
2010-02-13 13:22:31 santhosh.thottingal set messages: + msg99315
2010-02-09 17:06:13 brian.curtin set type: crash -> behavior
2009-12-03 17:27:10 lemburg set messages: + msg95937
2009-12-03 13:56:26 loewis set assignee: loewis -> lemburg; messages: + msg95932; nosy: + lemburg
2009-12-03 13:24:26 barry set versions: + Python 2.6, Python 2.7, Python 3.2
2009-09-17 21:27:44 georg.brandl set assignee: loewis; nosy: + loewis
2009-09-17 17:20:22 santhosh.thottingal set files: + test_locale.py.diff; messages: + msg92783
2009-09-16 08:42:19 ggenellina set nosy: + ggenellina; messages: + msg92668
2009-09-12 14:26:57 santhosh.thottingal create