Message 202470 - Python tracker


Author	mfabian
Recipients	mfabian
Date	2013-11-09.08:15:40
Message-id	<1383984940.58.0.958007806047.issue19534@psf.upfronthosting.co.za>
In-reply-to

Content

In locale.py, the comment above “locale_alias = {” says:

# Note that the normalize() function which uses this tables
# removes '_' and '-' characters from the encoding part of the
# locale name before doing the lookup. This saves a lot of
# space in the table.

But in normalize(), this is actually not done:

    # First lookup: fullname (possibly with encoding)
    norm_encoding = encoding.replace('-', '')
    norm_encoding = norm_encoding.replace('_', '')
    lookup_name = langname + '.' + encoding
    code = locale_alias.get(lookup_name, None)

“norm_encoding” holds the encoding part of the locale name with these
replacements applied, but it is then never used in the lookup; the
unmodified “encoding” is used instead.
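The difference can be reproduced outside of locale.py. The following is only a minimal sketch of the first lookup step: TOY_ALIAS, split_name() and the two lookup functions are hypothetical names made up for illustration, and the one-entry table stands in for the much larger real locale_alias table.

```python
# Toy stand-in for locale_alias (assumption: one entry, keyed WITHOUT '-' or '_'
# in the encoding part, as the comment in locale.py describes).
TOY_ALIAS = {'sr_rs.utf8@latin': 'sr_RS.UTF-8@latin'}


def split_name(localename):
    """Lowercase the name and split it into (langname, encoding)."""
    fullname = localename.lower()
    if '.' in fullname:
        langname, encoding = fullname.split('.')[:2]
    else:
        langname, encoding = fullname, ''
    return langname, encoding


def first_lookup_as_quoted(localename, table=TOY_ALIAS):
    """Mirror the quoted code: norm_encoding is computed but not used."""
    langname, encoding = split_name(localename)
    norm_encoding = encoding.replace('-', '')
    norm_encoding = norm_encoding.replace('_', '')
    lookup_name = langname + '.' + encoding  # raw encoding, not norm_encoding
    return table.get(lookup_name, None)


def first_lookup_with_norm(localename, table=TOY_ALIAS):
    """Same step, but the lookup key is built from norm_encoding."""
    langname, encoding = split_name(localename)
    norm_encoding = encoding.replace('-', '')
    norm_encoding = norm_encoding.replace('_', '')
    lookup_name = langname + '.' + norm_encoding
    return table.get(lookup_name, None)


print(first_lookup_as_quoted('sr_RS.UTF-8@latin'))  # key 'sr_rs.utf-8@latin' misses
print(first_lookup_with_norm('sr_RS.UTF-8@latin'))  # key 'sr_rs.utf8@latin' hits
```

With the raw encoding the key keeps its '-' and cannot match a table entry stored without it; with norm_encoding the key matches.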

The patch in http://bugs.python.org/msg202469
fixes that. Using norm_encoding in the lookup, together with adding the alias

+    'sr_rs.utf8@latin':                      'sr_RS.UTF-8@latin',

makes it work for sr_RS.UTF-8@latin. My test program then outputs:

mfabian@ari:~
$ python2 ~/tmp/mike-test.py
ja_JP.UTF-8 -> ja_JP.UTF-8
de_DE.SJIS -> de_DE.SJIS
de_DE.foobar -> de_DE.foobar
sr_RS.UTF-8@latin -> sr_RS.UTF-8@latin
sr_rs@latin -> sr_RS.UTF-8@latin
sr@latin -> sr_RS.UTF-8@latin
sr_yu -> sr_RS.UTF-8@latin
sr_yu.SJIS@devanagari -> sr_RS.sjis_devanagari
sr@foobar -> sr@foobar
sR@foObar -> sR@foObar
sR -> sr_RS.UTF-8
mfabian@ari:~
$ 
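The test program itself is not included in this message. A script along the following lines would produce output in the same "name -> normalized" shape; it is a guess at the structure, not the actual mike-test.py, and NAMES and normalize_all() are hypothetical names. The results depend on the Python version and on whether the patch is applied, so no expected output is given.

```python
import locale

# Locale names taken from the output listed above.
NAMES = [
    'ja_JP.UTF-8',
    'de_DE.SJIS',
    'de_DE.foobar',
    'sr_RS.UTF-8@latin',
    'sr_rs@latin',
    'sr@latin',
    'sr_yu',
    'sr_yu.SJIS@devanagari',
    'sr@foobar',
    'sR@foObar',
    'sR',
]


def normalize_all(names):
    """Return (name, locale.normalize(name)) pairs in input order."""
    return [(name, locale.normalize(name)) for name in names]


for name, normalized in normalize_all(NAMES):
    print('%s -> %s' % (name, normalized))
```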

But note that the normalization of the “sr_yu.SJIS@devanagari”
locale is still weird. Of course, a “sr_yu.SJIS@devanagari” locale
is quite silly and does not exist anyway, but the code in normalize()
does not seem to work as intended.
