classification
Title: locale.py doesn't recognize valid locale setting
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: lemburg Nosy List: childsplay, lemburg, loewis, zgoda
Priority: normal Keywords:

Created on 2004-12-07 20:23 by childsplay, last changed 2004-12-16 12:33 by childsplay. This issue is now closed.

Messages (20)
msg23584 - (view) Author: stas Z (childsplay) Date: 2004-12-07 20:23
stas@mobi:~$ locale
LANG=nb_NO
[...]

stas@mobi:~$ python
Python 2.3.4 (#2, Sep 24 2004, 08:39:09) 
[GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/locale.py", line 346, in getdefaultlocale
    return _parse_localename(localename)
  File "/usr/lib/python2.3/locale.py", line 280, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: nb_NO
>>> 
msg23585 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-12-07 20:25
Logged In: YES 
user_id=21627

Why do you want to use getdefaultlocale()?
msg23586 - (view) Author: Jarek Zgoda (zgoda) Date: 2004-12-07 21:39
Logged In: YES 
user_id=92222

getdefaultlocale() is often used to get the default encoding for
the current system locale.
And if a function is provided in the standard library, why
shouldn't one use it?
msg23587 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-12-07 22:13
Logged In: YES 
user_id=21627

To get the default encoding for the current locale, you
should use locale.getpreferredencoding(). You should not use
getdefaultlocale because it is (IMO) inherently broken, and
should not have been part of the standard library in the
first place.
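
A minimal sketch of the call recommended above, assuming a modern Python 3 interpreter (the thread itself concerns Python 2.3). The `codecs.lookup` step is an illustrative addition, not something proposed in the thread; it only shows how the reported name maps onto Python's own codec registry.

```python
import codecs
import locale

# Ask the locale module which encoding it prefers for text data
# in the current environment.
encoding = locale.getpreferredencoding()

# Resolve the reported name through Python's codec registry to get
# Python's canonical spelling of the same encoding (illustrative only).
codec_name = codecs.lookup(encoding).name

print(encoding, codec_name)
```

The printed values depend entirely on the environment the interpreter runs in.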
msg23588 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-07 23:08
Logged In: YES 
user_id=38388

Of course, I don't agree with you, Martin :-)
locale.getdefaultlocale() does serve a purpose, namely that
of getting the default locale setting. The encoding
information is an often used extension when setting the
locale in the OS environment. If not set, the module
provides common defaults.

The locale "nb_NO" is not known to the module's alias table.
Which locale, language and encoding would that be?
msg23589 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-12-08 07:04
Logged In: YES 
user_id=21627

There is no "default locale setting" in most operating
systems. In what sense is the value of the LANG environment
variable a "default"?
msg23590 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-08 09:05
Logged In: YES 
user_id=38388

LANG (and other similar OS environment variables) defines
the locale the user wishes to see used in the
applications that are started in that environment. See the
setlocale man page for details.

On some OSes such as Windows these settings are stored
differently, which is why the locale module has provisions
for finding these settings (thanks to Fredrik).
msg23591 - (view) Author: stas Z (childsplay) Date: 2004-12-08 12:29
Logged In: YES 
user_id=638376

The reason I use getdefaultlocale() is to get a
platform-independent way of getting the system's locale setting.
The biggest advantage is that on Windows a "Linux-like"
locale is returned, so that I can use the same language
support stuff on all platforms (Win, Linux, OSX).
Besides, what's the problem with adding the missing locale?
msg23592 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-08 13:06
Logged In: YES 
user_id=38388

Please provide some authoritative source which describes the
locale you are using (nb_NO) and the commonly used encoding
for that locale (see the existing dictionary in locale.py).
msg23593 - (view) Author: stas Z (childsplay) Date: 2004-12-08 15:09
Logged In: YES 
user_id=638376

This is what I've put into /python2.3/locale.py:

locale_alias = {....
    .......
    'bokmål':   'nb_NO.ISO8859-1',
    'nb':       'nb_NO.ISO8859-1',
    'nb_no':    'nb_NO.ISO8859-1',
    'nynorsk':  'nn_NO.ISO8859-1',
    'nn':       'nn_NO.ISO8859-1',
    'nn_no':    'nn_NO.ISO8859-1',
    ....
    ....
}

I have tested it on a number of apps and it fixes the problem.
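
To illustrate why a missing alias produces the traceback reported at the top of this issue, here is a rough sketch of an alias-table lookup. It is not the code from locale.py: `ALIASES`, `parse_locale_name` and the single `no_no` entry are hypothetical stand-ins, and the real table and parsing logic are considerably more involved.

```python
# Hypothetical, heavily reduced alias table; the real one in locale.py
# is far larger. Keys are lower-cased locale names.
ALIASES = {
    "no_no": "no_NO.ISO8859-1",
}


def parse_locale_name(name, aliases):
    """Return (language, encoding), or raise ValueError for unknown names."""
    full = aliases.get(name.lower())
    if full is None:
        raise ValueError("unknown locale: %s" % name)
    language, _, encoding = full.partition(".")
    return language, encoding


# Without an entry for nb_NO the lookup fails, as in the original report.
try:
    parse_locale_name("nb_NO", ALIASES)
    failed = False
except ValueError as exc:
    failed = True
    print(exc)

# With the alias proposed above added, the same lookup succeeds.
patched = dict(ALIASES, nb_no="nb_NO.ISO8859-1")
result = parse_locale_name("nb_NO", patched)
print(result)
```

The sketch also makes the maintenance concern visible: every locale name has to be listed explicitly, and the encoding stored next to it is only a guess about what the system actually uses.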
msg23594 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-12-08 21:28
Logged In: YES 
user_id=21627

MAL: that doesn't answer my question, though: In what sense
does getdefaultlocale get a "default" locale? default as
opposed to what custom setting?

childsplay: the problem with adding additional aliases is
a) we can never hope to get a complete list of locales, so
this is a never-ending maintenance problem, and
b) the dictionary might be wrong on some systems. E.g.
sometimes, 'nb' might denote an ISO-8859-15 locale,  or a
UTF-8 locale (e.g. when UTF-8 becomes the standard encoding
on Unix some day). If so, Python will silently compute an
incorrect default - in particular wrt. the encoding of the
"default" locale.
msg23595 - (view) Author: stas Z (childsplay) Date: 2004-12-09 09:27
Logged In: YES 
user_id=638376

I agree that "default" should probably be called "preferred".

@loewis: 
a)
I agree with your point of view, but as a developer I just
want to get the current locale in use, and locale.py serves
that purpose in a platform-independent way. The
"never-ending maintenance problem" is the result of the 
locale horror we all have to live with until there's a final
solution/standard for it.

b)
Agreed, I now understand your "getdefaultlocale ..
inherently broken" comment. But we still have a locale
module that supports some of the valid locales but not all,
which is (IMO) worse than having none at all.

BTW: getting the current locale to obtain a
platform-independent language setting should perhaps be part
of gettext.py instead of locale.py?

msg23596 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-09 09:42
Logged In: YES 
user_id=38388

Martin: 

"default" as opposed to whatever locale setting is currently
active for the program, i.e. the locale setting the program
would see after a single call to setlocale(LC_ALL, "") right
after the start of the program.

getdefaultlocale() mimics the lookup mechanism of
setlocale(LC_ALL, "").

The fact that the alias table may sometimes not give the
correct encoding is not a fault of Python or the table - if
the user wants to see a different encoding used as default
encoding for the set locale, then the user should include
that information in the LANG (or other) OS environment
variable of the process running the Python program.

Note that this is different than the "preferred" encoding
which a user can set in a window manager (KDE or Gnome) or
browser. Those settings are restricted to certain
application spaces. getdefaultencoding() is targeted at the
OS level setting which may be different from e.g. a KDE
setting (think a program running in a shell vs. a KDE
application run by a user).
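
The environment lookup described in this message can be sketched roughly as follows. This is an illustration only: `first_locale_setting` is a hypothetical helper, and the variable order (LC_ALL, LC_CTYPE, LANG, LANGUAGE) is an assumption modelled on the defaults in CPython's locale.py. The C library's setlocale() resolves each category separately and may behave differently.

```python
def first_locale_setting(environ,
                         names=("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")):
    """Return the first non-empty locale setting found in environ, or None."""
    for name in names:
        value = environ.get(name)
        if value:
            # LANGUAGE may hold a colon-separated priority list
            # (a GNU extension); only the first entry is used here.
            if name == "LANGUAGE":
                value = value.split(":")[0]
            return value
    return None


# Only LANG is set, as in the original report.
only_lang = first_locale_setting({"LANG": "nb_NO"})

# LC_ALL takes precedence over LANG when both are set.
with_lc_all = first_locale_setting({"LANG": "nb_NO", "LC_ALL": "nn_NO.UTF8"})

print(only_lang, with_lc_all)
```

Passing the environment in as a plain mapping keeps the sketch independent of the machine it runs on; in a real program `os.environ` would be the natural argument.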
msg23597 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-09 09:56
Logged In: YES 
user_id=38388

childsplay (I wish people would use real names on SF...):

We can add the aliases you gave below, but we need some URLs
to add as reference. Please provide this information, so
that we can document that the aliases are in common use and
why iso-8859-1 is their usually used encoding.

Thanks.
msg23598 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-12-09 23:26
Logged In: YES 
user_id=21627

getdefaultencoding might be "targeted" at the OS level - but
the implementation certainly is not. On the OS level, the C
library will often use a different encoding after
locale.setlocale(locale.LC_ALL,"") is called, compared to
what getdefaultencoding returns. The approach taken inside
getdefaultencoding is inherently flawed, and cannot possibly
work in all cases; getpreferredencoding fixes that flaw.
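
A hedged sketch of the OS-level check described in this message: activate the user's locale and ask the C library which codeset it actually uses. `codeset_after_setlocale` is a hypothetical helper name; `nl_langinfo` is not available on every platform, so the sketch falls back to `getpreferredencoding` where it is missing.

```python
import locale


def codeset_after_setlocale():
    """Activate the user's locale, then report the C library's codeset."""
    try:
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system does not provide.
        return None
    if hasattr(locale, "nl_langinfo"):
        return locale.nl_langinfo(locale.CODESET)
    # nl_langinfo is unavailable on some platforms (for example Windows).
    return locale.getpreferredencoding(False)


codeset = codeset_after_setlocale()
print(codeset)
```

Whatever this prints comes from the C library itself rather than from a static alias table, which is the distinction being drawn above.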
msg23599 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-10 13:16
Logged In: YES 
user_id=38388

Well, if the alias mapping is good enough for X, then it's
good enough for me :-)

I think we ought to update the alias table with the current
data of the X locale.alias file. This file also includes the
mappings that childsplay mentioned.

There also seems to be a bug in the encoding alias table:
"utf" is no longer recognized by setlocale(). This should be
changed to "utf8".

I'll fix that and post an update here.
msg23600 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-10 21:58
Logged In: YES 
user_id=38388

Checking in Lib/locale.py;
/cvsroot/python/python/dist/src/Lib/locale.py,v  <--  locale.py
new revision: 1.29; previous revision: 1.28
done
msg23601 - (view) Author: stas Z (childsplay) Date: 2004-12-11 16:18
Logged In: YES 
user_id=638376

I've looked at locale.py in CVS, but isn't the utf8 entry missing?
I don't know about the X alias mapping, but I suspect that
there should also be a utf8 entry for nb_NO/nn_NO
(just as there is for no_NO).
msg23602 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2004-12-13 19:58
Logged In: YES 
user_id=38388

Thanks. I've noticed that the C lib doesn't seem to like
"UTF-8" but works well with "UTF8" (no hyphen).

Checking in Lib/locale.py;
/cvsroot/python/python/dist/src/Lib/locale.py,v  <--  locale.py
new revision: 1.30; previous revision: 1.29
done
Checking in Tools/i18n/makelocalealias.py;
/cvsroot/python/python/dist/src/Tools/i18n/makelocalealias.py,v
 <--  makelocalealias.py
new revision: 1.2; previous revision: 1.1
done

Please check again with the updated version.
msg23603 - (view) Author: stas Z (childsplay) Date: 2004-12-16 12:33
Logged In: YES 
user_id=638376

Checked and it works ok, thanks.
History
Date                 User        Action  Args
2004-12-07 20:23:46  childsplay  create