New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
setlocale error message is confusing #47317
Comments
import locale
locale.setlocale( locale.LC_ALL, u'ja_JP.utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/locale.py", line 475, in setlocale
locale = normalize(_build_localename(locale))
File "/usr/lib/python2.5/locale.py", line 383, in _build_localename
language, encoding = localetuple
ValueError: too many values to unpack The problem is line 473: Replacing this with |
I have confirmed this exists on trunk http://svn.python.org/view/python/trunk/Lib/locale.py?rev=63824&view=markup (63824 is the latest) |
FWIW the type("") is gone in Py3, now it is: However Py3.0 now raises the same error when the second arg is a byte
string:
>>> locale.setlocale(locale.LC_ALL, b'ja_JP.utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Programs\Python30\lib\locale.py", line 500, in setlocale
locale = normalize(_build_localename(locale))
File "C:\Programs\Python30\lib\locale.py", line 408, in _build_localename
language, encoding = localetuple
ValueError: too many values to unpack On Py3, locale.setlocale() should allow only unicode strings and reject |
The docs say that the locale arg should be None, tuple, or string, so I take that to mean that Unicode should be OK for 2.x, and that would help porting to 3.x. If bytes are rejected in 3.x, there should be TypeError raised, not ValueError, as is still the case in 3.1.2. |
After more thought and investigation, I have changed my opinions on this issue. Allowing unicode string for locale in 2.7: Expected failure cases could be added to test_locale.py. Options for locale name: def _build_localename(localetuple):
language, encoding = localetuple required a tuple on the right and was called 'tuple unpacking'. "If locale is specified, it may be a None, a string, or an iterable producing two strings, language code and encoding." This is not a feature addition but a recognition of a new feature added versions ago. The parameter name in the private function should then be shortened to just 'locale'. Test cases with non-tuples could be added if not present now. Exception message: Python is known to be inconsistent in its usage of ValueError versus TypeError for builtin function args. Guido has said to leave inconsistencies rather than break code. So I retract the ValueError to TypeError suggestion. The accompanying messages, however, can be improved. The lines above that fails could be wrapped with The scope of the wrapper could be extended to the entire function so that failure of A complication: the doc says locale.setlocale(category, locale=None)
...If the modification of the locale fails, the exception Error is raised." So it seems that either a) the wrapper above should raise Error instead, or the doc could more clearly say "Exception raised when the locale passed to setlocale is not recognized." |
"Since the module predates unicode strings (it is in 1.5) and since the locale string is passed to a C function, 'string' in the doc can just as well be taken to mean ascii byte string only, as the code requires." My only comment is that generally it doesn't seem reasonable to me that developer should need to investigate the history and implementation of a function in order to understand the documentation correctly. |
I agree and it is the current behaviour (of Python 3.3). I don't see any use case of a byte strings in locale.setlocale() with Python 3.3, so I remove Python 3 from the versions of this issue. |
New changeset d370d609d09b by Victor Stinner in branch '2.7': |
I fixed locale.setlocale() of Python 2.7 to accept Unicode string because it helps porting to Python 3... But I think that the commit is just useless because we will have to wait until Python 2.7.3 is released, and if you want to support older Python versions, we will have to encode the locale explicitly to ASCII. Anyway, you should move to Python 3 (3.2 or later if possible) if you want a better Unicode support. |
Victor, the issue for 3.x, which remains, is to improve the error message. I also suggested a doc change, though I would like Mark or Martin's comments before I would make it.
Exactly. 'Older versions' includes older versions of 2.7. This is why I suggested that making the change to 2.7 would be a feature addition, which is not permitted for the very reason you give. I think the commit should be reverted. Certainly, when a another developer says "This patch should be rejected and not committed' after careful review, you should discuss, possibly on pydev, before committing. |
New changeset e72a2a60316f by Victor Stinner in branch '2.7': |
Added a patch that implements two things: setlocale now raises locale.Error('Locale must be None, a string, or an iterable of two strings -- language code, encoding.'). I decided to remove the proposed .format(locale), as it wasa a bit confusing when passing a tuple containing invalid items. I also added two tests, one for bytes and another for a tuple of two bytes. |
Thanks for the patch. Exception messages are considered implementation details, so I would not test them. Testing that an exception is raised is good enough IMO. |
I modified the patch not to contain the tests against exception messages |
I think the reported exception type is incorrect. Given that the error message is 'Locale must be None, a string, or an iterable of two strings -- language code, encoding.', it very much sounds like a TypeError is being reported here. So I think all that's needed is that the ValueError is converted into a TypeError. Also notice that the tuple unpacking may actually succeed: py> locale.setlocale(locale.LC_ALL,u"en")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/locale.py", line 513, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting |
Maybe we should return TypeError with the same message then? That would require some modification of documentation though, as it states: "If the modification of the locale fails, the exception Error is raised.". I don't really understand the "locale unpacking may actually succeed". Isn't that what supposed to happen, to my knowledge "en" is not a valid locale and that's a totally different issue? If I'm wrong, please correct, I've just started wandering in to Python Core development :) |
No, any operation can report TypeError and ValueError without explicit
See my example again: py> locale.setlocale(locale.LC_ALL,u"en")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/locale.py", line 513, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
py> locale.setlocale(locale.LC_ALL,u"eng")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/locale.py", line 512, in setlocale
locale = normalize(_build_localename(locale))
File "/usr/lib/python2.6/locale.py", line 420, in _build_localename
language, encoding = localetuple
ValueError: too many values to unpack So for u"eng" you get the ValueError. For u"en", you get past that |
Thanks for clarification! I see the problem now. So if I get this correctly we should change the _build_localename to raise TypeError? If the given locale is in wrong format, we'll get TypeError, but if it's valid type but otherwise invalid locale (like 'en'), we'll get ValueError (or more specifically locale.Error). |
Yes, that's what I'm proposing.
Ideally, yes. Notice that it will be difficult to produce a TypeError |
Uploaded a new patch that raises TypeError |
New changeset 931ae170e51c by Petri Lehtinen in branch '3.2': New changeset d90d88380aca by Petri Lehtinen in branch 'default': |
Terry: Do you still think there's need for a doc update? |
Yes. I think in locale.rst (assuming that is the name) locale.setlocale(category, locale=None)
If *locale* is specified, it may be a string, a tuple of the form (language code, encoding), or None. If it is a tuple, it is converted to a string using the locale aliasing engine.
'''
should be changed to
'''
exception locale.Error
Exception raised when the locale passed to setlocale() is not recognized.
locale.setlocale(category, locale=None)
If *locale* is specified, it may be a None, a string, or an iterable of two strings, language code and encoding. String pairs are converted to a single string using the locale aliasing engine.
'''
where language code and encoding are gray shaded as they are now. |
What about the possible None value then? Do you think that mentions to I don't think so, because setlocale(locale.LC_ALL, None) is an If None is not dropped, the ", language code and encoding" should |
Yes, parentheses would be better. |
New changeset 34c9465f5023 by Petri Lehtinen in branch '2.7': New changeset 98806dd03506 by Petri Lehtinen in branch '3.2': New changeset 8a27920efffe by Petri Lehtinen in branch 'default': |
I decided to restructure the documentation of setlocale() a bit and I think it's better now overall. It includes Terry's suggestions. I think this issue can now be closed. Thanks for the report and patches! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: