Issue2651
Created on 2008-04-17 17:24 by rharris, last changed 2008-05-08 18:51 by pitrou.
| File name |
Uploaded |
Description |
Edit |
Remove |
|
testExceptionStringRoundTrip.py
|
rharris,
2008-04-17 17:24
|
A test demonstrating the issue and showing that KeyError is arbitrarily different |
|
|
|
KeyError.patch
|
amaury.forgeotdarc,
2008-04-21 00:14
|
|
|
|
| msg65586 (view) |
Author: Rick Harris (rharris) |
Date: 2008-04-17 17:24 |
|
Here is a bug in Python 2.5 which would be nice to fix for Py3k (since
we are already breaking compatibility):
Take a string:
s = "Hello"
Create a KeyError exception with that string:
e = KeyError(s)
Counterintuitively, casting the exception to a string doesn't return the
same string:
str(e) != s
Instead, when KeyError is cast to a string it affixes single-quotes
around the string.
I have create a test which shows that the other built-in exceptions
(except for 3 Unicode Errors which seem to be unusual in that they don't
accept just a string), do indeed round-trip the string unaltered.
This actually caused a bug (in an old version of zope.DocumentTemplate).
I am including the test case I wrote for now; I will begin looking into
a solution shortly and hopefully whip up a patch.
|
| msg65587 (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) |
Date: 2008-04-17 17:42 |
|
Here is a relevant comment inside the KeyError_str function:
/* If args is a tuple of exactly one item, apply repr to args[0].
This is done so that e.g. the exception raised by {}[''] prints
KeyError: ''
rather than the confusing
KeyError
alone. The downside is that if KeyError is raised with an
explanatory
string, that string will be displayed in quotes. Too bad.
If args is anything else, use the default BaseException__str__().
*/
Why is it so important to round trip?
|
| msg65588 (view) |
Author: Rick Harris (rharris) |
Date: 2008-04-17 19:48 |
|
I think it is important to round-trip for at least two reasons:
1) Consistency. Other built-in exceptions behave this way, why should
KeyError be any different? Okay, technically 3 UnicodeErrors don't allow
just strings to be passed in (perhaps they should :-); but for common
exception classes, they all behave the same way.
To quote PEP-20: "Special cases aren't special enough to break the rules"
2) Intuitiveness. Decorating the string with quotes is unexpected; it
has already caused at least one bug and could cause more.
Ensuring intuitive round-trip behavior is an important enough issue that
is explicitly discussed in PEP 327 for the decimal type.
Why can't intuitiveness be restored to KeyError in Py3K?
|
| msg65609 (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) |
Date: 2008-04-18 09:36 |
|
Here is another form of the same inconsistency:
>>> [].pop(0)
IndexError: pop from empty list
>>> {}.pop(0)
KeyError: 'pop(): dictionary is empty'
And my preferred one:
>>> unicodedata.lookup('"')
KeyError: 'undefined character name \'"\''
KeyError is special in that dict lookup raises the equivalent of
KeyError(key). Since the key may be any kind of (hashable) object, it's
preferable to repr() it.
I can see 3 solutions to the problem:
1- imitate IndexError for lists: the exception do not contain the key.
2- dict lookup builds the complete string message, and raise it
raise KeyError("key not found: %r" % key)
then KeyError.__str__ can be removed.
3- like IOError, KeyError has "msg" and "key" attributes. then dict
lookup raises
raise KeyError("key not found", key)
and KeyError.__str__ is something like:
if self.key is not None:
return "%s: %r" % (self.msg, self.key)
else
return str(self.msg)
Choice 1 is not an improvement.
Choice 2 has the correct behavior but leads to performance problems;
KeyErrors are very very common in the interpreter (namespace lookups...)
and formatting the message is costly.
Choice 3 may cause regression for code that use exception.args[0], but
otherwise seems the best to me. I'll try to come with a patch.
|
| msg65656 (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) |
Date: 2008-04-21 00:14 |
|
Attached patch changes KeyError: when it has two arguments, they are
formatted with "%s: %r". Otherwise the base repr is called, and this
allows the round trip.
Standard objects (dict, set, UserDict, namedtuple, defaultdict, weak
dictionaries) now raise something like KeyError("not in dict", key).
At least one place in the stdlib relied on the key being the first
argument to KeyError() (in ConfigParser.py)
I don't know if this incompatibility will break much code. At least we
can say that the current behavior is not documented.
|
| msg66432 (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2008-05-08 18:51 |
|
Wouldn't it be nice to also store the offending key as a "key" attribute?
Writing key_error.key is a lot intuitive than key_error.args[0] (or is
it key_error.args[1] ? I've already forgotten :-)).
|
|
| Date |
User |
Action |
Args |
| 2008-05-08 18:51:50 | pitrou | set | nosy:
+ pitrou messages:
+ msg66432 |
| 2008-04-21 00:15:01 | amaury.forgeotdarc | set | files:
+ KeyError.patch keywords:
+ patch messages:
+ msg65656 |
| 2008-04-18 09:36:16 | amaury.forgeotdarc | set | messages:
+ msg65609 |
| 2008-04-17 19:48:04 | rharris | set | messages:
+ msg65588 |
| 2008-04-17 17:42:09 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages:
+ msg65587 |
| 2008-04-17 17:24:40 | rharris | create | |
|