Message 96281 - Python tracker

Author	ezio.melotti
Recipients	barry, cvrebert, exarkun, ezio.melotti, pitrou
Date	2009-12-12.03:21:57
Message-id	<1260588121.48.0.439515013354.issue6108@psf.upfronthosting.co.za>
In-reply-to

Content
In r64791, BaseException gained a new __unicode__ method that does the
equivalent of the following:
 * if the number of args is 0, returns u''
 * if it's 1, returns unicode(self.args[0])
 * if it's >1, returns unicode(self.args)
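
As a rough illustration, the three cases above can be modeled as a plain function. This is only a sketch, not the C implementation from r64791: generic_unicode and text_type are hypothetical names, and text_type falls back to str so that the snippet also runs on Python 3, where unicode no longer exists.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3 stand-in (assumption, for runnability only)


def generic_unicode(args):
    """Model of the generic conversion described above (hypothetical helper)."""
    if len(args) == 0:
        return text_type("")
    if len(args) == 1:
        return text_type(args[0])
    # More than one arg: the whole tuple is converted, not a formatted message.
    return text_type(args)


empty = generic_unicode(())
single = generic_unicode(("disk full",))
several = generic_unicode(("a", 1))
```

With more than one argument the result is the text form of the tuple itself, which is the case the rest of this message is about.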

Before this, BaseException only had a __str__ method, so unicode(e)
(with e being an exception derived from BaseException) called:
 * e.__str__().decode(), if e didn't implement __unicode__
 * e.__unicode__(), if e implemented an __unicode__ method

Now, all the derived exceptions that don't implement their own
__unicode__ method inherit the "generic" __unicode__ of BaseException,
and they use that instead of falling back on __str__.
This is generally OK if the number of args is 0 or 1, but if there are
more args, there's usually some specific formatting in the __str__
method that is lost when BaseException.__unicode__ returns
unicode(self.args).
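
To make the loss concrete, here is a small sketch with a made-up exception (RangeError is hypothetical, not a built-in). It compares the message produced by the class's own __str__ with the result of converting the args tuple directly, which is what the generic method does when there is more than one arg.

```python
class RangeError(Exception):
    """Hypothetical exception whose __str__ formats its two args."""

    def __init__(self, low, high):
        Exception.__init__(self, low, high)
        self.low = low
        self.high = high

    def __str__(self):
        return "value must be between %d and %d" % (self.low, self.high)


e = RangeError(1, 10)
formatted = str(e)      # message built by the exception's own __str__
raw_args = str(e.args)  # what converting self.args directly yields
```

Here formatted reads as a sentence, while raw_args is just the tuple "(1, 10)".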

Possible solutions:
 1) implement a __unicode__ method that does the equivalent of calling
unicode(str(self)) (i.e. converting to unicode the message returned by
__str__ instead of converting self.args);
 2) for every exception whose __str__ does some specific formatting,
implement a __unicode__ method that formats the message the same way
__str__ does.
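
A pure-Python sketch of option 1 might look like the following. It is an assumption-laden illustration, not the attached C patch: PortError and text_type are hypothetical names, and text_type falls back to str so the snippet also runs on Python 3, where __unicode__ has to be called explicitly.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3 stand-in (assumption, for runnability only)


class PortError(Exception):
    """Hypothetical exception with a formatting __str__."""

    def __init__(self, host, port):
        Exception.__init__(self, host, port)
        self.host = host
        self.port = port

    def __str__(self):
        return "cannot connect to %s on port %d" % (self.host, self.port)

    def __unicode__(self):
        # Option 1: convert the already formatted __str__ message
        # instead of converting self.args.
        return text_type(str(self))


message = PortError("example.org", 8080).__unicode__()
```

Option 2 would instead repeat the formatting logic inside __unicode__, which avoids the round trip through str at the cost of duplicating each message.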

Attached is a proof of concept (issue6108.diff) where I tried to
implement the first approach for UnicodeDecodeError. This approach can
be used as long as __str__ always returns only ASCII.
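
The ASCII restriction can be sketched as follows, assuming the conversion decodes the byte string returned by __str__ with the default ASCII codec (the Python 2 default unless reconfigured). The helper name to_text is hypothetical, and the explicit decode("ascii") stands in for that implicit step.

```python
def to_text(message_bytes):
    """Decode a byte-string message as ASCII (hypothetical helper)."""
    return message_bytes.decode("ascii")


# A pure-ASCII message converts cleanly.
ascii_ok = to_text(b"invalid start byte")

# A message containing non-ASCII bytes cannot be converted this way.
try:
    to_text(b"caf\xc3\xa9")
    non_ascii_failed = False
except UnicodeDecodeError:
    non_ascii_failed = True
```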

The patch seems to work fine for me (note: this is my first attempt to
use the C API). If the approach is correct I can do the same for the
other exceptions too and submit a proper patch.
