
Author benjamin.peterson
Recipients amaury.forgeotdarc, benjamin.peterson, christoph, davidfraser, georg.brandl, hodgestar, pitrou
Date 2008-06-09.15:56:35
Message-id <1afaf6160806090856o3b52e866xaeae17e844696e1c@mail.gmail.com>
In-reply-to <1213018824.25.0.137378819627.issue2517@psf.upfronthosting.co.za>
Content
On Mon, Jun 9, 2008 at 8:40 AM, Simon Cross <report@bugs.python.org> wrote:
>
> Simon Cross <hodgestar@gmail.com> added the comment:
>
> One of the examples Christoph tried was
>
>  unicode(Exception(u'\xe1'))
>
> which fails quite oddly with:
>
>  UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in
> position 0: ordinal not in range(128)
>
> The reason for this is Exception lacks an __unicode__ method
> implementation so that unicode(e) does something like unicode(str(e))
> which attempts to convert the exception arguments to the default
> encoding (almost always ASCII) and fails.

What version are you using? In Py3k, str is unicode so __str__ can
return a unicode string.
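
To illustrate the difference, here is a minimal Python 3 sketch. It only emulates the implicit ASCII step that the quoted report describes for 2.x; the variable names are made up for the example and nothing here is taken from the attached patch.

```python
# In Python 3, str is the unicode text type, so str(e) involves no
# implicit conversion to a byte string.
exc = Exception("\xe1")
text = str(exc)
print(text == "\xe1")

# Rough emulation of the failing step described in the report: forcing
# the message through the ASCII codec, as an implicit default-encoding
# conversion would.
ascii_failure = None
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    ascii_failure = err
    print(err.reason)
```

The explicit encode("ascii") call stands in for the conversion that 2.x would perform behind the scenes when unicode(e) falls back to str(e).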

>
> Fixing this seems quite important. It's common to want to raise errors
> with non-ASCII characters (e.g. when the data which caused the error
> contains such characters). Usually the code raising the error has no way
> of knowing how the characters should be encoded (exceptions can end up
> being written to log files, displayed in web interfaces, that sort of
> thing). This means raising exceptions with unicode messages. Using
> unicode(e.message) is unattractive since it won't work in 3.0 and also
> does not duplicate str(e)'s handling of the other exception __init__
> arguments.
>
> I'm attaching a patch which implements __unicode__ for BaseException.
> Because of the lack of a tp_unicode slot to mirror tp_str slot, this
> breaks the test that calls unicode(Exception). The existing test for
> unicode(e) does unicode(Exception(u"Foo")) which is a bit of a non-test.
> My patch adds a test of unicode(Exception(u'\xe1')) which fails without
> the patch.
>
> A quick look through trunk suggests implementing tp_unicode actually
> wouldn't be a huge job. My worry is that this would constitute a change
> to the C API for PyObjects and has little chance of acceptance into 2.6
> (and in 3.0 all these issues disappear anyway). If there is some chance
> of acceptance, I'm willing to write a patch that adds tp_unicode.

Email Python-dev for permission.
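
For reference, the behaviour the patch is after could be sketched in pure Python roughly as below. This is a hypothetical illustration and not the attached patch: the class name UnicodeAwareError and the text_type alias are invented here, and the handling of zero, one, or several arguments is an assumption modelled on how str(e) treats e.args.

```python
# text_type is an illustrative alias: `unicode` on 2.x, `str` on 3.x.
try:
    text_type = unicode  # noqa: F821 (only defined on Python 2)
except NameError:
    text_type = str


class UnicodeAwareError(Exception):
    """Hypothetical exception that builds its text without going through ASCII."""

    def __unicode__(self):
        # Assumed to mirror str(e): empty for no arguments, the single
        # argument converted to text, or the text form of the args tuple.
        if len(self.args) == 0:
            return text_type("")
        if len(self.args) == 1:
            return text_type(self.args[0])
        return text_type(self.args)


# Calling the method directly works on both 2.x and 3.x.
message = UnicodeAwareError(u"\xe1").__unicode__()
print(message == u"\xe1")
```

On 2.x, unicode(e) would call __unicode__ on such a class; on 3.x the method is never invoked implicitly, which is why the sketch calls it directly.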
History
Date                 User               Action  Args
2008-06-09 15:56:38  benjamin.peterson  set     spambayes_score: 1.44551e-05 -> 1.4455059e-05; recipients: + benjamin.peterson, georg.brandl, amaury.forgeotdarc, davidfraser, pitrou, christoph, hodgestar
2008-06-09 15:56:37  benjamin.peterson  link    issue2517 messages
2008-06-09 15:56:35  benjamin.peterson  create