
Author vstinner
Recipients belopolsky, ezio.melotti, vstinner
Date 2010-12-02.01:43:23
Message-id <1291254207.25.0.743173639978.issue10601@psf.upfronthosting.co.za>
Content
On Windows, the Python interpreter fails to display a result if the stdout encoding is unable to encode it. This problem has existed since Python 3.0; see, for example, issue #1602.

This problem is not specific to Windows. Even if the stdout encoding is UTF-8 (the default encoding on Mac OS X and most Linux distributions), displaying the result fails on surrogate characters, because the UTF-8 encoder refuses surrogate characters in Python 3; see, for example, issue #5110.
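The surrogate case can be reproduced without a terminal. The following is a minimal sketch; the helper name try_encode is hypothetical and only serves the illustration:

```python
def try_encode(text, encoding, errors="strict"):
    """Return the encoded bytes, or None if the codec refuses the text."""
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError:
        return None


lone_surrogate = "\udc80"

# The strict UTF-8 encoder rejects a lone surrogate code point.
strict_result = try_encode(lone_surrogate, "utf-8")

# The 'backslashreplace' error handler escapes it instead of failing.
escaped_result = try_encode(lone_surrogate, "utf-8", "backslashreplace")

print(strict_result)
print(escaped_result)
```

The same helper shows the Windows-style failure when given a non-ASCII character and a narrower encoding such as "ascii".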

Even if a Python (core? :-)) developer may consider this behaviour expected, several users (including me) dislike it and would prefer to see the result rather than a Unicode exception. The problem is that, except for simple commands, it is not immediately clear whether the error comes from the command itself or from printing its result.

This issue is specific to sys.displayhook, the callback used by the Python interpreter to display the result of a command. It doesn't concern print() or sys.stdout.write().

--

The ideal solution would be to check whether the terminal is able to render a character, but this is not possible for technical reasons. The best we can do is to catch the UnicodeEncodeError and retry with an error handler other than sys.stdout.errors, one that cannot fail. 'backslashreplace' is a good candidate.
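In pure Python, the idea could look roughly like the sketch below. It is an illustration only, not the attached C patch: the function name fallback_displayhook and its optional stdout parameter are assumptions made for the example, the handling of the "_" builtin is left out, and the flush() call is a precaution added here so that pending text stays in order.

```python
import io
import sys


def fallback_displayhook(value, stdout=None):
    """Write repr(value), escaping characters that stdout cannot encode."""
    if stdout is None:
        stdout = sys.stdout
    if value is None:
        return
    text = repr(value)
    try:
        stdout.write(text)
    except UnicodeEncodeError:
        # Re-encode with an error handler that cannot fail.
        data = text.encode(stdout.encoding, "backslashreplace")
        buffer = getattr(stdout, "buffer", None)
        if buffer is not None:
            # Flush pending text first so the output order is preserved.
            stdout.flush()
            buffer.write(data)
        else:
            # No binary layer: write the escaped text back as str.
            stdout.write(data.decode(stdout.encoding, "strict"))
    stdout.write("\n")


# Usage: an ASCII-only stream standing in for a limited console.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="ascii", newline="\n")
fallback_displayhook("\xe9", out)
out.flush()
print(raw.getvalue())
```

Because the fallback is only taken after a UnicodeEncodeError, output that the stream can already encode is written unchanged.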

Ezio Melotti implemented this solution and attached a patch to issue #9198.

I wrote a new version of his patch, with the following changes:

 - Create a subfunction (for better readability)
 - Clear the UnicodeEncodeError before calling sys_displayhook_unencodable() (the exception would be lost on the next error anyway, e.g. if PyObject_GetAttrString() fails)
 - Clear the (AttributeError) exception if PyObject_GetAttrString(outf, "buffer") fails
 - Add a unit test: test ASCII, ISO-8859-1 and UTF-8 with non-ASCII, surrogate and non-BMP (printable or not) characters
 - Complete and update sys.displayhook documentation
 - Fix a refleak if stdout_encoding_str == NULL
 - Use PyObject_CallMethod() instead of PyTuple_Pack() + PyEval_CallObject() for shorter (and more readable) code

--

I don't know how to test the case where sys.stdout.write(repr(value)) fails and sys.stdout has no buffer attribute. Maybe a mock object should be used in the unit test?
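One possible shape for such a mock is sketched below. The class TextOnlyStream is hypothetical and not part of the patch: it encodes strictly on write, like a real text stream, but exposes no buffer attribute, so only the text-level fallback can be exercised.

```python
class TextOnlyStream:
    """Hypothetical text stream with an encoding but no 'buffer' attribute."""

    def __init__(self, encoding):
        self.encoding = encoding
        self.errors = "strict"
        self.chunks = []

    def write(self, text):
        # Raise UnicodeEncodeError for text the encoding cannot represent.
        text.encode(self.encoding, self.errors)
        self.chunks.append(text)
        return len(text)


stream = TextOnlyStream("ascii")
text = repr("\xe9")
try:
    stream.write(text)
except UnicodeEncodeError:
    # Without a binary layer, escape the text and write it back as str.
    data = text.encode(stream.encoding, "backslashreplace")
    stream.write(data.decode(stream.encoding))

print("".join(stream.chunks))
```

A unit test could install such an object as sys.stdout, call sys.displayhook() and then inspect the recorded chunks.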
History
Date                 User      Action  Args
2010-12-02 01:43:27  vstinner  set     recipients: + vstinner, belopolsky, ezio.melotti
2010-12-02 01:43:27  vstinner  set     messageid: <1291254207.25.0.743173639978.issue10601@psf.upfronthosting.co.za>
2010-12-02 01:43:25  vstinner  link    issue10601 messages
2010-12-02 01:43:24  vstinner  create