
Author vstinner
Recipients eric.araujo, ezio.melotti, jammon, jfinkels, michael.foord, r.david.murray, vstinner
Date 2010-11-16.14:10:04
Message-id <1289916607.64.0.409673641159.issue10417@psf.upfronthosting.co.za>
In-reply-to
Content
In Python 3, sys.stderr uses the 'backslashreplace' error handler. With the C locale, sys.stderr uses the ASCII encoding, so the Unicode character é is printed as \xe9.

In Python 2, sys.stderr.errors is strict by default.

It works if the stream encoding can represent the character, or if you specify the error handler:

$ ./python -c "import sys; sys.stderr.write(u'\xe9\n')"
é
$ PYTHONIOENCODING=ascii:backslashreplace ./python -c "import sys; sys.stderr.write(u'\xe9\n')"
\xe9

But with ASCII encoding, and the default error handler (strict), it fails:

$ PYTHONIOENCODING=ascii ./python -c "import sys; sys.stderr.write(u'\xe9\n')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
$ LANG= ./python -c "import sys; sys.stderr.write(u'\xe9\n')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
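
The three behaviours shown above can also be reproduced in-process, without changing the environment. This is only a minimal Python 3 sketch: an in-memory io.BytesIO buffer stands in for the real stderr, and the helper name write_with is made up for illustration.

```python
import io

def write_with(encoding, errors, text):
    """Write text through a TextIOWrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    # newline="\n" disables platform newline translation so the bytes are stable.
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="\n")
    stream.write(text)
    stream.flush()
    return raw.getvalue()

# An encoding that can represent the character writes it directly.
utf8_bytes = write_with("utf-8", "strict", "\xe9\n")

# ASCII with backslashreplace escapes the character instead of failing.
escaped_bytes = write_with("ascii", "backslashreplace", "\xe9\n")

# ASCII with the strict handler raises UnicodeEncodeError.
try:
    write_with("ascii", "strict", "\xe9\n")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(utf8_bytes, escaped_bytes, strict_failed)
```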

Changing the default error handler in a minor release is not a good idea. But we can emulate the backslashreplace error handler. distutils.log does that in Python 3:

class Log:

    def __init__(self, threshold=WARN):
        self.threshold = threshold

    def _log(self, level, msg, args):
        if level not in (DEBUG, INFO, WARN, ERROR, FATAL):
            raise ValueError('%s wrong log level' % str(level))

        if level >= self.threshold:
            if args:
                msg = msg % args
            if level in (WARN, ERROR, FATAL):
                stream = sys.stderr
            else:
                stream = sys.stdout
            if stream.errors == 'strict':
                # emulate backslashreplace error handler
                encoding = stream.encoding
                msg = msg.encode(encoding, "backslashreplace").decode(encoding)
            stream.write('%s\n' % msg)
            stream.flush()
    (...)
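
The key step in the excerpt above is the encode/decode round trip. Isolated as a helper (the name emulate_backslashreplace is hypothetical, not part of distutils), a sketch of it could look like this. Characters the target encoding can represent survive unchanged, and everything else becomes a backslash escape that a strict stream can write safely.

```python
def emulate_backslashreplace(msg, encoding):
    """Return msg as str, with characters that `encoding` cannot
    represent replaced by backslash escapes."""
    return msg.encode(encoding, "backslashreplace").decode(encoding)

text = "caf\xe9 \u20ac"  # 'é' and the euro sign

# ASCII can represent neither character, so both are escaped.
safe_ascii = emulate_backslashreplace(text, "ascii")

# Latin-1 can represent 'é' but not the euro sign.
safe_latin1 = emulate_backslashreplace(text, "latin-1")

# UTF-8 can represent everything, so the text is unchanged.
safe_utf8 = emulate_backslashreplace(text, "utf-8")

# The escaped result now encodes cleanly even with the strict handler.
safe_ascii.encode("ascii", "strict")
```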

_WritelnDecorator() of unittest.runner should maybe use the same code.
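
As an illustration of what that might look like, here is a rough sketch of a writeln-style stream wrapper that applies the same round trip. The class BackslashReplaceWriteln is a hypothetical stand-in written for this example; it is not the actual unittest.runner code nor a proposed patch.

```python
import io

class BackslashReplaceWriteln:
    """Hypothetical writeln wrapper; not the real unittest implementation."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, attr):
        # Delegate everything else (flush, encoding, ...) to the wrapped stream.
        if attr == "stream":
            raise AttributeError(attr)
        return getattr(self.stream, attr)

    def write(self, text):
        encoding = getattr(self.stream, "encoding", None)
        if encoding and getattr(self.stream, "errors", None) == "strict":
            # Same round trip as distutils.log: escape what cannot be encoded.
            text = text.encode(encoding, "backslashreplace").decode(encoding)
        return self.stream.write(text)

    def writeln(self, text=None):
        if text:
            self.write(text)
        self.write("\n")

# A strict ASCII stream, standing in for sys.stderr under the C locale.
raw = io.BytesIO()
strict_ascii = io.TextIOWrapper(raw, encoding="ascii", errors="strict", newline="\n")

out = BackslashReplaceWriteln(strict_ascii)
out.writeln("test_caf\xe9 ... ok")  # would raise UnicodeEncodeError unwrapped
out.flush()
written = raw.getvalue()
```

Streams that already use a non-strict error handler are left alone, so the wrapper only changes behaviour in the case that currently fails.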
History
Date                 User      Action  Args
2010-11-16 14:10:07  vstinner  set     recipients: + vstinner, ezio.melotti, eric.araujo, r.david.murray, michael.foord, jfinkels, jammon
2010-11-16 14:10:07  vstinner  set     messageid: <1289916607.64.0.409673641159.issue10417@psf.upfronthosting.co.za>
2010-11-16 14:10:04  vstinner  link    issue10417 messages
2010-11-16 14:10:04  vstinner  create