classification
Title: Logging: Unicode Error
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, guettli, haypo, python-dev, vinay.sajip
Priority: normal Keywords:

Created on 2011-10-20 14:43 by guettli, last changed 2011-10-24 22:26 by python-dev. This issue is now closed.

Files
File name Uploaded Description
unicodedecodeerror-in-logging.py guettli, 2011-10-21 06:36
Messages (4)
msg146018 - Author: Thomas Guettler (guettli) Date: 2011-10-20 14:43
In changeset fe6be0426e0d, the format() method was changed. Unfortunately, it does not catch all Unicode decode errors.

I think line 482 of logging/__init__.py should be changed to this (adding 'replace'):

s = s + record.exc_text.decode(sys.getfilesystemencoding(), 'replace')

http://hg.python.org/cpython/file/f35514dfadf8/Lib/logging/__init__.py#l482


Here is the stacktrace we get:
{{{
Traceback (most recent call last):
  File "/usr/lib64/python2.7/logging/__init__.py", line 838, in emit
    msg = self.format(record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 715, in format
    return fmt.format(record)
  File "/home/modbau_esg_p/djangotools/utils/logutils.py", line 32, in format
    msg=logging.Formatter.format(self, record)
  File "/usr/lib64/python2.7/logging/__init__.py", line 482, in format
    s = s + record.exc_text.decode(sys.getfilesystemencoding())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 662: ordinal not in range(128)
Logged from file base.py, line 209
}}}
msg146019 - Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-10-20 15:00
Can you tell me what the actual data was which failed to be decoded? Is there more than one encoding in effect (e.g. one for the filesystem, and another for the other data in the exception being logged)?
msg146066 - Author: Thomas Guettler (guettli) Date: 2011-10-21 06:42
I attached a test case (unicodedecodeerror-in-logging.py). If the filesystem encoding is UTF-8 and the source code is encoded in latin1, logging fails. It happens because there is a German umlaut in the comment after 1/0.

I added 'replace' to the decode() call in __init__.py, and then it works. The German umlaut is displayed as an inverted question mark, but this is better than no logging message.
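
For readers following along, the effect described above can be reproduced in isolation. This is a minimal sketch, not the attached test case: the byte string is a made-up stand-in for a latin1-encoded traceback line, and UTF-8 is assumed as the filesystem encoding.

```python
# Hypothetical traceback line as raw bytes. The source file is assumed to be
# latin1-encoded, so the umlaut in the comment is the single byte 0xfc.
exc_text = b"    1/0  # \xfc"

# Strict decoding with a codec that does not match the bytes raises.
try:
    exc_text.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors='replace', the undecodable byte becomes U+FFFD
# (the replacement character) and the rest of the line survives.
replaced = exc_text.decode("utf-8", "replace")
```

The message is lossy, since the umlaut is gone, but the log record is still emitted.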
msg146332 - Author: Roundup Robot (python-dev) Date: 2011-10-24 22:26
New changeset 4bb1dc4e2cec by Vinay Sajip in branch '2.7':
Closes #13232: Handle multiple encodings in exception logging.
http://hg.python.org/cpython/rev/4bb1dc4e2cec
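
The idea of tolerating mixed encodings when appending exception text can be sketched as follows. This is an illustration only and is not the code from the changeset: `append_exc_text` and its `encoding` parameter are hypothetical names, and the real logic lives inside `logging.Formatter.format()`.

```python
import sys


def append_exc_text(s, exc_text, encoding=None):
    """Hypothetical helper: append traceback text to a text message.

    If exc_text is a byte string, decode it with errors='replace' so that
    bytes in an unexpected encoding cannot abort the logging call.
    """
    if isinstance(exc_text, bytes):
        # Assumption: fall back to the filesystem encoding, as in the issue.
        encoding = encoding or sys.getfilesystemencoding()
        exc_text = exc_text.decode(encoding, "replace")
    # Keep the traceback on its own line.
    if s[-1:] != u"\n":
        s = s + u"\n"
    return s + exc_text


# Latin1 byte 0xfc in the traceback, UTF-8 assumed for decoding.
result = append_exc_text(u"division failed", b"1/0  # \xfc", encoding="utf-8")
```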
History
Date User Action Args
2011-10-24 22:26:40 python-dev set status: open -> closed; nosy: + python-dev; messages: + msg146332; resolution: fixed; stage: test needed -> resolved
2011-10-21 06:42:23 guettli set messages: + msg146066
2011-10-21 06:36:37 guettli set files: + unicodedecodeerror-in-logging.py
2011-10-20 15:48:22 haypo set nosy: + haypo
2011-10-20 15:44:06 ezio.melotti set nosy: + ezio.melotti; type: behavior; stage: test needed
2011-10-20 15:00:15 vinay.sajip set nosy: + vinay.sajip; messages: + msg146019
2011-10-20 14:43:19 guettli create