classification
Title: logging to file + encoding
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 2.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: vinay.sajip
Nosy List: shamilbi, vinay.sajip
Priority: normal
Keywords:

Created on 2009-02-06 16:07 by shamilbi, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
encoding-test.zip | vinay.sajip, 2009-02-06 20:02 | Simple test script and resulting output
unnamed | shamilbi, 2009-04-21 08:37 |
unnamed | shamilbi, 2009-04-22 08:47 |
Messages (11)
msg81274 - (view) Author: (shamilbi) Date: 2009-02-06 16:07
If I configure logging to a file with encoding = 'cp1251' and call
logger.debug(u'...'), I get a crash with UnicodeError.

I suggest reimplementing the method FileHandler.emit():
...
if isinstance(msg, unicode):
    stream.write(f % msg)    # it works!
...
msg81297 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-02-06 20:02
The attached test script and output file appear to show logging working
correctly. The script writes a log message including the Cyrillic text
доброе утро (Good morning) to a CP1251-encoded file, test.log. Opening
this file in a Unicode-aware editor (I used BabelPad,
http://www.babelstone.co.uk/Software/BabelPad.html) appears to show the
Cyrillic characters correctly: see the screenshot at

http://img5.imageshack.us/img5/799/cp1251zw2.png

Marking as pending. Can you give more specifics about your problem?
msg81378 - (view) Author: (shamilbi) Date: 2009-02-08 13:36
test_log.py:
-----------
# -*- coding: windows-1251 -*-

import logging

logger = logging.getLogger('test_log')
logger.addHandler(logging.FileHandler('test.log', encoding='cp1251'))
logger.setLevel(logging.DEBUG)

logger.debug(u'Привет')    # Russian for "Hello"

exception:
---------
Traceback (most recent call last):
  File "e:\bin\python\0\lib\logging\__init__.py", line 765, in emit
    self.stream.write(fs % msg.encode("UTF-8"))
  File "e:\bin\python\0\lib\codecs.py", line 686, in write
    return self.writer.write(data)
  File "e:\bin\python\0\lib\codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
  File "e:\bin\python\0\lib\encodings\cp1251.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0:
ordinal not in range(128)
msg81411 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-02-08 19:25
Sorry, misread the original issue and tested with Python 2.5 rather than
2.6. This is a regression; fix and additional test case checked into
trunk and release26-maint.
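
The scenario from the report can be re-checked on a current interpreter. The following is a minimal sketch in Python 3 (not the test case that was checked in), assuming the default formatter and a temporary file. The helper name log_to_encoded_file and the logger name are hypothetical.

```python
import logging
import os
import tempfile


def log_to_encoded_file(message, encoding="cp1251"):
    """Log one message to a temporary file and return the raw bytes written."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    logger = logging.getLogger("test_log_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the message away from root handlers
    handler = logging.FileHandler(path, encoding=encoding)
    logger.addHandler(handler)
    try:
        logger.debug(message)
    finally:
        # Close the handler so the file is flushed before it is read back.
        logger.removeHandler(handler)
        handler.close()
    try:
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


raw = log_to_encoded_file("Привет")
print(raw)
```

If the handler encodes correctly, the bytes read back decode as CP1251 to the original message followed by a line terminator.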
msg86198 - (view) Author: (shamilbi) Date: 2009-04-20 15:29
(Python 2.6.2, WinXP)
Logging to the console has stopped working. Here is a workaround:

logging/__init__.py:
class StreamHandler(Handler):
    ...
    def emit(self, record):
    ...
                    if (isinstance(msg, unicode) or
                        getattr(stream, 'encoding', None) is None):
-                         stream.write(fs % msg)
+                         if stream == sys.stdout:
+                             print(msg)
+                         else:
+                             stream.write(fs % msg)
msg86222 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-04-21 06:07
Can you retry with setting the "encoding" attribute of the file to
"cp1251"? That should work and that should be the appropriate method to
avoid the problem.

test_logging.py in the Python distribution has a test which exercises
Unicode functionality using cp1251. Does that test work in your environment?
msg86224 - (view) Author: (shamilbi) Date: 2009-04-21 08:37
>
> Can you retry with setting the "encoding" attribute of the file to
> "cp1251"? That should work and that should be the appropriate method to
> avoid the problem.
>
> test_logging.py in the Python distribution has a test which exercises
> Unicode functionality using cp1251, does that test work in your
> environment?

Logging to a file is OK (since 2.6.2). Logging to the console __was__ OK
(2.6.1), but is not now (2.6.2).

shamil
msg86226 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-04-21 09:25
Trunk and release26-maint were recently changed (r71657, r71658) to use
the following logic, which differs from the code snippet you posted.

                    if (isinstance(msg, unicode) and
                        getattr(stream, 'encoding', None)):
                        stream.write(fs.decode(stream.encoding) % msg)
                    else:
                        stream.write(fs % msg)

If the stream is stderr and you are passing a unicode msg, the else
branch will not be taken; as long as the stream has an encoding
attribute, it should output correctly.

The change was made when another, similar issue was posted by another
user (issue #5768).

Can you confirm what happens with the current code as it is in
release26-maint? спасибо!
msg86282 - (view) Author: (shamilbi) Date: 2009-04-22 08:47
>
> Trunk and release26-maint were recently changed (r71657, r71658) to use
> the following logic, which differs from the code snippet you posted.
>
>                     if (isinstance(msg, unicode) and
>                         getattr(stream, 'encoding', None)):
>                         stream.write(fs.decode(stream.encoding) % msg)
>                     else:
>                         stream.write(fs % msg)
>
> If the stream is stderr and you are passing a unicode msg, the else
> branch will not be taken; as long as the stream has an encoding
> attribute, it should output correctly.
>
> The change was made when another, similar issue was posted by another
> user (issue #5768).
>
> Can you confirm what happens with the current code as it is in
> release26-maint? спасибо!

It still doesn't work for the console (but it is OK for files).

The following works in both cases:
if (isinstance(msg, unicode) and
    getattr(stream, 'encoding', None) and
    (stream == sys.stdout or stream == sys.stderr)):
    stream.write(fs % msg.encode(stream.encoding))
else:
    stream.write(fs % msg)

I think it's all about the difference between print(msg) and
sys.stdout.write('%s\n' % msg)

shamil
msg86287 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-04-22 11:35
> I think it's all about the difference between print(msg) and
> sys.stdout.write('%s\n' % msg)

Yes. Actually, it's about what streams will accept. For example, a
stream opened with codecs.open(encoding='cp1251') will accept a Unicode
string and encode it before output. The same is true of a StringIO
wrapped inside a writer obtained from the codecs module.
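
To illustrate the point about codecs writers, here is a minimal sketch in Python 3, where the Unicode type is str and io.BytesIO stands in for the Python 2 StringIO (both substitutions are assumptions made for this example). The writer returned by codecs.getwriter accepts text and encodes it before passing it to the underlying buffer.

```python
import codecs
import io

# The in-memory byte buffer plays the role of the underlying stream.
buffer = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named encoding;
# instantiating it wraps the buffer.
writer = codecs.getwriter("cp1251")(buffer)

# The writer takes a text string and encodes it to CP1251 on write.
writer.write("Привет\n")

encoded = buffer.getvalue()
print(encoded)
```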

However, stdout and stderr are of type file and seem to behave
differently - they behave as expected if I encode the value written to
them, but fail if I write a Unicode string to them.

I don't think special-casing for sys.stdout and sys.stderr in the
logging code is the correct approach. It would be better to add a
conditional encoding step where it's needed; now I just need to figure
out all the cases when it's needed ;-)
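
One possible shape for such a conditional encoding step is sketched below in Python 3. This is not the fix that was checked in; the helper name write_message, the fallback_encoding parameter, and the use of io.TextIOBase as the test for "accepts text" are all assumptions made for illustration.

```python
import io


def write_message(stream, message, fallback_encoding="utf-8"):
    """Write text to a stream, encoding first only if the stream needs bytes."""
    if isinstance(stream, io.TextIOBase):
        # Text streams accept str and handle any encoding themselves.
        stream.write(message)
    else:
        # Byte streams need encoded data; prefer the stream's own
        # encoding attribute when it has one (an assumption of this sketch).
        encoding = getattr(stream, "encoding", None) or fallback_encoding
        stream.write(message.encode(encoding))


text_stream = io.StringIO()
byte_stream = io.BytesIO()
write_message(text_stream, "Привет\n")
write_message(byte_stream, "Привет\n", fallback_encoding="cp1251")
```

The decision depends on the kind of stream rather than on whether it happens to be sys.stdout or sys.stderr, which is the distinction drawn above.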
msg86291 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-04-22 12:14
Fix checked into trunk and release26-maint.
History
Date | User | Action | Args
2022-04-11 14:56:45 | admin | set | github: 49420
2009-04-22 12:14:26 | vinay.sajip | set | status: open -> closed; resolution: accepted -> fixed; messages: + msg86291
2009-04-22 11:35:17 | vinay.sajip | set | resolution: fixed -> accepted; messages: + msg86287
2009-04-22 08:47:36 | shamilbi | set | files: + unnamed; messages: + msg86282
2009-04-21 09:25:05 | vinay.sajip | set | messages: + msg86226
2009-04-21 08:37:57 | shamilbi | set | files: + unnamed; messages: + msg86224
2009-04-21 06:07:32 | vinay.sajip | set | messages: + msg86222
2009-04-20 15:29:28 | shamilbi | set | status: closed -> open; type: crash -> behavior; messages: + msg86198
2009-02-08 19:25:36 | vinay.sajip | set | status: pending -> closed; resolution: works for me -> fixed; messages: + msg81411
2009-02-08 13:36:22 | shamilbi | set | messages: + msg81378
2009-02-06 20:02:51 | vinay.sajip | set | status: open -> pending; resolution: works for me; messages: + msg81297; files: + encoding-test.zip
2009-02-06 16:12:16 | benjamin.peterson | set | assignee: vinay.sajip; nosy: + vinay.sajip
2009-02-06 16:07:46 | shamilbi | create |