classification
Title: SMTPHandler in the logging module does not handle unicode strings
Type: behavior Stage: resolved
Components: Unicode Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: cjw296, norbidur, r.david.murray, simon04, vinay.sajip
Priority: normal Keywords:

Created on 2010-07-08 22:16 by norbidur, last changed 2015-10-13 13:38 by r.david.murray. This issue is now closed.

Messages (7)
msg109621 - (view) Author: norbidur (norbidur) Date: 2010-07-08 22:16
SMTPHandler fails when receiving unicode strings.

example : 
import logging
import logging.handlers
smtpHandler = logging.handlers.SMTPHandler(
    mailhost=("smtp.free.fr",25),
    fromaddr="from@free.fr", toaddrs="to@free.fr",
    subject=u"error message")
LOG = logging.getLogger()
LOG.addHandler(smtpHandler)
LOG.error(u"accentu\u00E9")

-> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 108 : ordinal not in range(128)

There has been a discussion of this at http://groups.google.com/group/comp.lang.python/browse_thread/thread/759df42f9374d1b6/05ad55c388c746e3?lnk=raot&pli=1

FileHandler behaves differently: for that family of handlers an encoding can be specified, and if encoding with it fails, there is a fallback to UTF-8.
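
For comparison, here is a minimal sketch of the FileHandler side of that contrast. It targets Python 3 and only shows that the handler accepts an encoding argument; it does not reproduce the 2.x fallback behaviour described above. The logger name and the temporary file path are made up for the example.

```python
import logging
import os
import tempfile

# Hypothetical log file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "app.log")

# FileHandler takes an explicit encoding; SMTPHandler has no such argument.
handler = logging.FileHandler(path, encoding="utf-8")
logger = logging.getLogger("filehandler_demo")  # illustrative logger name
logger.propagate = False
logger.addHandler(handler)

logger.error("accentu\u00e9")

# Flush and detach the handler before reading the file back.
handler.close()
logger.removeHandler(handler)

with open(path, encoding="utf-8") as fh:
    logged = fh.read()
```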
msg114691 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2010-08-22 18:34
SMTPHandler provides an implementation for the simplest/most common case. Full support for encoding in emails is likely to be application-specific, i.e. no one-size-fits-all can be easily specified. For example, different encodings could be used for headers, subject and body - say, quoted-printable for the body and base64 for the subject. Unfortunately, support for quoted-printable requires a global state change, see for instance

http://radix.twistedmatrix.com/2010/07/how-to-send-good-unicode-email-with.html

so it seems not to be a good idea to implement in the logging package itself.

I would suggest implementing a handler which subclasses SMTPHandler and does the appropriate formatting as per (for example) the above post. To facilitate this, I could add two methods to SMTPHandler:

class SMTPHandler(logging.Handler):
    def prepareEmail(self, record):
        """
        Prepare a record for emailing, including setting up mail
        headers and doing all necessary encodings. Return a value
        suitable for passing to the sendMail method.

        The default implementation will assume all-ASCII content
        for headers and body, do no special encoding, and return a
        string.
        """

    def sendMail(self, smtp, msg):
        """
        Send a message via the provided SMTP instance.

        The default implementation will call the SMTP instance's
        sendmail() method, passing the result from the prepareEmail
        method.
        """

I'm not yet sure if this would meet all use cases.

Marking as pending, awaiting feedback. Will close in a couple of weeks if no feedback received.
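
A rough sketch of how a subclass might use the two proposed hooks, written for Python 3. The prepareEmail and sendMail methods are the hypothetical names from the proposal above and are not part of the standard SMTPHandler, so this subclass overrides emit() to call them itself. The class name is made up, and credentials and TLS handling are left out to keep it short.

```python
import logging
import logging.handlers
import smtplib
from email.header import Header
from email.mime.text import MIMEText
from email.utils import formatdate


class HookedSMTPHandler(logging.handlers.SMTPHandler):
    """Illustrative subclass; the two hook methods are not stdlib API."""

    def prepareEmail(self, record):
        # UTF-8 body with a transfer encoding, plus an RFC 2047 subject,
        # so the returned string contains only ASCII characters.
        msg = MIMEText(self.format(record), "plain", "utf-8")
        msg["Subject"] = Header(self.getSubject(record), "utf-8")
        msg["From"] = self.fromaddr
        msg["To"] = ",".join(self.toaddrs)
        msg["Date"] = formatdate()
        return msg.as_string()

    def sendMail(self, smtp, msg):
        # Send the prepared text through the given smtplib.SMTP instance.
        smtp.sendmail(self.fromaddr, self.toaddrs, msg)

    def emit(self, record):
        # Simplified emit: no login and no STARTTLS (an assumption made
        # for brevity, unlike the stdlib implementation).
        try:
            port = self.mailport or smtplib.SMTP_PORT
            smtp = smtplib.SMTP(self.mailhost, port)
            try:
                self.sendMail(smtp, self.prepareEmail(record))
            finally:
                smtp.quit()
        except Exception:
            self.handleError(record)


# Preparing a message needs no network connection.
demo_handler = HookedSMTPHandler(
    ("localhost", 25), "from@example.com", "to@example.com", "error message")
demo_record = logging.makeLogRecord(
    {"msg": "accentu\u00e9", "levelno": logging.ERROR, "levelname": "ERROR"})
prepared = demo_handler.prepareEmail(demo_record)
```

An application that wants different encodings for headers and body would only need to override prepareEmail in this arrangement.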
msg114924 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-25 18:11
Given Vinay's last comment I don't think this needs addressing in 2.x, and it is not a problem in 3.x.
msg151524 - (view) Author: Chris Withers (cjw296) * (Python committer) Date: 2012-01-18 08:45
Just as a postscript to this, the email handlers for the Python logging framework that I maintain as a package on PyPI now handle unicode email correctly:

http://pypi.python.org/pypi/mailinglogger/3.7.0

I'd suggest people looking for fully-featured email log handlers use mailinglogger...
msg252928 - (view) Author: (simon04) * Date: 2015-10-13 13:09
I don't see why/how this should be fixed in Python 3.

Using the example from msg109621 and Python 3.5.0, I get:
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.5/logging/handlers.py", line 985, in emit
    smtp.sendmail(self.fromaddr, self.toaddrs, msg)
  File "/usr/lib/python3.5/smtplib.py", line 846, in sendmail
    msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 108: ordinal not in range(128)
Call stack:
  File "/tmp/x.py", line 8, in <module>
    LOG.error(u"accentu\u00E9")
Message: 'accentué'
Arguments: ()

The problem is that an SMTP message is constructed and non-ASCII characters are not escaped in SMTPHandler.emit. A robust fix would be to use email.mime.text.MIMEText instead:

msg = MIMEText(msg)
msg['Subject'] = self.getSubject(record)
msg['From'] = self.fromaddr
msg['To'] = ",".join(self.toaddrs)
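
To see the difference in isolation, the sketch below (Python 3, no SMTP server involved) first repeats the ASCII encoding step that fails in the traceback, then builds the same content with MIMEText and serializes it. The addresses are placeholders, and the raw message string is a simplified stand-in for what SMTPHandler.emit assembles.

```python
from email.mime.text import MIMEText

body = "accentu\u00e9"

# The failing step from the traceback, reduced to its core: smtplib
# encodes a plain str message as ASCII before sending it.
try:
    ("Subject: error message\r\n\r\n" + body).encode("ascii")
    raw_is_ascii_safe = True
except UnicodeEncodeError:
    raw_is_ascii_safe = False

# MIMEText selects utf-8 for a non-ASCII body and applies a transfer
# encoding, so the serialized message contains only ASCII characters.
msg = MIMEText(body)
msg["Subject"] = "error message"
msg["From"] = "from@example.com"          # placeholder address
msg["To"] = ",".join(["to@example.com"])  # placeholder address

wire_text = msg.as_string()
wire_text.encode("ascii")  # does not raise
```

The string in wire_text is what would be passed to smtp.sendmail() in place of the hand-built message.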
msg252931 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-13 13:36
In 3.4/3.5 a better fix would be to use EmailMessage instead of Message, and smtp.send_message instead of smtp.sendmail.  That will do the right thing, where "the right thing" is defined as defaulting to utf-8 for both headers and body.  A specific application might want other defaults, in which case one should proceed as Vinay suggests.

You could open a new issue requesting the use of EmailMessage/send_message in python3 logging if you want to pursue this.  Note that this would also allow unicode in the display name part of the addresses.
msg252932 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-13 13:38
To clarify: it will default to utf-8 encoded for email transport.  (Since email now supports SMTPUTF8, what I said could have meant defaulting to that, which it does *not* do.)
History
Date                 User            Action  Args
2015-10-13 13:38:30  r.david.murray  set     messages: + msg252932
2015-10-13 13:36:24  r.david.murray  set     messages: + msg252931
2015-10-13 13:09:10  simon04         set     nosy: + simon04; messages: + msg252928
2012-01-18 08:45:34  cjw296          set     nosy: + cjw296; messages: + msg151524
2010-08-25 18:11:42  r.david.murray  set     status: pending -> closed; versions: + Python 2.7, - Python 3.2; nosy: + r.david.murray; messages: + msg114924; resolution: out of date; stage: resolved
2010-08-22 18:34:06  vinay.sajip     set     status: open -> pending; messages: + msg114691
2010-07-09 09:49:29  orsenthil       set     title: SMTPHandler does not handle unicode strings -> SMTPHandler in the logging module does not handle unicode strings
2010-07-09 09:41:38  pitrou          set     assignee: vinay.sajip; nosy: + vinay.sajip; versions: + Python 3.2, - Python 2.6, Python 2.5
2010-07-08 22:16:01  norbidur        create