SMTPHandler in the logging module does not handle unicode strings #53454

Closed
norbidur mannequin opened this issue Jul 8, 2010 · 7 comments
Assignees
Labels
topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@norbidur
Mannequin

norbidur mannequin commented Jul 8, 2010

BPO 9208
Nosy @vsajip, @cjw296, @bitdancer, @simon04

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/vsajip'
closed_at = <Date 2010-08-25.18:11:42.436>
created_at = <Date 2010-07-08.22:16:01.639>
labels = ['type-bug', 'expert-unicode']
title = 'SMTPHandler in the logging module does not handle unicode strings'
updated_at = <Date 2015-10-13.13:38:30.116>
user = 'https://bugs.python.org/norbidur'

bugs.python.org fields:

activity = <Date 2015-10-13.13:38:30.116>
actor = 'r.david.murray'
assignee = 'vinay.sajip'
closed = True
closed_date = <Date 2010-08-25.18:11:42.436>
closer = 'r.david.murray'
components = ['Unicode']
creation = <Date 2010-07-08.22:16:01.639>
creator = 'norbidur'
dependencies = []
files = []
hgrepos = []
issue_num = 9208
keywords = []
message_count = 7.0
messages = ['109621', '114691', '114924', '151524', '252928', '252931', '252932']
nosy_count = 5.0
nosy_names = ['vinay.sajip', 'norbidur', 'cjw296', 'r.david.murray', 'simon04']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue9208'
versions = ['Python 2.7']

SMTPHandler fails when receiving unicode strings.

Example:

import logging, logging.handlers
smtpHandler = logging.handlers.SMTPHandler(
    mailhost=("smtp.free.fr", 25),
    fromaddr="from@free.fr", toaddrs="to@free.fr",
    subject=u"error message")
LOG = logging.getLogger()
LOG.addHandler(smtpHandler)
LOG.error(u"accentu\u00E9")

-> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 108 : ordinal not in range(128)

There has been a discussion of this at http://groups.google.com/group/comp.lang.python/browse_thread/thread/759df42f9374d1b6/05ad55c388c746e3?lnk=raot&pli=1

FileHandler behaves differently: for that family of handlers an encoding can be specified, and if encoding with it fails, the handler falls back to UTF-8.
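For comparison, here is a minimal sketch of the FileHandler side of that contrast, written for Python 3. It only shows the `encoding` parameter being honoured. It does not reproduce the UTF-8 fallback described above. The logger name `filehandler_demo` and the helper `log_to_file` are made up for illustration.

```python
import logging
import os
import tempfile


def log_to_file(message, encoding="utf-8"):
    """Log one message through a FileHandler and return the file's text."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    # FileHandler accepts an explicit encoding for the file it opens.
    handler = logging.FileHandler(path, encoding=encoding)
    logger = logging.getLogger("filehandler_demo")  # hypothetical name
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.error(message)
    finally:
        logger.removeHandler(handler)
        handler.close()
    with open(path, encoding=encoding) as f:
        text = f.read()
    os.remove(path)
    return text


written = log_to_file("accentu\u00e9")
```

SMTPHandler has no equivalent parameter, which is why the same message fails there.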

@norbidur norbidur mannequin added topic-unicode type-bug An unexpected behavior, bug, or error labels Jul 8, 2010
@orsenthil orsenthil changed the title SMTPHandler does not handle unicode strings SMTPHandler in the logging module does not handle unicode strings Jul 9, 2010
@vsajip
Member

vsajip commented Aug 22, 2010

SMTPHandler provides an implementation for the simplest/most common case. Full support for encoding in emails is likely to be application-specific, i.e. no one-size-fits-all can be easily specified. For example, different encodings could be used for headers, subject and body - say, quoted-printable for the body and base64 for the subject. Unfortunately, support for quoted-printable requires a global state change, see for instance

http://radix.twistedmatrix.com/2010/07/how-to-send-good-unicode-email-with.html

so it seems not to be a good idea to implement in the logging package itself.

I would suggest implementing a handler which subclasses SMTPHandler and does the appropriate formatting as per (for example) the above post. To facilitate this, I could add two methods to SMTPHandler:

class SMTPHandler(logging.Handler):
    def prepareEmail(self, record):
        """
        Prepare a record for emailing, including setting up mail
        headers and doing all necessary encodings. Return a value
        suitable for passing to the sendMail method.

        The default implementation will assume all-ASCII content
        for headers and body, do no special encoding, and return a
        string.
        """

    def sendMail(self, smtp, msg):
        """
        Send a message via the provided SMTP instance.

        The default implementation will call smtp.sendmail(),
        passing the result from the prepareEmail method.
        """

I'm not yet sure if this would meet all use cases.
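To make the proposal concrete, here is a rough sketch of how a subclass might use such a split. It is written for Python 3 and rests on several assumptions. `prepareEmail` and `sendMail` are only the names proposed above and are not part of the standard library, so this sketch defines them itself and overrides `emit` to call them. The class name `UnicodeSMTPHandler`, the choice of MIMEText with UTF-8, and the example addresses are illustrative. Credentials and TLS are left out.

```python
import logging
import logging.handlers
import smtplib
from email.mime.text import MIMEText


class UnicodeSMTPHandler(logging.handlers.SMTPHandler):
    """Hypothetical subclass using the proposed prepareEmail/sendMail split."""

    def prepareEmail(self, record):
        # Encode the formatted record as a UTF-8 MIME body; the
        # flattened message is then pure ASCII on the wire.
        msg = MIMEText(self.format(record), _charset="utf-8")
        msg["Subject"] = self.getSubject(record)
        msg["From"] = self.fromaddr
        msg["To"] = ",".join(self.toaddrs)
        return msg.as_string()

    def sendMail(self, smtp, msg):
        smtp.sendmail(self.fromaddr, self.toaddrs, msg)

    def emit(self, record):
        # Simplified: no login or STARTTLS, unlike the stock handler.
        try:
            msg = self.prepareEmail(record)
            port = self.mailport or smtplib.SMTP_PORT
            smtp = smtplib.SMTP(self.mailhost, port)
            try:
                self.sendMail(smtp, msg)
            finally:
                smtp.quit()
        except Exception:
            self.handleError(record)


# Build a message without contacting any server.
handler = UnicodeSMTPHandler(("localhost", 25), "from@example.com",
                             ["to@example.com"], "error message")
record = logging.LogRecord("demo", logging.ERROR, "demo.py", 1,
                           "accentu\u00e9", None, None)
prepared = handler.prepareEmail(record)
```

An application that wants a different encoding for headers or body would only need to override `prepareEmail`.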

Marking as pending, awaiting feedback. Will close in a couple of weeks if no feedback received.

@bitdancer
Member

Given Vinay's last comment I don't think this needs addressing in 2.x, and it is not a problem in 3.x.

@cjw296
Contributor

cjw296 commented Jan 18, 2012

Just as a postscript to this, the email handlers for the python logging framework that I maintain as a package on PyPI now handle unicode email correctly:

http://pypi.python.org/pypi/mailinglogger/3.7.0

I'd suggest people looking for fully-featured email log handlers use mailinglogger...

@simon04
Mannequin

simon04 mannequin commented Oct 13, 2015

I don't see why/how this should be fixed in Python 3.

Using the example from msg109621 and Python 3.5.0, I get:
--- Logging error ---

Traceback (most recent call last):
  File "/usr/lib/python3.5/logging/handlers.py", line 985, in emit
    smtp.sendmail(self.fromaddr, self.toaddrs, msg)
  File "/usr/lib/python3.5/smtplib.py", line 846, in sendmail
    msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 108: ordinal not in range(128)
Call stack:
  File "/tmp/x.py", line 8, in <module>
    LOG.error(u"accentu\u00E9")
Message: 'accentué'
Arguments: ()

The problem is that an SMTP message is constructed and non-ASCII characters are not escaped in SMTPHandler.emit. A robust fix would be to use email.mime.text.MIMEText instead:

from email.mime.text import MIMEText

msg = MIMEText(msg)
msg['Subject'] = self.getSubject(record)
msg['From'] = self.fromaddr
msg['To'] = ",".join(self.toaddrs)
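The difference can be checked without a mail server. The sketch below is for Python 3 and reuses the addresses from the original report. It compares a hand-built message string, roughly what the handler sends today, with the MIMEText version. The variable names are illustrative.

```python
from email.mime.text import MIMEText

body = "accentu\u00e9"

# Roughly the hand-built message: headers and body joined as one str.
raw = ("From: from@free.fr\r\nTo: to@free.fr\r\n"
       "Subject: error message\r\n\r\n" + body)
try:
    raw.encode("ascii")
    raw_is_ascii = True
except UnicodeEncodeError:
    raw_is_ascii = False

# MIMEText picks utf-8 for non-ASCII text and transfer-encodes the
# body, so the flattened message contains only ASCII characters.
msg = MIMEText(body)
msg["Subject"] = "error message"
msg["From"] = "from@free.fr"
msg["To"] = "to@free.fr"
wire = msg.as_string()
try:
    wire.encode("ascii")
    mime_is_ascii = True
except UnicodeEncodeError:
    mime_is_ascii = False

# The original text is recoverable from the encoded payload.
decoded = msg.get_payload(decode=True).decode("utf-8")
```

The first string is what trips the `.encode('ascii')` call in the traceback above. The second passes through it unchanged.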

@bitdancer
Member

In 3.4/3.5 a better fix would be to use EmailMessage instead of Message, and smtp.send_message instead of smtp.sendmail. That will do the right thing, where "the right thing" is defined as defaulting to utf-8 for both headers and body. A specific application might want other defaults, in which case one should proceed as Vinay suggests.

You could open a new issue requesting the use of EmailMessage/send_message in python3 logging if you want to pursue this. Note that this would also allow unicode in the display name part of the addresses.

@bitdancer
Member

To clarify: it will default to utf-8 encoded for email transport. (Since email now supports SMTPUTF8, what I said could have meant defaulting to that, which it does *not* do.)
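A minimal sketch of the EmailMessage/send_message approach, assuming Python 3.6 or later. The helper names `build_log_email` and `send_log_email` and the example addresses are made up for illustration. The send helper is defined but never called, so no server is contacted.

```python
import smtplib
from email.message import EmailMessage


def build_log_email(body, subject, fromaddr, toaddrs):
    """Build an EmailMessage; headers and body may contain non-ASCII text."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg.set_content(body)  # charset defaults to utf-8
    return msg


def send_log_email(msg, host="localhost", port=25):
    """Send via SMTP.send_message; host and port are placeholders."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


msg = build_log_email("accentu\u00e9", "erreur accentu\u00e9e",
                      "from@example.com", ["to@example.com"])
raw = msg.as_bytes()
```

In the flattened bytes the non-ASCII subject appears as an RFC 2047 encoded word, and the body is labelled with the utf-8 charset. That matches the "utf-8 encoded for email transport" default described above.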

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022