classification
Title: smtplib can't send 8bit encoded utf-8 message
Type: behavior Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: airween, r.david.murray
Priority: normal Keywords:

Created on 2015-11-26 07:42 by airween, last changed 2015-11-26 17:54 by airween. This issue is now closed.

Messages (5)
msg255401 - (view) Author: Ervin Hegedüs (airween) Date: 2015-11-26 07:46
It looks like smtplib can only send messages that contain nothing but 7-bit (ASCII) characters. Here is an example:

# -*- coding: utf8 -*-

import time
import smtplib

mailfrom = "my@mydomain.com"
rcptto = "me@otherdomain.com"

msg = """%s
From: Me <%s>
To: %s
Subject: Plain text e-mail
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

happy New Year

Ευτυχισμένο το Νέο Έτος

明けましておめでとうございます

с Новым годом
""" % (time.strftime('%a, %d %b %Y %H:%M:%S +0100', time.localtime()), mailfrom, rcptto)

server = smtplib.SMTP('localhost')
server.sendmail(mailfrom, rcptto, msg)
server.quit()


With Python 2 (Python 2.7), this script finishes successfully. With Python 3 (Python 3.4), I get this exception:

Traceback (most recent call last):
  File "8bittest.py", line 28, in <module>
    server.sendmail(mailfrom, rcptto, msg)
  File "/usr/lib/python3.4/smtplib.py", line 765, in sendmail
    msg = _fix_eols(msg).encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 261-271: ordinal not in range(128)


Basically, I don't understand why smtplib allows only ASCII-encoded messages in Python 3. That worked (and still works) in Python 2, and I think that is the correct behavior.
msg255403 - (view) Author: Ervin Hegedüs (airween) Date: 2015-11-26 08:44
Here is a workaround:

server.sendmail(mailfrom, rcptto, msg.encode("utf8"))

Maybe this would be better done inside smtplib?
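A minimal sketch of why the workaround helps, runnable without a mail server. It assumes, as the traceback above suggests, that sendmail() ASCII-encodes the message only when it is a str and leaves bytes alone; to_wire() is a hypothetical stand-in for that step, not smtplib's actual code.

```python
body = "с Новым годом"


def to_wire(msg):
    """Hypothetical stand-in for the str-to-bytes step seen in the traceback."""
    if isinstance(msg, str):
        # Same conversion as in the traceback: fails on non-ASCII text.
        return msg.encode("ascii")
    # Bytes are assumed to pass through unchanged.
    return msg


try:
    to_wire(body)
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The workaround: encode explicitly, so the helper receives bytes.
wire = to_wire(body.encode("utf8"))
```

Here ascii_ok ends up False for the str input, while the pre-encoded bytes come back untouched.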
msg255425 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-26 17:05
Although that will work for text-only messages if you know what the RFC format looks like, you really don't want to do that in the general case, since you can't express messages that have binary non-text content using unicode.  What you want to do is prepare your message in correct RFC form, which is what the email library is for.  With the new API (provisional now, but any changes will be minor when it becomes final in 3.6), this is even easy (see the 'contentmanager' docs).  Then you call smtplib.SMTP.send_message, and the encoding to RFC format is taken care of for you.  In 3.5 it even supports SMTPUTF8, if you know any servers that do :)
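A minimal sketch of that approach, reusing the addresses and body from the original report. The send() helper and its host parameter are hypothetical and assume an SMTP server listening on localhost; it is defined here but not called, so building the message needs no network.

```python
import smtplib
from email.message import EmailMessage

text = (
    "happy New Year\n\n"
    "Ευτυχισμένο το Νέο Έτος\n\n"
    "明けましておめでとうございます\n\n"
    "с Новым годом\n"
)

msg = EmailMessage()
msg["From"] = "Me <my@mydomain.com>"
msg["To"] = "me@otherdomain.com"
msg["Subject"] = "Plain text e-mail"
# set_content() adds the MIME headers and picks a transfer encoding,
# so they do not have to be written by hand.
msg.set_content(text)

# Serialized form of the message, as bytes rather than str.
raw = msg.as_bytes()


def send(message, host="localhost"):
    """Hypothetical helper: hand the message object to a local SMTP server."""
    with smtplib.SMTP(host) as server:
        server.send_message(message)
```

Calling send(msg) would take the sender and recipient from the message headers, so they do not need to be passed separately.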
msg255426 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-26 17:06
Oh, and as for why this worked in Python 2: in Python 2, strings were binary, not unicode, so the non-ASCII stuff was already in bytes form.
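The difference can be illustrated in Python 3, where text and bytes are separate types. This is only a sketch; the sample string is taken from the message body in the report.

```python
text = "Ευτυχισμένο το Νέο Έτος"   # str: a sequence of Unicode code points
data = text.encode("utf-8")        # bytes: what actually goes over the wire

# A Python 2 str literal was already in the second form, so nothing had
# to be encoded before sending; in Python 3 the step is explicit.
same_text = data.decode("utf-8")
```

The encoded form is longer than the text because each Greek letter takes more than one byte in UTF-8.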
msg255433 - (view) Author: Ervin Hegedüs (airween) Date: 2015-11-26 17:54
David,

many thanks for your information.

I think my e-mail format was correct - I've copied it from a maildir, as an "email file".

As I wrote, there is a solution: before the code passes the 'msg' argument to the sendmail() function, it needs to encode() it as "utf-8"; then it will be a byte string instead of unicode (which is the default type of any string in Python 3). I have realized that in the meantime :).

Thanks again, and sorry for my mistake.
History
Date                 User            Action  Args
2015-11-26 17:54:14  airween         set     messages: + msg255433
2015-11-26 17:06:32  r.david.murray  set     messages: + msg255426
2015-11-26 17:05:13  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg255425; resolution: not a bug; stage: resolved
2015-11-26 08:44:38  airween         set     messages: + msg255403
2015-11-26 07:46:32  airween         set     messages: + msg255401
2015-11-26 07:42:20  airween         create