classification
Title: regression from 2.6: smtplib.py requiring ascii for sending messages
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.0, Python 3.1
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: ajaksu2, ccgus, loewis, r.david.murray
Priority: normal Keywords:

Created on 2008-11-24 03:59 by ccgus, last changed 2010-11-05 09:12 by r.david.murray. This issue is now closed.

Messages (7)
msg76295 - Author: August Mueller (ccgus) Date: 2008-11-24 03:59
smtplib requires that messages being sent be in ASCII, and raises an exception otherwise.
Python 2.6 doesn't require this.  Here's the diff where it was introduced:
http://svn.python.org/view/python/branches/py3k/Lib/smtplib.py?rev=59102&r1=58495&r2=59102

Is there a good reason for this?  I use python for a webstore, and send out emails for folks 
with multibyte names (for instance, if a name has an umlaut).

Here's a code snippet + exception:

Python 3.0rc3 (r30rc3:67312, Nov 22 2008, 18:45:57) 
[GCC 3.4.6 [FreeBSD] 20060305] on freebsd6
Type "help", "copyright", "credits" or "license" for more information.
>>> import smtplib
>>> server = smtplib.SMTP("localhost")
>>> server.sendmail("gus@flyingmeat.com", "gus@flyingmeat.com", "Ümlaut")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 713, in sendmail
    (code,resp) = self.data(msg)
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 481, in data
    self.send(q)
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 305, in send
    s = s.encode("ascii")
UnicodeEncodeError: 'ascii' codec can't encode character '\xdc' in position 0: ordinal not in range(128)

Is there a workaround or a new way of using it?  I couldn't seem to find it.

Thanks!
msg76301 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2008-11-24 05:50
> Is there a good reason for this?

Most definitely. In Python 2.x, the string literal denotes
a byte string, whereas in 3.x, it is a character string.
It's not possible to send a character string directly over
the network; try encoding it.

It might be considered a bug that sendmail accepts a string
at all as long as the string only consists of ASCII characters;
it should reject such strings as well.
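
The distinction between character strings and byte strings can be seen without any mail server. The following is a minimal sketch (the variable names are illustrative only) showing that the same text has different byte representations depending on the codec, and that the ASCII codec rejects it, which is the error smtplib raised above:

```python
# A str holds Unicode code points; a bytes object holds the octets
# that actually travel over the network.
text = "\u00dcmlaut"  # "Ümlaut", written with an escape to stay source-encoding neutral

utf8_bytes = text.encode("utf-8")      # U+00DC becomes two bytes: b'\xc3\x9c'
latin1_bytes = text.encode("latin-1")  # U+00DC becomes one byte:  b'\xdc'

# The ASCII codec has no representation for U+00DC, so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(utf8_bytes, latin1_bytes, ascii_ok)
```

Because the receiver cannot tell which codec was used from the bytes alone, the sender has to pick one and declare it, which is what MIME headers are for.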
msg76303 - Author: August Mueller (ccgus) Date: 2008-11-24 06:26
Encoding the message first doesn't work either:

>>> import smtplib
>>> server = smtplib.SMTP("localhost")
>>> server.sendmail("gus@flyingmeat.com", "gus@flyingmeat.com", "Ümlaut".encode("UTF-8"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 713, in sendmail
    (code,resp) = self.data(msg)
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 477, in data
    q = quotedata(msg)
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/smtplib.py", line 157, in quotedata
    re.sub(r'(?:\r\n|\n|\r(?!\n))', CRLF, data))
  File "/home/mu.org/home/gus/unix/python3/lib/python3.0/re.py", line 165, in sub
    return _compile(pattern, 0).sub(repl, string, count)
TypeError: can't use a string pattern on a bytes-like object

Should data() check the type of the message first, before it attempts string operations on it?
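
The TypeError in the traceback above comes from mixing a str pattern with bytes data in re.sub. The sketch below reproduces that mismatch and shows what a bytes-aware version of the same line-ending normalisation could look like. The names CRLF_BYTES and normalize_eols are made up for illustration; this is not smtplib's actual code.

```python
import re

CRLF_BYTES = b"\r\n"

# Message body already encoded to bytes, with a bare "\n" line ending.
data = "\u00dcmlaut\nline two".encode("utf-8")

# A str pattern applied to bytes data raises TypeError, as in the traceback.
try:
    re.sub(r'(?:\r\n|\n|\r(?!\n))', "\r\n", data)
    mixed_ok = True
except TypeError:
    mixed_ok = False


def normalize_eols(raw):
    """Hypothetical helper: rewrite every line ending in `raw` (bytes) as CRLF."""
    # Pattern and replacement are both bytes, so they match the data type.
    return re.sub(br'(?:\r\n|\n|\r(?!\n))', CRLF_BYTES, raw)


fixed = normalize_eols(data)
print(mixed_ok, fixed)
```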
msg76306 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2008-11-24 06:59
I see. It seems Python 3.0 just won't support that usage, then.

You still should be able to use MIME to send non-ASCII characters.
msg76425 - Author: August Mueller (ccgus) Date: 2008-11-25 20:35
For completeness, if anyone runs across this in the future, the following seems to work for sending UTF-8 mail in Python 3:

import smtplib
import email.mime.text

msg = email.mime.text.MIMEText("Ümlaut", _charset="UTF-8")

smtp = smtplib.SMTP('localhost')
smtp.sendmail('gus@flyingmeat.com', 'gus@flyingmeat.com', "Subject: This is your mail\n" + msg.as_string())
smtp.quit()
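
A variation on the workaround above, offered as a sketch rather than the recommended form: the headers can be set on the MIMEText object itself instead of being prepended as a string. The reason the workaround succeeds is that the email package applies a transfer encoding to the body, so the string returned by as_string() is pure ASCII and passes smtplib's check. The wire_text and wire_bytes names are illustrative only.

```python
from email.mime.text import MIMEText

# Body is non-ASCII; the charset tells the email package how to encode it.
msg = MIMEText("\u00dcmlaut", _charset="UTF-8")  # "Ümlaut"

# Set headers on the message object rather than concatenating strings.
msg["Subject"] = "This is your mail"
msg["From"] = "gus@flyingmeat.com"
msg["To"] = "gus@flyingmeat.com"

# The serialised message contains only ASCII characters, because the
# body has been transfer-encoded by the email package.
wire_text = msg.as_string()
wire_bytes = wire_text.encode("ascii")  # does not raise

print(wire_text)
```

The resulting wire_text can be passed as the third argument to sendmail in the same way as in the snippet above.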
msg86592 - Author: Daniel Diniz (ajaksu2) Date: 2009-04-26 02:18
We might want to add the workaround to the docs, or even extend smtplib to
support it.
msg120478 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2010-11-05 09:12
I'm closing this issue as invalid since, as Martin pointed out, you can't send unicode over the wire.

However, see issue 10321 where I've attached a patch that adds support for sending binary data as a by-product of adding support for Message objects.

From Martin's msg76301 I gather he initially expected sendmail to take only binary data, which it seems to me would probably be the better API.  However, at this point we are stuck with supporting ASCII-only strings in the API as well.

As a further note, however, the original example in this issue would have produced a non-RFC-conformant message when used in Python 2.x.  You can't just send 8bit data willy-nilly; there are rules that should be followed.  That is why the issue 10321 patch adds support for using Message objects, since the email package knows what those rules are.
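
For readers on Python 3.2 or later, Message-object support is available as smtplib.SMTP.send_message. The sketch below shows one way it might be used. The helper names build_message and send, and the "localhost" default, are assumptions for illustration; EmailMessage requires a more recent Python 3 than send_message itself does.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Hypothetical helper: build a message and let the email package pick encodings."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # set_content chooses a charset (UTF-8 by default) and a transfer
    # encoding, so the caller never hands raw non-ASCII text to smtplib.
    msg.set_content(body)
    return msg


def send(msg, host="localhost"):
    """Hypothetical helper: deliver `msg` through the SMTP server at `host`."""
    with smtplib.SMTP(host) as server:
        # Envelope sender and recipients are taken from the message headers.
        server.send_message(msg)


message = build_message(
    "gus@flyingmeat.com", "gus@flyingmeat.com", "This is your mail", "\u00dcmlaut"
)
# send(message)  # requires a reachable SMTP server, so it is left commented out
```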
History
Date                 User            Action  Args
2010-11-05 09:12:26  r.david.murray  set     messages: - msg120476
2010-11-05 09:12:19  r.david.murray  set     messages: + msg120478
2010-11-05 09:08:30  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg120476; resolution: not a bug; stage: test needed -> resolved
2009-04-26 02:18:29  ajaksu2         set     priority: normal; type: behavior; versions: + Python 3.1; nosy: + ajaksu2; messages: + msg86592; stage: test needed
2008-11-25 20:35:08  ccgus           set     messages: + msg76425
2008-11-24 06:59:20  loewis          set     messages: + msg76306
2008-11-24 06:26:30  ccgus           set     messages: + msg76303
2008-11-24 05:50:13  loewis          set     nosy: + loewis; messages: + msg76301
2008-11-24 03:59:03  ccgus           create