classification
Title: smtplib.send_message does not implement corectly rfc 2822
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: Nicolas.Estibals, jcea, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2011-05-22 07:54 by Nicolas.Estibals, last changed 2011-07-04 07:02 by Nicolas.Estibals. This issue is now closed.

Files
File name Uploaded Description Edit
send_message_rfc2822.patch Nicolas.Estibals, 2011-05-22 07:54 review
send_message_rfc2822_v2.patch Nicolas.Estibals, 2011-06-28 20:39 review
Messages (11)
msg136508 - (view) Author: Nicolas Estibals (Nicolas.Estibals) Date: 2011-05-22 07:54
smtplib.send_message permits to send messages that are in python Message representation by selecting smtp's "from" and "to" in the message headers. Most of the time the implementation is correct but if the message is bounced (Resent-* headers have to be considered) or if there is a Sender field (this one supersede the From field if present), the current implementation does not select the right addresses (not as specified in rfc 2822).

I have wrote a patch to make the method compliant. This is my first patch to python but I followed the "Lifecycle of a Patch" webpage and hope my code will be usable. Please ask me if some revision of the code is needed.
msg136583 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-05-23 02:52
Thanks for the patch.  Adding support for this is a good idea.  There could be a question about whether this constitutes a bug fix or a feature; I'll canvass some other devs and see if we have a consensus on it.

It may take me a bit to find time to do a detailed review, but in broad outline the patch looks sensible and has all the requisite parts (thanks!).
msg138593 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-06-18 19:57
I think we can treat this as a bug.  However...

What if there is more than one set of Resent- headers?  I think that it is not possible to guarantee we only look at the most recent set, since the RFC provides no way to identify a "set".  Heuristically we could probably guess right 99% of the time, but we'd still be guessing.  Further, the RFC actually prohibits using the Resent- headers for "automated processing", though, granted, it is talking about the receiving end, not the sending end.

My inclination at this point is to fix the Sender bug, and to refuse to guess when there are any Resent- headers (throw a ValueError).  I've queried the email-sig to see what they think.
msg138740 - (view) Author: Nicolas Estibals (Nicolas.Estibals) Date: 2011-06-20 17:02
Hi,

Treating this as a bug is a good news, if we don't user of the function will ask for python 3.3

I also think the part concerning the Sender header is pretty clear and we can fix it easily.

About the Resent-* fields, I'm not sure of the right thing to do. But I haven't found the mention of no automatic processing for them but I found that RFC 2822 specify more exactly how to use them.

Contrary to the other fields, they have to be in block and the more recent block have to be at the beginning of the mail, moreover they must not be reordered during transfer. Thus I think we have to consider the first block of Resent-* fields if present. (cf. RFC 2822 third paragraph in section 3.6 and appendix A.3) However perhaps we have to wait for an answer from email-sig.

I have one more concern about the send_mesage method: if the Bcc field is present this one is deleted, thus we lose information if we copy it in a sent directory for instance. What do you think about the idea that send_message method should not modify the message ? (The sent message should get rid of the Bcc header but not the one the user keep after using the method.)

Best regards.
msg138742 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-06-20 17:22
Section 3.6.6 says:

   Resent fields are strictly informational.  They MUST NOT be used in
   the normal processing of replies or other such automatic actions on
   messages.

Further, since there is no specified order for the headers within a block, there is no deterministic algorithmic way to determine where one block ends and the next begins.  A human (or a well thought out set of heuristics) can almost always figure it out, but it isn't guaranteed to be non-ambiguous.

The conclusion on the email-sig is that we should do the right thing when it is unambiguous (no or only one set of Resent- headers), and throw a ValueError if there are two copies of any Resent- header (refuse to guess).  For 3.3 we could implement heuristics and provide an option to turn them on, but that is an API change and so can't go into the 3.2 fix.

You have a good point about the method mutating the object passed to it.  This will probably come as a surprise even if documented, so it is indeed probably better to ensure that the object is unchanged after the call.  This can be done by mutating and restoring the object, but that would (I presume) not be thread safe.  Better would be a generator option to skip bcc fields, but that again is an API change.  I suppose that making a shallow copy of the Message object will be safe and not too inefficient.  Hopefully it will work :)

There are unresolved release blockers for the 3.2.1 RC, so we have at least a week to get this fixed and still make 3.2.1.  Do you want to update your patch?
msg138743 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-06-20 17:25
Note that that RFC language is clearly directed at automatic processing on *receipt*, not during sending.  The RFC doesn't address automatic processing during sending, it leaves that the to the SMTP RFC.
msg138745 - (view) Author: Nicolas Estibals (Nicolas.Estibals) Date: 2011-06-20 18:30
I wasn't aware of the problem of guessing which are the correct Resent-* field, tthis does not seem to be that easy, however taking only the first one should be a good heuristic for next release.

I think we now agree on the "automatic processing" part, this only for the receipt part and section 3.6.6 mostly means that From, To, ... fields schould be considered (not the Resent-* fields) while sending an answer for example. For our case it schould be ok to use them as long as ther is no ambiguity (only one Resent-* block).

I'm ok for modifying my patch to reflect our discussion. You'll heard some news from it very soon. I'll also add the patch for not mutating the original message.

After 3.2.1, I'll work on implementing the few api change you discuss this will make the code very usable.
msg139375 - (view) Author: Nicolas Estibals (Nicolas.Estibals) Date: 2011-06-28 20:39
Sorry for the late, my week-end was more busy than expected. Here is the corrected version of the patch.
msg139666 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-03 01:14
New changeset 0f5ea42fb46c by R David Murray in branch '3.2':
#12147: make send_message correctly handle Sender and Resent- headers.
http://hg.python.org/cpython/rev/0f5ea42fb46c

New changeset b8cec4f3faaa by R David Murray in branch 'default':
merge #12147: make send_message correctly handle Sender and Resent- headers.
http://hg.python.org/cpython/rev/b8cec4f3faaa
msg139667 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-03 01:16
I tweaked the patch a bit (mostly style/cosmetic) and added some additional tests.  Thanks, Nicolas!
msg139741 - (view) Author: Nicolas Estibals (Nicolas.Estibals) Date: 2011-07-04 07:02
Thanks for your help and the interesting discussion with this issue.
History
Date User Action Args
2011-07-04 07:02:38Nicolas.Estibalssetmessages: + msg139741
2011-07-03 01:16:24r.david.murraysetstatus: open -> closed
resolution: fixed
messages: + msg139667

stage: patch review -> resolved
2011-07-03 01:14:31python-devsetnosy: + python-dev
messages: + msg139666
2011-06-28 20:39:17Nicolas.Estibalssetfiles: + send_message_rfc2822_v2.patch

messages: + msg139375
2011-06-20 18:30:46Nicolas.Estibalssetmessages: + msg138745
2011-06-20 17:25:25r.david.murraysetmessages: + msg138743
2011-06-20 17:22:38r.david.murraysetmessages: + msg138742
2011-06-20 17:02:19Nicolas.Estibalssetmessages: + msg138740
2011-06-18 19:57:59r.david.murraysetmessages: + msg138593
versions: + Python 3.3
2011-05-23 02:52:42r.david.murraysetassignee: r.david.murray
2011-05-23 02:52:26r.david.murraysetnosy: + r.david.murray

messages: + msg136583
stage: patch review
2011-05-23 00:19:48jceasetnosy: + jcea
2011-05-22 07:54:41Nicolas.Estibalscreate