classification
Title: Inconsistent newline handling in email module
Type: behavior Stage: needs patch
Components: email Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ajaksu2, barry, davbo, iko, r.david.murray
Priority: normal Keywords:

Created on 2004-06-18 12:50 by iko, last changed 2012-05-28 18:40 by r.david.murray.

Files
File name Uploaded Description
Charset.diff iko, 2004-06-30 13:32 Patch to change email.Charset as described
crlf-body-encode.patch r.david.murray, 2011-04-05 13:18
base_encode_tests.patch davbo, 2011-06-25 16:24 Base encode patch tests
crlf_base64_body_encode.patch r.david.murray, 2012-05-28 18:40
Messages (5)
msg21213 - Author: Anders Hammarquist (iko) Date: 2004-06-18 12:50
text/* parts of email messages must use \r\n as the
newline separator. For unencoded messages, smtplib and
friends take care of the translation from \n to \r\n in
the SMTP processing.

Parts which are unencoded (i.e. 7bit character sets)
MUST use \n line endings, or smtplib will translate them to
\r\r\n.

Parts that get encoded using quoted-printable can use
either, because the qp-encoder assumes input data is
text and reencodes with \n.

However, parts which get encoded using base64 are NOT
translated, and so must use \r\n line endings.

This means you have to guess whether your text is going
to get encoded or not (admittedly, usually not that
hard), and translate the line endings appropriately
before generating a Message instance.
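
The guessing described above can be sketched roughly as follows. This is only an illustration of the workaround, not code from the email package: normalize_newlines is a hypothetical helper, and the rule it applies (CRLF for base64 bodies, LF otherwise) is taken from this report rather than from any documented API contract.

```python
from email.charset import BASE64, Charset


def normalize_newlines(text, charset_name):
    """Hypothetical helper: pick line endings based on the charset's body encoding."""
    # Collapse every newline style to "\n" first.
    unix = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption from the report: base64 bodies are not translated later,
    # so they need "\r\n" before the Message is built.
    if Charset(charset_name).body_encoding == BASE64:
        return unix.replace("\n", "\r\n")
    # Unencoded and quoted-printable bodies stay on "\n".
    return unix
```

With the stock charset table, "utf-8" maps to base64 and "us-ascii" to no encoding, so the same input text ends up with different line endings depending on the charset.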

I think the fix would be for Charset.encode_body() to
always force the encoder to text mode
(i.e. binary=False), since it seems unlikely to have a
Charset for something which is not text.
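
A minimal sketch of what "text mode" would mean for a base64 body, assuming it amounts to rewriting every line ending as \r\n before encoding. encode_text_body is a hypothetical name and uses only the stdlib base64 module; it is not the Charset code or the attached patch.

```python
import base64
import re


def encode_text_body(text, charset="utf-8"):
    """Hypothetical text-mode encoder: force CRLF, then base64-encode."""
    # Match "\r\n" first so an existing CRLF is not doubled into "\r\r\n".
    crlf_text = re.sub(r"\r\n|\r|\n", "\r\n", text)
    # encodebytes wraps the output at 76 characters per line.
    return base64.encodebytes(crlf_text.encode(charset)).decode("ascii")
```

Decoding the result gives back CRLF line endings regardless of which newline style the caller passed in.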
msg82058 - Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-02-14 13:58
Email sprint candidate.
msg133028 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2011-04-05 13:18
Well, it's two years later, but I did look at this during the sprints at PyCon, though I didn't get as far as posting it then (I only just now rediscovered the patch on my laptop).

Python3 no longer has a "binary" flag on base64mime.encode, so here is a proposed patch for Python3.  I'm not sure if this should be backported or not, but I'm leaning that way.  Theoretically it should be only an improvement, but I can easily imagine unix-only programs unknowingly depending on the previous non-translation of newlines.  Still, since email is about intermachine communication and this clearly makes it more RFC compliant, the change is a legitimate bug fix and the chance of breakage is relatively small.

Tests are still needed.
msg139098 - Author: Dave King (davbo) Date: 2011-06-25 16:24
Added some tests against the patch provided by R. David Murray. See attached patch.

Tests pass against default.
msg161798 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2012-05-28 18:40
I almost applied this patch, but my gut is having second thoughts about it.  I don't think this is the correct solution.  The correct solution would be to delay the encoding of the body part until the message generation phase, and use the requested linesep at that point.  That is, in 3.2 and 3.3 I've changed the paradigm from "always use \n and convert the final string at need" to "specify the linesep when flattening the message".

I'm uploading the completed patch here so I don't lose it, but I don't think I'm going to use it in this form.
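
For reference, the "specify the linesep when flattening" paradigm looks roughly like this for an unencoded body. The flatten helper and the sample message are made up for illustration; the linesep argument to Generator.flatten is the 3.2+ API mentioned above. The sketch does not cover base64 bodies, which are the open problem in this issue.

```python
import io
from email.generator import Generator
from email.message import Message

# Sample message built with "\n" line endings throughout.
msg = Message()
msg["Subject"] = "newline demo"
msg.set_payload("line one\nline two\n")


def flatten(message, linesep):
    """Hypothetical helper: serialize a message with the requested line separator."""
    buf = io.StringIO()
    Generator(buf).flatten(message, linesep=linesep)
    return buf.getvalue()


unix_text = flatten(msg, "\n")     # for local storage
wire_text = flatten(msg, "\r\n")   # for transmission
```

The same Message object yields either form, so the choice of line ending is deferred until output time instead of being baked into the payload.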
History
Date User Action Args
2012-05-28 18:40:32 r.david.murray set files: + crlf_base64_body_encode.patch
messages: + msg161798
assignee: r.david.murray ->
components: + email, - Library (Lib)
keywords: - patch, easy
stage: test needed -> needs patch
2011-06-25 16:24:55 davbo set files: + base_encode_tests.patch
nosy: + davbo
messages: + msg139098
2011-04-05 13:18:18 r.david.murray set files: + crlf-body-encode.patch
messages: + msg133028
2011-03-24 14:38:28 r.david.murray set versions: - Python 3.1, Python 2.7, Python 3.2
2011-03-13 22:28:57 r.david.murray set nosy: barry, iko, ajaksu2, r.david.murray
versions: + Python 3.1, Python 2.7, Python 3.2, Python 3.3, - Python 2.6
2010-05-05 13:49:36 barry set assignee: barry -> r.david.murray
nosy: + r.david.murray
2009-02-14 13:58:47 ajaksu2 set versions: + Python 2.6
nosy: + ajaksu2
messages: + msg82058
keywords: + patch, easy
type: behavior
stage: test needed
2004-06-18 12:50:50 iko create