classification
Title: email modifies the message structure when the parsed email is invalid without registering defects
Type: enhancement
Stage: resolved
Components: email
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, python-dev, r.david.murray, xavierd
Priority: normal
Keywords: patch

Created on 2011-07-07 16:37 by xavierd, last changed 2012-05-28 02:23 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
sample.tgz xavierd, 2011-07-07 16:37 python script to reproduce the issue
orig.eml xavierd, 2011-08-12 09:53 email to reproduce the issue
test.py xavierd, 2011-08-12 09:54 python script to test the patch
email.patch xavierd, 2011-08-17 13:04
orig.eml xavierd, 2011-08-17 13:05 email without a header/body separator
Messages (7)
msg139982 - (view) Author: xavierd (xavierd) Date: 2011-07-07 16:37
The function 'email.message_from_file' modifies the message structure when the parsed email is invalid (for example, when a closing boundary is missing). The defects attribute is also left empty.

In the attachment (sample.tgz) you will find:
   - orig.eml: an email with an invalid structure. The boundary "000101020201080900040301" isn't closed.
   - after_parsing.eml: the same email after calling email.message_from_file(). The boundary is now closed, and the defects attribute is empty.
   - test.py: python script to reproduce.
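
A minimal sketch of the behaviour described above, not the attached test.py. The message text is a made-up sample that only reuses the boundary string from the report, and the defect check assumes an interpreter where the fix discussed later in this issue is present (Python 3.3 or newer):

```python
import email
from email import errors

BOUNDARY = "000101020201080900040301"

# Hypothetical sample message: a multipart body whose closing
# "--BOUNDARY--" line is missing.
RAW = (
    "From: sender@example.com\n"
    "To: rcpt@example.com\n"
    "Subject: unclosed boundary\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="' + BOUNDARY + '"\n'
    "\n"
    "--" + BOUNDARY + "\n"
    "Content-Type: text/plain\n"
    "\n"
    "body text\n"
)

msg = email.message_from_string(RAW)

# On Python 3.3+ the parser records the missing close boundary
# on the multipart container.
found = [d for d in msg.defects
         if isinstance(d, errors.CloseBoundaryNotFoundDefect)]

# Serializing the parsed message writes a close boundary that the
# input never had, which is the structural change reported here.
regenerated = msg.as_string()

print("defects:", msg.defects)
print("close boundary in input: ", ("--" + BOUNDARY + "--") in RAW)
print("close boundary in output:", ("--" + BOUNDARY + "--") in regenerated)
```

On interpreters without the fix, the defects list would be expected to stay empty, as described in this report.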
msg141947 - (view) Author: xavierd (xavierd) Date: 2011-08-12 09:52
This patch does the following:
 - when a close boundary isn't found, the error 'email.errors.CloseBoundaryNotFoundDefect' is added to the defects list.
 - it doesn't modify the current behaviour of the feedparser (e.g. the function email.message_from_file still modifies the message structure).
msg141948 - (view) Author: xavierd (xavierd) Date: 2011-08-12 09:54
With the patch applied:

{{{
$ ./test.py
PARSER INVALID EMAIL
defects found !
[<email.errors.CloseBoundaryNotFoundDefect instance at 0x7f41421c0488>]
}}}
msg142273 - (view) Author: xavierd (xavierd) Date: 2011-08-17 13:04
I also noticed that 'email' modifies the message structure when the header/body separator is missing, and nothing is added to the defects list.
In the attachments, you'll find:
 - email.patch: this patch (for Python 2.7.2) adds the following errors to the defects list:
   - the error 'email.errors.CloseBoundaryNotFoundDefect' when a boundary isn't closed.
   - the error 'email.errors.MissingHeaderBodySeparator' when the header/body separator isn't found.
 - orig.eml: an email without a header/body separator
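
The missing-separator case can be sketched the same way. This is an illustration under assumptions, not the attached patch or orig.eml: the message text is invented, and the class name used is `MissingHeaderBodySeparatorDefect` as it exists in `email.errors` on Python 3.3 and newer, which differs slightly from the name used in the patch description above.

```python
import email
from email import errors

# Hypothetical sample: the header block runs straight into the body,
# with no blank line between them.
RAW = (
    "From: sender@example.com\n"
    "Subject: no separator\n"
    "This is the body text.\n"
)

msg = email.message_from_string(RAW)

# The first line that does not look like a header is treated as the
# start of the body, and the parser records a defect for it.
found = [d for d in msg.defects
         if isinstance(d, errors.MissingHeaderBodySeparatorDefect)]

# Serializing again inserts the blank separator line that the input
# lacked, so the output no longer matches the input byte for byte.
regenerated = msg.as_string()

print("defects:", msg.defects)
print("payload:", repr(msg.get_payload()))
print("blank line in input: ", "\n\n" in RAW)
print("blank line in output:", "\n\n" in regenerated)
```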
msg161509 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-24 14:20
Thanks for the patch.  I haven't forgotten about it, but it will probably still be a while yet before I get to it.  Hopefully before 3.3 is released, though.
msg161750 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-28 02:20
New changeset 81e008f13b4f by R David Murray in branch 'default':
#12515: email now registers a defect if the MIME end boundary is missing.
http://hg.python.org/cpython/rev/81e008f13b4f
msg161751 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-28 02:23
I didn't wind up using your patch (for one thing I forgot that there were two separate issues in this patch and independently rediscovered and fixed the MissingHeaderBodySeparatorDefect one).  However, this is now fixed in 3.3.  Unfortunately, since it introduces a new defect, it is an enhancement and by our rules can't be backported.
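
Since the defect class is new and was not backported, code that must also run on older interpreters may want to look it up defensively. A rough sketch under that assumption follows; `find_defects` is a hypothetical helper name, not part of the email package, and the sample message is invented:

```python
import email
import email.errors


def find_defects(msg, defect_name):
    """Return (content_type, defect) pairs for every part of msg that
    carries a defect of the named class; empty if the class is unknown."""
    cls = getattr(email.errors, defect_name, None)
    if cls is None:
        # The running interpreter predates this defect class.
        return []
    return [
        (part.get_content_type(), defect)
        for part in msg.walk()          # the container and all subparts
        for defect in part.defects
        if isinstance(defect, cls)
    ]


# Hypothetical sample with an unclosed multipart boundary.
RAW = (
    "Subject: unclosed boundary\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "body text\n"
)

msg = email.message_from_string(RAW)
hits = find_defects(msg, "CloseBoundaryNotFoundDefect")
for content_type, defect in hits:
    print(content_type, type(defect).__name__)
```

Walking the whole tree matters because defects are attached to the part where they were detected, which for nested multiparts is not necessarily the top-level message.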
History
Date User Action Args
2012-05-28 02:23:47 r.david.murray set status: open -> closed
versions: - Python 2.7, Python 3.2
type: behavior -> enhancement
messages: + msg161751
resolution: fixed
stage: needs patch -> resolved
2012-05-28 02:20:54 python-dev set nosy: + python-dev
messages: + msg161750
2012-05-24 14:20:38 r.david.murray set title: email modifies the message structure when the parsed email is invalid -> email modifies the message structure when the parsed email is invalid without registering defects
nosy: + barry
messages: + msg161509
assignee: r.david.murray ->
components: + email, - Library (Lib)
2011-08-17 13:05:43 xavierd set files: + orig.eml
2011-08-17 13:04:20 xavierd set files: - email.patch
2011-08-17 13:04:05 xavierd set files: + email.patch
messages: + msg142273
2011-08-12 09:54:10 xavierd set files: + test.py
messages: + msg141948
2011-08-12 09:53:18 xavierd set files: + orig.eml
2011-08-12 09:52:25 xavierd set files: + email.patch
keywords: + patch
messages: + msg141947
2011-07-08 13:56:29 eric.araujo set versions: + Python 3.2, Python 3.3, - Python 2.6
title: The email package modifies the message structure (when the parsed email is invalid) -> email modifies the message structure when the parsed email is invalid
assignee: r.david.murray
components: + Library (Lib), - None
type: behavior
stage: needs patch
2011-07-07 16:37:22 xavierd create