classification
Title: mail message parsing glitch
Type: behavior
Stage: test needed
Components: email
Versions: Python 3.3
process
Status: open
Resolution:
Dependencies: 14731
Superseder:
Assigned To:
Nosy List: barry, mcspang, petri.lehtinen, r.david.murray
Priority: normal
Keywords: patch

Created on 2006-11-05 09:21 by mcspang, last changed 2012-06-18 08:03 by petri.lehtinen.

Files
File name Uploaded Description
mbox mcspang, 2006-11-05 09:21 mailbox
preserve_leading_whitespace.patch r.david.murray, 2011-03-24 13:55
Messages (4)
msg30445 - Author: Mike (mcspang) Date: 2006-11-05 09:21
There's something wrong with the handling of line
continuation in headers.

In the attached mbox the 'Message-id' header is read
and stored with a leading space. So message_id =
' <9B09D75DF5B3494BA06E6FE478CE9CC10526CFAF@mgtserver3.ontario.int.ec.gc.ca>'
when it's first read.

If you write this message out to a new mbox, it is
written with the leading space and without the newline.

THEN, if you read it in AGAIN, the parser ignores the
leading space.

One of these steps is buggy, I'm not sure which. It
seems to me that the value returned by
msg['Message-id'] should not change when the file is
written then re-read. Between the initial and final
reads above, the value differs by a space.
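
A minimal sketch of the round trip described above, using an in-memory message rather than the attached mbox. The message text, the Message-id value, and the helper name roundtrip_header are hypothetical, and the exact values returned depend on the Python version and email policy in use, so no output is shown.

```python
import email

# Hypothetical message: the Message-id value sits on a continuation line,
# as in the report. The id itself is made up for illustration.
RAW = (
    "From: sender@example.com\n"
    "Message-id:\n"
    " <hypothetical-id@example.com>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def roundtrip_header(raw, name):
    """Return the header value after the first parse and after a write/re-read."""
    first_msg = email.message_from_string(raw)
    first = first_msg[name]
    # Serialize the parsed message, then parse the serialized text again.
    second_msg = email.message_from_string(first_msg.as_string())
    second = second_msg[name]
    return first, second


first, second = roundtrip_header(RAW, "Message-id")
print(repr(first), repr(second))
# The report's expectation is that the two values are identical.
print("stable across round trip:", first == second)
```

If the two printed values differ only in leading whitespace, that is the behavior reported here.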
msg30446 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2007-03-12 13:51
This makes some sense, although it may not be ideal.  The "leading space" is really an RFC 2822 continuation line, so that whitespace at the start of the second line is folding whitespace.  Technically those two lines should (and do) get collapsed.

The question is whether this message should be printed idempotently.  In general the email package tries hard to maintain this principle, so I think it's probably a valid bug.  However, it doesn't affect the semantics of the message so it's not a high priority fix.
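
To illustrate the unfolding rule mentioned above: RFC 2822 unfolds a header by removing any line break that is immediately followed by whitespace. The sketch below is a standalone illustration of that rule, not the email package's implementation; the helper name unfold and the sample header are hypothetical.

```python
import re

# RFC 2822 unfolding: drop any line break that is immediately followed by
# a space or tab. The whitespace itself stays in the value.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold(raw_header):
    """Join the physical lines of a folded header into one logical line."""
    return _FOLD.sub("", raw_header)


# Hypothetical folded header, shaped like the one in the report.
folded = "Message-id:\r\n <hypothetical-id@example.com>"
print(repr(unfold(folded)))
# prints 'Message-id: <hypothetical-id@example.com>'
```

After unfolding, the space that started the continuation line is what separates the colon from the value, which is why it carries no meaning of its own.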
msg131976 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-24 13:55
I needed an airplane-trip-sized problem to work on during the trip back from PyCon and the sprints, so I tried my hand at "fixing" this.  The attached patch is really just a proof of concept.  Because it is so invasive to the email package machinery, I doubt that I will apply it.  It does prove, though, that it is quite practical, given the right design, to preserve the leading whitespace in message headers, and doing so enables the email package to read and write the messages in the sample mbox without changing them.  I will incorporate what I learned from this exercise into the header management in email6.

On the other hand, if anyone else thinks this *is* worth tidying up and applying, I could be convinced.

Note that after this patch one test fails, but that failure actually comes from a buggy test that hides a bug in the header formatter (a failure to provide folding white space at the start of a continuation line).  That bug I may revisit.
msg161476 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-24 03:10
I think I can actually fix this once the patch in issue 14731 is applied.
History
Date User Action Args
2012-06-18 08:03:55 petri.lehtinen set nosy: + petri.lehtinen
2012-05-24 03:23:03 r.david.murray link issue968430 dependencies
2012-05-24 03:10:03 r.david.murray set versions: - Python 3.1, Python 2.7, Python 3.2
    messages: + msg161476
    assignee: r.david.murray ->
    dependencies: + Enhance Policy framework in preparation for adding email6 policies as provisional
    components: + email, - Library (Lib)
2011-03-24 13:55:43 r.david.murray set files: + preserve_leading_whitespace.patch
    keywords: + patch
    messages: + msg131976
2011-03-13 22:26:07 r.david.murray set nosy: barry, mcspang, r.david.murray
    versions: + Python 3.1, Python 2.7, Python 3.2, Python 3.3, - Python 2.6
2010-05-05 13:47:31 barry set assignee: barry -> r.david.murray
    nosy: + r.david.murray
2009-03-30 17:23:26 ajaksu2 set stage: test needed
    type: behavior
    versions: + Python 2.6, - Python 2.5
2006-11-05 09:21:04 mcspang create