Author akuchling
Recipients akuchling, barry
Date 2010-05-19.20:55:17
Message-id <1274302519.83.0.146401120937.issue8769@psf.upfronthosting.co.za>
Content
The attached test program shows how parsing an e-mail message with the email package, then converting the resulting message to a string, fails to round-trip properly.  Instead it breaks the encoding of the subject line.
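The attached program is not reproduced here. As a rough illustration only, a round-trip check along these lines could be used; the helper names subject_text and subject_round_trips are made up for this sketch, and whether it reports a failure depends on whether the installed email package still has the folding behaviour described in this report.

```python
import email
from email.header import decode_header, make_header


def subject_text(msg):
    """Return the Subject header decoded to readable text (hypothetical helper)."""
    return str(make_header(decode_header(msg["Subject"])))


def subject_round_trips(raw_message):
    """Parse a message, regenerate it with as_string(), and reparse the result.

    Returns True when the decoded Subject is the same before and after.
    """
    original = email.message_from_string(raw_message)
    regenerated = email.message_from_string(original.as_string())
    return subject_text(original) == subject_text(regenerated)


if __name__ == "__main__":
    raw = (
        "Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;"
        "_Important_Legislative_Efforts?=\n"
        "\n"
        "body\n"
    )
    # No expected value is shown: the outcome differs between affected
    # and unaffected versions of the email package.
    print(subject_round_trips(raw))
```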

The root of the problem: the subject is RFC 2047 encoded, is long enough to require line wrapping, and contains one of the splitchars used by Header.encode() -- that is, a semicolon or comma.  In my example, the header is:

Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;_Important_Legislative_Efforts?=

Parsing the message turns that into a string S.  generator.Generator._write_headers() then outputs Header(S).encode(), which treats the value as a plain ASCII string rather than as an already-encoded word, and therefore folds the header at the semicolon, resulting in:
  
Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;<NEWLINE><SPACE>_Important_Legislative_Efforts?=

Newline and space aren't legal inside a Q-encoded word, so MUAs give up and display all the raw =?utf-8?Q? stuff instead of the decoded subject.
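To see the effect on the receiving side, the two header values can be fed to email.header.decode_header(). This is only a sketch: decode_subject, INTACT and BROKEN are names invented for the illustration, and the standard library decoder stands in here for whatever a particular MUA does.

```python
from email.header import decode_header

# The Subject value as originally sent: a single, unbroken encoded word.
INTACT = (
    "=?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;"
    "_Important_Legislative_Efforts?="
)

# The same value after being folded at the semicolon, as described above:
# a newline and a space now sit inside the encoded word.
BROKEN = (
    "=?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;\n"
    " _Important_Legislative_Efforts?="
)


def decode_subject(raw):
    """Decode an RFC 2047 header value to text (hypothetical helper)."""
    parts = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # Fall back to ASCII for chunks that carry no charset.
            value = value.decode(charset or "ascii", errors="replace")
        parts.append(value)
    return "".join(parts)


if __name__ == "__main__":
    print(repr(decode_subject(INTACT)))
    print(repr(decode_subject(BROKEN)))
```

The intact value decodes to the readable subject, while the folded one is no longer recognised as an encoded word and comes back with the =?utf-8?Q? markup still in it.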