
Author jishac
Recipients barry, jishac, r.david.murray
Date 2021-04-12.20:17:27
Message-id <1618258648.79.0.562101023353.issue43818@roundup.psfhosted.org>
In-reply-to
Content
I have noticed an issue (Python 3.8.5) where an email can be read in as bytes but is not returned as such by the as_bytes() call if the message is multipart, has a boundary that is not properly quoted, and contains non-ASCII text. This seems to result from how multipart messages are handled once NoBoundaryInMultipartDefect is encountered [See Test #1].
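
The parsed state described above can be inspected with a minimal sketch like the one below. It is not the sample script from this report: the message bytes are hypothetical, and as a simplification the Content-Type header omits the boundary parameter entirely instead of misquoting it, which is another way to trigger NoBoundaryInMultipartDefect. The policy is only a guess at what "8bit, utf-8 enabled" means here. The sketch does not attempt to reproduce the as_bytes() behaviour itself.

```python
from email import policy
from email.errors import NoBoundaryInMultipartDefect
from email.parser import BytesParser

# Hypothetical message: it claims to be multipart/mixed but gives the
# parser no boundary parameter (simplified stand-in for a misquoted one).
raw = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: multipart/mixed\r\n"
    b"Subject: demo\r\n"
    b"\r\n"
    b"--sep\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "caf\u00e9\r\n".encode("utf-8")  # non-ASCII body text
    + b"--sep--\r\n"
)

# Assumed policy: 8bit transfer encoding with UTF-8 enabled.
utf8_policy = policy.default.clone(cte_type="8bit", utf8=True)
msg = BytesParser(policy=utf8_policy).parsebytes(raw)

# Names of the defects the parser registered on the root message.
defect_names = [type(d).__name__ for d in msg.defects]
print("defects:", defect_names)

# The body is kept as one undivided string, not a list of sub-parts.
print("is_multipart:", msg.is_multipart())
print("payload type:", type(msg.get_payload()).__name__)
print("preamble:", msg.preamble)
```

The point of the sketch is that, once the defect is registered, the message keeps its multipart Content-Type while its payload is a single string, so any code that branches on the content type has to cope with a non-list payload.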

I would argue that outputting the test message in the sample script with an 8bit, UTF-8-enabled policy should return the original bytes, because the 8bit policy should be applied to the "body" portion of the email (everything after the blank line that ends the headers), as would be the case if the message were not multipart [See Test #4].

Currently it appears that the entire message is treated as headers, so the strict 7bit, ASCII requirement is applied to the entire contents of the message. Furthermore, msg.preamble is None.

I am also not sure that leveraging the handle_defect hook would help, because correcting the boundary doesn't seem to take effect unless the message is re-parsed [See Tests #2 and #3].
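
As a rough illustration of that point (again not the tests attached to this report), the sketch below overrides handle_defect on a policy subclass, fixes the boundary on the already-parsed message, and then compares that with re-parsing corrected bytes. The class name RecordingPolicy, the list seen_defects, and the message bytes are all hypothetical, and the missing boundary parameter stands in for a misquoted one.

```python
from email.parser import BytesParser
from email.policy import EmailPolicy

seen_defects = []  # hypothetical log filled by the hook below


class RecordingPolicy(EmailPolicy):
    """Hypothetical policy that records each defect, then defers to the default."""

    def handle_defect(self, obj, defect):
        seen_defects.append(type(defect).__name__)
        super().handle_defect(obj, defect)


# Hypothetical message: multipart/mixed with no boundary parameter.
raw = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: multipart/mixed\r\n"
    b"\r\n"
    b"--sep\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--sep--\r\n"
)

parser = BytesParser(policy=RecordingPolicy())
msg = parser.parsebytes(raw)
print("hook saw:", seen_defects)

# Correcting the header on the parsed object does not split the payload:
# the body was already stored as one string during parsing.
msg.set_boundary("sep")
print("after set_boundary, is_multipart:", msg.is_multipart())

# Re-parsing bytes that carry the corrected header does split it.
fixed_raw = raw.replace(b"multipart/mixed", b'multipart/mixed; boundary="sep"', 1)
reparsed = parser.parsebytes(fixed_raw)
print("after re-parse, is_multipart:", reparsed.is_multipart())
```

In this sketch the hook only observes the defect; by the time it is possible to repair the header, the body has already been stored unsplit, which is consistent with the re-parse requirement noted above.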

So the requested change is to apply the output encoding policy to every portion of a message after the blank line that ends the headers.
History
Date User Action Args
2021-04-12 20:17:28 jishac set recipients: + jishac, barry, r.david.murray
2021-04-12 20:17:28 jishac set messageid: <1618258648.79.0.562101023353.issue43818@roundup.psfhosted.org>
2021-04-12 20:17:28 jishac link issue43818 messages
2021-04-12 20:17:28 jishac create