
Author msapiro
Recipients msapiro
Date 2016-04-01.17:46:29
Message-id <1459532790.12.0.415003614976.issue26686@psf.upfronthosting.co.za>
In-reply-to
Content
Given an admittedly defective message part (the continuation line of the folded Content-Type: header isn't indented) with the following headers and body

-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

email.parser parses the headers as

-------------------------------
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
-------------------------------

and the body as

-------------------------------
name="04EBD_xxxx.xxxx_A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
-------------------------------

and shows no defects.
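
A minimal sketch for reproducing this, assuming the default (compat32) policy via email.message_from_string. The base64 payload is a hypothetical stand-in, since the real body is elided above, and summarize is just a helper name for this example. The result for the defective part is printed rather than asserted, because it may differ between Python versions (some versions may record a defect such as MissingHeaderBodySeparatorDefect here).

```python
import base64
import email

# Hypothetical stand-in payload; the real attachment body is elided in the report.
RAW_BYTES = b"Rar!\x1a\x07\x00 sample payload"
B64_BODY = base64.b64encode(RAW_BYTES).decode("ascii")

DEFECTIVE_PART = (
    'Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="04EBD_xxxx.xxxx_A546BB.zip"\n'  # continuation line, NOT indented
    'Content-Transfer-Encoding: base64\n'
    '\n'
    + B64_BODY + '\n'
)


def summarize(text):
    """Parse a message part and report what the parser made of it."""
    msg = email.message_from_string(text)
    return {
        "headers": msg.keys(),
        "cte": msg.get("Content-Transfer-Encoding"),
        "decoded": msg.get_payload(decode=True),
        "defects": [type(d).__name__ for d in msg.defects],
    }


if __name__ == "__main__":
    # Defective part as received.
    for key, value in summarize(DEFECTIVE_PART).items():
        print("defective:", key, "=", value)
    # Same part with the continuation line properly indented, for comparison.
    well_formed = DEFECTIVE_PART.replace('\nname=', '\n name=')
    for key, value in summarize(well_formed).items():
        print("well-formed:", key, "=", value)
```

Comparing the two runs shows whether Content-Transfer-Encoding ended up in the headers and whether the payload was decoded.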

This is wrong. RFC 5322, section 2.1, is clear that everything up to the first empty line is headers. Even the docstring in the email/parser.py module says "The header block is terminated either by the end of the string or by a blank line."
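
To illustrate the rule, here is a small sketch that splits raw text at the first empty line. split_at_first_blank_line is a hypothetical helper written for this example, not part of the email package. Applied to the part above, the header block it returns still contains the Content-Transfer-Encoding line.

```python
def split_at_first_blank_line(text):
    """Split raw message text into (header_block, body) at the first empty line."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        # An empty line has nothing before its line terminator.
        if line.strip("\r\n") == "":
            return "".join(lines[:i]), "".join(lines[i + 1:])
    # No empty line at all: everything is header block.
    return text, ""


SAMPLE = (
    'Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw\n'
)

header_block, body = split_at_first_blank_line(SAMPLE)
print(header_block)
```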

Since the message is defective, it isn't clear what the correct result should be, but I think

Headers:
Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
Content-Transfer-Encoding: base64

Body:
UmFyIRoHAM+QcwAADQAAAAAAAABKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIAAAAGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...

Defects:
name="04EBD_xxxx.xxxx_A546BB.zip"

would be more appropriate. The practical problem is that the Content-Transfer-Encoding: base64 header does not end up among the parsed headers, so get_payload(decode=True) doesn't decode the base64-encoded body, which makes malware recognition difficult.
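
As a rough sketch of the result proposed above (not how email.parser works, and not a proposed patch), one could pre-filter the header block before handing it to the parser. parse_lenient and HEADER_START are hypothetical names for this example, the payload is a made-up stand-in for the elided body, and envelope "From " lines are ignored for brevity.

```python
import base64
import email
import re

# A field name is printable ASCII without ':' followed by a colon.
HEADER_START = re.compile(r"^[!-9;-~]+:")


def parse_lenient(text):
    """Return (message, stray_lines), treating non-header lines as defects."""
    lines = text.splitlines()
    try:
        blank = lines.index("")  # first empty line ends the header block
    except ValueError:
        blank = len(lines)
    kept, stray = [], []
    for line in lines[:blank]:
        is_continuation = bool(kept) and line[:1] in (" ", "\t")
        if HEADER_START.match(line) or is_continuation:
            kept.append(line)
        else:
            stray.append(line)  # neither a header nor a folded continuation
    cleaned = "\n".join(kept + [""] + lines[blank + 1:]) + "\n"
    return email.message_from_string(cleaned), stray


# Hypothetical stand-in payload; the real attachment body is elided in the report.
RAW_BYTES = b"Rar!\x1a\x07\x00 sample payload"
PART = (
    'Content-Disposition: inline; filename="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="04EBD_xxxx.xxxx_A546BB.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    + base64.b64encode(RAW_BYTES).decode("ascii") + '\n'
)

msg, stray = parse_lenient(PART)
print(msg.keys())
print(stray)
print(msg.get_payload(decode=True))
```

With the stray line set aside, Content-Transfer-Encoding is seen as a header and get_payload(decode=True) returns the decoded bytes.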
History
Date                 User     Action  Args
2016-04-01 17:46:30  msapiro  set     recipients: + msapiro
2016-04-01 17:46:30  msapiro  set     messageid: <1459532790.12.0.415003614976.issue26686@psf.upfronthosting.co.za>
2016-04-01 17:46:30  msapiro  link    issue26686 messages
2016-04-01 17:46:29  msapiro  create