Message 358533 - Python tracker


Author	mkaiser
Recipients	barry, mkaiser, r.david.murray
Date	2019-12-17.08:32:29
Message-id	<1576571549.52.0.292971717442.issue39071@roundup.psfhosted.org>

Content
I used email.parser.BytesParser to parse emails.

In one program I used parse(), because the email was stored in a file.
In a second program I used parsebytes(), because the email was stored in memory as a bytes object.

I created hash values from each part and compared them to check whether a part is already known to my programs. This works for attachments, but not for HTML and plain-text parts.

Documentation for parsebytes:

Similar to the parse() method, except it takes a bytes-like object instead of a file-like object. Calling this method on a bytes-like object is equivalent to wrapping bytes in a BytesIO instance first and calling parse().

When I read the documentation, I expected that both methods would produce the same output.

The test mail contains two MIME parts: one with HTML and one with plain text.

The parse method with a file and the parse method with bytes data wrapped in a BytesIO produce the same hashes. The parsebytes method produces different hashes.

Output of my test program:

MD5 sums with parsebytes with bytes data
3f4ee7303378b62f723a8d958797507a
45c72465b931d32c7e700d2dd96f8383
------------------------
MD5 sums with parse and BytesIO with bytes data
fb0599d92750b72c25923139670e5127
9a54b64425b9003a9e6bf199ab6ba603
------------------------
MD5 sums with parse from file
fb0599d92750b72c25923139670e5127
9a54b64425b9003a9e6bf199ab6ba603



Is this expected behavior, or is it a bug?
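A minimal sketch of this kind of comparison follows. It is not the original test program: the helper names (part_hashes, build_test_mail, compare) and the two-part message are made up for illustration. The line-ending parameter is there only because the choice between LF and CRLF is one plausible thing to vary when looking for the difference; the report itself does not confirm that as the cause.

```python
import hashlib
import io
from email.parser import BytesParser


def part_hashes(msg):
    """Return the MD5 hex digest of each non-multipart part's decoded payload."""
    return [
        hashlib.md5(part.get_payload(decode=True)).hexdigest()
        for part in msg.walk()
        if not part.is_multipart()
    ]


def build_test_mail(eol):
    """Build a hypothetical two-part test mail using the given line ending."""
    lines = [
        b"MIME-Version: 1.0",
        b'Content-Type: multipart/alternative; boundary="XXX"',
        b"",
        b"--XXX",
        b"Content-Type: text/plain; charset=us-ascii",
        b"",
        b"Hello",
        b"World",
        b"--XXX",
        b"Content-Type: text/html; charset=us-ascii",
        b"",
        b"<p>Hello</p>",
        b"<p>World</p>",
        b"--XXX--",
        b"",
    ]
    return eol.join(lines)


def compare(raw):
    """Hash the parts of the same bytes parsed with parsebytes() and with parse()."""
    parser = BytesParser()
    via_parsebytes = part_hashes(parser.parsebytes(raw))
    via_parse = part_hashes(parser.parse(io.BytesIO(raw)))
    return via_parsebytes, via_parse


if __name__ == "__main__":
    for name, eol in (("LF", b"\n"), ("CRLF", b"\r\n")):
        via_parsebytes, via_parse = compare(build_test_mail(eol))
        print(name, "parsebytes:", via_parsebytes)
        print(name, "parse:     ", via_parse)
        print(name, "equal:     ", via_parsebytes == via_parse)
```

Running the script prints both hash lists for each line-ending variant, so it is easy to see whether the two methods agree for a given input.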

History
Date	User	Action	Args
2019-12-17 08:32:29	mkaiser	set	recipients: + mkaiser, barry, r.david.murray
2019-12-17 08:32:29	mkaiser	set	messageid: <1576571549.52.0.292971717442.issue39071@roundup.psfhosted.org>
2019-12-17 08:32:29	mkaiser	link	issue39071 messages
2019-12-17 08:32:29	mkaiser	create