Issue 8769: Straightforward usage of email package fails to round-trip

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/53015

classification

Title:	Straightforward usage of email package fails to round-trip
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.1, Python 3.2, Python 3.3, Python 2.7

process

Status:	closed	Resolution:	duplicate
Dependencies:		Superseder:	email.header.Header doesn't fold headers correctly View: 11492
Assigned To:	r.david.murray	Nosy List:	akuchling, barry, r.david.murray
Priority:	normal	Keywords:	patch

Created on 2010-05-19 20:55 by akuchling, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
email-roundtrip-failure.py	akuchling, 2010-05-19 20:55
issue8769.txt	akuchling, 2010-05-19 20:59

Messages (4)
msg106100 - (view)	Author: A.M. Kuchling (akuchling) *	Date: 2010-05-19 20:55
The attached test program shows how parsing an e-mail message with the email package, then converting the resulting message to a string, fails to round-trip properly. Instead it breaks the encoding of the subject line.

The root of the problem: the subject is RFC-2047 quoted, long enough to require line wrapping, and it contains one of the splitchars used by Header.encode() -- meaning a semi-colon or comma. In my example, this is:

Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;_Important_Legislative_Efforts?=

Parsing the message turns that into a string S. generator.Generator._write_headers() then outputs Header(S).encode(), so it keeps treating the value as an ASCII string, and therefore breaks the header at the semicolon, resulting in:

Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;<NEWLINE><SPACE>_Important_Legislative_Efforts?=

Newline and space aren't legal in Q encoding, so MUAs give up and display all the =?utf-8?Q? stuff.
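For readers without the attachment, the round-trip check described in this message can be approximated as below. This is a minimal sketch, not the attached email-roundtrip-failure.py: the names RAW, ENCODED_WORD, round_trip and broken_encoded_words are made up for illustration, and the message body is a placeholder. It parses a message with the subject from the report, serializes it again, and lists any encoded word that picked up whitespace along the way.

```python
import re
from email import message_from_string

# Subject copied from the report; the body is a placeholder.
RAW = (
    "Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;"
    "_Important_Legislative_Efforts?=\n"
    "\n"
    "body\n"
)

# Matches =?charset?encoding?payload?=. The payload part deliberately
# allows whitespace so that damaged encoded words still match.
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?]*\?=")


def round_trip(raw):
    """Parse a message from text and serialize it back to text."""
    return message_from_string(raw).as_string()


def broken_encoded_words(text):
    """Return the encoded words in text that contain whitespace."""
    return [w for w in ENCODED_WORD.findall(text) if re.search(r"\s", w)]


if __name__ == "__main__":
    out = round_trip(RAW)
    print(out)
    print("broken encoded words:", broken_encoded_words(out))
```

On an interpreter affected by this issue the list should contain the subject's encoded word, split at the semicolon; on one that includes the later fix it should be empty.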
msg106101 - (view)	Author: A.M. Kuchling (akuchling) *	Date: 2010-05-19 20:59
The attached patch is a possible fix; it uses the decode_header() and make_header() functions to figure out the encoding properly; it fixes my example, at least. But does it increase the odds of crashing on messages with malformed headers? Should it go into 2.7 given that we're at the RC stage? What about 2.6?

(BTW, Barry, I noticed this because messages being sent through Mailman were coming out with broken subject lines. The system generating the messages is slightly weird -- doing the UTF-8 quoting is unnecessary since the subject contains no special characters -- but I think Mailman shouldn't be breaking subject lines. I haven't verified that this Python fix actually fixes Mailman, but I think this is a Python bug, not a Mailman bug.)
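The idea of going through decode_header() and make_header() can be illustrated roughly as follows. This is only a sketch of the approach named in this message, not the contents of issue8769.txt; RAW_SUBJECT and rebuild_header are hypothetical names. Decoding first lets the resulting Header object know the charset, so that encode() re-encodes the text rather than splitting the raw encoded word as if it were plain ASCII.

```python
from email.header import decode_header, make_header

# Raw header value from the report, as the parser would hand it over.
RAW_SUBJECT = (
    "=?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;"
    "_Important_Legislative_Efforts?="
)


def rebuild_header(raw_value, header_name="Subject"):
    """Decode an RFC 2047 value and rebuild a charset-aware Header."""
    parts = decode_header(raw_value)  # list of (decoded, charset) pairs
    return make_header(parts, header_name=header_name)


if __name__ == "__main__":
    header = rebuild_header(RAW_SUBJECT)
    print(str(header))      # decoded text
    print(header.encode())  # re-encoded form, wrapped if necessary
```

The malformed-header concern raised above still applies to a sketch like this: decode_header() can raise on input it cannot decode, so a real fix would need to decide what to do in that case.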
msg106102 - (view)	Author: A.M. Kuchling (akuchling) *	Date: 2010-05-19 21:00
Minor fix to the patch: the import of Header could actually be removed, since the class is no longer referenced at all with this change.
msg133972 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2011-04-18 14:45
This is fixed in 3.2/3.3 by the fix for issue 11492. The suggested fix for 2.7 is more radical than I'm comfortable with for a point release. I'm open to argument on that, but in the meantime I'm closing the issue with 11492 as the superseder.

History
Date	User	Action	Args
2022-04-11 14:57:01	admin	set	github: 53015
2011-04-18 14:45:33	r.david.murray	set	status: open -> closed resolution: duplicate messages: + msg133972 superseder: email.header.Header doesn't fold headers correctly stage: resolved
2011-03-14 03:30:26	r.david.murray	set	versions: + Python 3.3
2010-12-27 18:27:41	r.david.murray	set	versions: + Python 3.1, Python 3.2
2010-12-14 18:41:06	r.david.murray	set	type: behavior
2010-10-25 19:35:22	barry	set	assignee: barry -> r.david.murray nosy: + r.david.murray
2010-05-19 21:00:28	akuchling	set	messages: + msg106102
2010-05-19 20:59:43	akuchling	set	files: + issue8769.txt messages: + msg106101
2010-05-19 20:55:18	akuchling	create