Issue 21095: EmailMessage should support Header objects

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/65294

classification

Title:	EmailMessage should support Header objects
Type:		Stage:
Components:	email	Versions:	Python 3.4

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	barry, brandon-rhodes, r.david.murray
Priority:	normal	Keywords:

Created on 2014-03-29 03:33 by brandon-rhodes, last changed 2022-04-11 14:58 by admin.

Messages (3)
msg215112	Author: Brandon Rhodes (brandon-rhodes)	Date: 2014-03-29 03:33
Currently, the new wonderful EmailMessage class ignores the encoding specified in any Header objects that are provided to it.

    import email.message, email.header
    m = email.message.Message()
    m['Subject'] = email.header.Header('Böðvarr'.encode('latin-1'), 'latin-1')
    print(m.as_string())

    Subject: =?iso-8859-1?q?B=F6=F0varr?=

    m = email.message.EmailMessage()
    m['Subject'] = email.header.Header('Böðvarr'.encode('latin-1'), 'latin-1')
    print(m.as_string())

    Traceback (most recent call last):
      ...
    TypeError: 'Header' object does not support indexing

If EmailMessage came to recognize and support Header objects, then Python programmers constrained by which encodings their customers' email clients recognize and support could hand-craft the selection of the correct encoding, instead of being forced into either ASCII or UTF-8 with binary, the two predominant choices that EmailMessage makes on its own.
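
The default choice described above can be observed by assigning a plain str instead of a Header. This is only a minimal sketch, assuming a reasonably recent Python 3 and the default policy of EmailMessage; the library picks the encoded-word form (b or q) itself, so only the charset is inspected here.

```python
from email.message import EmailMessage

# Assign a plain str; EmailMessage keeps the decoded text as the header value.
m = EmailMessage()
m['Subject'] = 'Böðvarr'

serialized = m.as_string()
print(serialized)

# The non-ASCII subject is serialized as a UTF-8 encoded word;
# there is no per-header way to ask for latin-1 here.
uses_utf8 = '=?utf-8?' in serialized.lower()
```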
msg220637	Author: Mark Lawrence (BreamoreBoy)	Date: 2014-06-15 13:36
@David, can we have your comments, please?
msg220646	Author: R. David Murray (r.david.murray)	Date: 2014-06-15 15:11
I have to look at the implementation to remind myself how hard this would be to implement. The goal was to leave Header a legacy API... if you need that level of control, you use the old API. But I can see the functionality argument, and Header is a reasonable API for building such a custom header. It may be a while before I have time to take a look at it, though, so if anyone else wants to take a look, feel free :)

One problem is that while the parser does retain the cte of each encoded word, if the header is refolded for any reason the cte is (often? always? I don't remember) ignored because encoded words may be recombined during folding. And if you are creating the header inside a program, that header is going to get refolded on serialization, unless max_line_length is set to 0/None or the header fits on one line. So it's not obvious to me that this can work at all.

What could work would be to have a policy setting to use something other than utf-8 for the CTE for encoding headers, but that would be a global setting (applying to all headers that are refolded during serialization).

Basically, controlling the CTE of encoded words on an individual basis goes directly against the model used by the new Email API: in that model, the "model" of the email message is the decoded version of the message, and serialization is responsible for doing whatever CTE encoding is appropriate. The goal is to hide the details of the RFCs from the library user. So, if you want control at that level, you have to go back to the old API, which required you to understand what you were doing at the RFC level...
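
The contrast between the two APIs can be sketched side by side. This is an illustration under stated assumptions, not a fix for the issue: parsing with email.policy.default exposes the decoded header text as the message model, while the legacy Message class combined with Header keeps the caller's chosen charset in the serialized output. The sample message text is made up for the example.

```python
import email
import email.policy
from email.header import Header
from email.message import Message

# Hypothetical raw message containing a latin-1 encoded word.
raw = 'Subject: =?iso-8859-1?q?B=F6=F0varr?=\n\nbody\n'

# New API: the model holds the decoded value, not the encoded word.
parsed = email.message_from_string(raw, policy=email.policy.default)
decoded_subject = str(parsed['Subject'])

# Legacy API: Header carries an explicit charset, which is kept on output.
legacy = Message()
legacy['Subject'] = Header('Böðvarr', 'latin-1')
legacy_text = legacy.as_string()

print(decoded_subject)
print(legacy_text)
```

The legacy half mirrors the first example in the original report, except that Header is given a str rather than pre-encoded bytes.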

History
Date	User	Action	Args
2022-04-11 14:58:00	admin	set	github: 65294
2019-03-15 23:14:34	BreamoreBoy	set	nosy: - BreamoreBoy
2014-06-15 15:11:59	r.david.murray	set	messages: + msg220646
2014-06-15 13:36:19	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg220637
2014-03-29 03:33:26	brandon-rhodes	create