New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
email module should not allow some header field repetitions #55048
Comments
Hi, The following script shows two problems with email.mime.text.MIMEText:
#########################################"
from email.mime.text import MIMEText
msg = MIMEText("""Hello World""")
dest = ["one@example.com", "two@example.com", "three@example.com", "four@example.com"]
for d in dest:
msg["From"] = "myself@example.com"
msg["To"] = d
msg["subject"] = "just a test"
print (msg)
# + send the buggy mail...
################################### the last sent mail will looks like this: --------------------- Hello World I see some possible modifications:
|
The behaviour you observe is by design, and documented. The email package needs to be able to handle RFC-invalid input, which includes messages with multiple instances of fields that are supposed to be singletons. It also needs to keep track of the order of headers. Thus its interface is, as documented, a "mapping-like" interface with duplicable keys and an element order. That said, it would be a valid feature request to have a way to have it generate errors if a field that is supposed to be a singleton per-RFC is added more than once. This will require a registry of such headers...a registry of headers is planned for the next version of the email package (email6), so that would be an appropriate time for this to be implemented. email6 will also have strict and lenient modes, which will also be useful in this context. |
I think really ill/strange is that kind of item _assignments_ do _add_ multiple. If msg[field] = xy would just add-first/replace-frist , and only msg.add_xxxx/.append(field, xy) would add multiples that would be clear and understandable/readable. e.g. I remember a bug like msg[field] = xy Never ever expected a double header here! "=" with adding behavior is absurd IMHO. Certainly doesn't allow readable code. |
Regardless of what anybody thinks about the design, it is what it is and can't be changed for backward compatibility reasons. The best we can do is reject creating duplicate headers for headers that may only appear once. That feature has already been coded in the new version of the email package (see http://pypi.python.org/pypi/email), but has not yet been committed to the trunk, which is why this issue is still open. |
My original fix for this for email6 got lost in a refactoring. Here is a patch that fixes it in the code I recently checked in. It may not cover all the headers that should be unique, since I haven't implemented parsers for all structured headers yet, but they will all be there before the new code moves from provisional to stable. |
New changeset d7881a371c41 by R David Murray in branch 'default': |
Committed. It is almost never the right thing to do to allow duplicates of unique headers, but an application that does need it can create a policy subclass and override the header_max_count method. |
New changeset 08857f4e9709 by R David Murray in branch 'default': |
I think we should consider this as an API design bug and backport the fix. This seems to be the exact cause of this week's email address leak at LetsEncrypt: |
The API in Python 3 is very different and I'm not sure we can backport the expected behavior without breaking other people's code (unless we treat this as a security issue). Here is a naive backport for 2.7. Known test failures: test_get_all, test_get_decoded_uu_payload, test_multipart_no_boundary |
On Jun 11, 2016, at 09:25 PM, Raymond Hettinger wrote:
No, it's deliberate, required, and expected in some cases as RDM explains. Other policies can prevent multiple additions of some headers. Probably those |
Would you consider raising an exception at least for the case of a "To:" header or perhaps a warning or someother failsafe. Using __setitem__ for appending instead of replacement is surprising and in the case of LetsEncrypt was a small disaster. There is a docstring explaining what is going on but that typically isn't visible to the user of the square brackets operator. For Python3.6, I think there should be an alternative API that doesn't use the square brackets operator: add_header, replace_header, remove_header or somesuch. The problem is that square brackets never suggests appending which is what is actually happening. |
There are already the makings of an alternative API: https://docs.python.org/3.6/library/email.message.html#email.message.Message.add_header There is also replace_header(), but it only replaces the _first_ header field, and leaves later ones untouched. However there is only __del__(), which deletes all matching header fields; there is no remove_header() or similar. I think I would support deprecating the __setitem__() etc methods, perhaps with a cleanup of the alternatives, e.g. add remove_all(). Also, __getitem__() is equivalent to get(), which does not raise KeyError. Although according to bpo-12111, David said things are unlikely to change. |
On Jun 13, 2016, at 12:09 AM, Martin Panter wrote:
I do not support deprecating __setitem__(). I'm okay with developing an |
On Jun 12, 2016, at 09:19 PM, Raymond Hettinger wrote:
No, not for compat32 policy. Seriously, I do not want to change the semantics By all means, let's develop API alternatives for new code, or stricter RFC |
I don't think a new API is needed. But we need to promote the policy keyword better in docs. See https://twitter.com/aksiksi/status/741769504817045508 for an example of confusion. I don't know if it's a good idea or API but can we add a 'policy' keyword argument to email.mime.base.MIMEBase? Right now, this is the only way to change the default policy without using high level functions like email.message_from_string(): m = MIMEMultipart()
m.policy = email.policy.default |
On Jun 13, 2016, at 06:38 AM, Berker Peksag wrote:
I think we just need to plumb a |
That's already possible: https://docs.python.org/dev/library/email.message.html#email.message.Message It would be nice to be able to customize 'policy' via BaseMime: https://github.com/python/cpython/blob/master/Lib/email/mime/base.py#L23 |
On Jun 13, 2016, at 08:34 AM, Berker Peksag wrote:
Right, that's what I meant by "plumb". :) Basically we want all the |
Sorry, apparently I can't read in the mornings :) I have just opened bpo-27331 to implement this idea. |
In the new API there's no real reason to use the old MIME classes. If you want to add the keyword I have no objection, though. I started a documentation revision last year but haven't had time to get back to it. Hopefully I'll dust it off Real Soon Now. |
I've committed Berker's patch from bpo-27331, and I'm about to take the new email API out of provisional status. Barry is committed to not changing this behavior in 2.7 and I agree. In any case 2.7 doesn't differentiate between headers being added by the user and headers coming from the parsed message, and the latter *have* to allow duplicates even if the fields aren't supposed to be. The python3 code does make a distinction between these two cases, which is what allowed me to do the fix in the new policies. (Yes, you *could* fix feedparser and message in 2.7 so it also could tell, but that is an invasive change). So, I'm re-closing this as fixed. It's "won't fix" for 2.7 and compat32. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: