classification
Title: email.headers wraps headers badly
Type: behavior
Stage: resolved
Components: email, Library (Lib)
Versions: Python 3.7, Python 3.6, Python 3.5, Python 3.4
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, jribbens, r.david.murray
Priority: normal
Keywords:

Created on 2019-01-30 21:23 by jribbens, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg334594 - Author: Jon Ribbens (jribbens) Date: 2019-01-30 21:23
email.header can wrap a header by emitting FWS (folding whitespace) as the very first thing in the output:

>>> from email.header import Header
>>> Header("a" * 67, header_name="Content-ID").encode() 
'\n aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'

i.e. it produces headers that look like this:

    Content-ID:
        blah

It is unclear to me whether this complies with the spec, but there seems to be little reason to do it and good reason not to: at the very least, Outlook does not understand such headers. (For example, if an HTML email references an inline image by Content-ID, Outlook will not find the image when the Content-ID header is wrapped as above.)
msg334635 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-01-31 22:50
That is correct folding. The word is too long to fit within the 78-character default if put on the same line as the label, but it does fit on a line by itself.

If Outlook can't understand such a header, it is even more broken than I thought it was :( You can work around the Outlook bug by specifying a longer-than-standard (but still RFC-compliant) line length when serializing the message.
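A minimal sketch of that workaround, using the legacy `email.header.Header` class from the original report. `Header.encode()` accepts a `maxlinelen` argument; the value 200 below is an arbitrary choice for illustration, not a recommendation.

```python
from email.header import Header

value = "a" * 67  # stand-in for a long, unbreakable Content-ID value

# Default limit (78): "Content-ID: " is 12 characters, and 12 + 67 = 79,
# so the value does not fit on the same line as the label.
default_encoded = Header(value, header_name="Content-ID").encode()

# Workaround: allow longer lines so the value can stay on the label's line.
# 200 is an arbitrary illustrative limit.
long_encoded = Header(value, header_name="Content-ID").encode(maxlinelen=200)

print(repr(default_encoded))
print(repr(long_encoded))
```

Comparing the two `repr` outputs shows whether the leading fold is still present.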
msg334636 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-01-31 22:51
Also note that you might want to switch to the new API. The folder it uses is smarter, although in this case I think it will produce the same result, because that is the "best" rendering of the header under the circumstances.
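A rough sketch of the new API combined with a longer line limit, assuming `EmailMessage` and a clone of `email.policy.default`. The Content-ID value and the limit of 200 are made up for illustration.

```python
from email import policy
from email.message import EmailMessage

# Hypothetical Content-ID, long enough that label + value exceeds 78 characters.
content_id = "<" + "a" * 60 + "@example.com>"

msg = EmailMessage()  # EmailMessage uses email.policy.default (the new API)
msg["Content-ID"] = content_id

# Clone the default policy with a longer line limit (200 is arbitrary).
long_lines = policy.default.clone(max_line_length=200)
serialized = msg.as_string(policy=long_lines)

print(serialized)
```

With the longer limit the header fits on a single line, so the folder has no reason to insert a fold.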
msg334637 - Author: Jon Ribbens (jribbens) Date: 2019-01-31 23:07
It is not correct folding. It might not be explicitly forbidden, but it is clearly unwise, and is breaking 'conservative in what you send'. Outlook will not be the only program that fails to parse Python's output.
msg334645 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-02-01 02:30
The rules are that lines should be no more than 78 characters long, and that lines may be broken only at FWS (folding whitespace), not in the middle of words. Putting these rules together, you get the result that the email library produces. "Conservative in what you send" means *following the RFC rules*, which is what the code does. The failure here is on the Outlook side, which is supposed to be "liberal in what you accept", and it clearly is not.

In case you want to read the RFCs, which I just reviewed: Content-ID is defined to have the same syntax as Message-ID, Message-ID is defined as "Message-ID:" msg-id CRLF, and 'msg-id' is defined as:

    msg-id          =       [CFWS] "<" id-left "@" id-right ">" [CFWS]

Which means that a fold is permitted before the id itself.
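As an illustration of the receiving side, here is a small sketch showing Python's own parser accepting a header folded right after the colon. The Content-ID value is hypothetical, and the `strip()` call is a defensive assumption in case whitespace from the fold is kept in the parsed value.

```python
import email
from email import policy

# A Content-ID header folded immediately after the colon, as in the report.
raw = "Content-ID:\n <part1@example.com>\n\nbody\n"  # hypothetical id

msg = email.message_from_string(raw, policy=policy.default)

# strip() guards against any leading whitespace left over from the fold.
content_id = msg["Content-ID"].strip()

print(content_id)
```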

We could consider an "enhancement" request to cater to Outlook's deficiency, since email clients that are actually limited to 78-character lines are vanishingly rare these days. The change would only be made in the new API folder, and I myself wouldn't have the time (or desire :) to work on it, but if you want to submit an issue and see what the other email team members think, and produce a PR if the vote is positive, that's fine by me.

Do you know if it is all headers that Outlook has this problem with, or only some?  I will admit that it has been long enough since I implemented this that I can't confirm that I made sure it was legal to fold *every* header after the colon, but I'm pretty sure I did.
msg334675 - Author: Jon Ribbens (jribbens) Date: 2019-02-01 13:22
I did read the RFCs. I suspect the [CFWS] in the msg-id is for the benefit of the references production which contains a list of msg-ids. The 78-character suggested line length limit is explicitly noted as being for display purposes, and therefore is of little application to headers which are not displayed in user interfaces.

Also consider that the Python wrapping code produces "\n ..." when given a header that is 80 indivisible characters long, when there is no possibility of avoiding a line over 78 characters.
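A short sketch for reproducing this observation with the legacy `Header` class. It only measures the physical lines of the encoded value; the 80-character string is a stand-in for any indivisible token.

```python
from email.header import Header

value = "a" * 80  # too long for a 78-character line however it is folded

encoded = Header(value, header_name="Content-ID").encode()

# Length of each physical line of the encoded value (the label is not included).
line_lengths = [len(line) for line in encoded.split("\n")]

print(repr(encoded))
print(line_lengths)
```

Whatever the fold looks like, at least one line must be 80 characters or longer, so folding cannot bring this header under the limit.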

Outlook seems to cope alright with other headers (I tried From and Subject) being wrapped like this; I shudder to think what their code must be like in order to produce this bug.
msg334691 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-02-01 15:47
Well, "display" in the context of email includes looking at the raw email serialized as a text file.  This is something one can do in most mailers. I use nmh as my mailer, which only shows raw headers, so I myself would be personally affected if headers were not normally wrapped at 78 characters when possible :)

The >80 characters issue you mention is fixed by the folder used by the new API. That folder will use encoded words to wrap overlong tokens in text portions of headers, which may or may not have been the best decision (the jury is still out on that one), and for non-text headers it does not put in that \n if the word won't fit on the next line even when wrapped. (Or at least it's not supposed to, so if you find a case where it does, please submit a bug report.)

email.header is a legacy module and is no longer maintained. And yes, I realize it is used by default. There should be an open issue about going through a deprecation cycle to make the new API the default, but I've lost track and have no time to push for it myself.
History
Date                 User            Action  Args
2022-04-11 14:59:10  admin           set     github: 80044
2019-02-01 15:47:50  r.david.murray  set     messages: + msg334691
2019-02-01 13:22:32  jribbens        set     messages: + msg334675
2019-02-01 02:30:34  r.david.murray  set     messages: + msg334645
2019-01-31 23:07:04  jribbens        set     messages: + msg334637
2019-01-31 22:51:15  r.david.murray  set     messages: + msg334636
2019-01-31 22:50:09  r.david.murray  set     status: open -> closed; type: behavior; messages: + msg334635; resolution: not a bug; stage: resolved
2019-01-31 06:25:20  xtreak          set     nosy: + barry, r.david.murray; components: + email
2019-01-30 21:23:57  jribbens        create