classification
Title: Email address display name fails with both encoded words and special chars
Type: behavior Stage: resolved
Components: email Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: barry, bsiem, maxking, miss-islington, ned.deily, r.david.murray
Priority: normal Keywords: patch

Created on 2019-07-02 11:51 by bsiem, last changed 2019-08-29 06:56 by bsiem. This issue is now closed.

Files
File name Uploaded Description
email_header_test.py bsiem, 2019-07-02 11:51
Pull Requests
URL Status Linked
PR 14561 merged python-dev, 2019-07-02 18:51
PR 15371 merged miss-islington, 2019-08-21 23:00
PR 15380 merged bsiem, 2019-08-22 06:58
Messages (10)
msg347136 - Author: B Siemerink (bsiem) * Date: 2019-07-02 11:51
Special characters in email headers are normally put within double quotes. However, encoded words (=?charset?x?...?=) are not allowed within double quotes. When the header contains a word with special characters and another word that must be encoded, the first one must also be encoded.

In the next example, the From header is quoted and therefore the comma is allowed; in the To header, the comma is neither within quotes nor encoded, which is not allowed, so the header is rejected.

From: "Foo Bar, France" <foo@example.com>
To: Foo Bar, =?utf-8?q?Espa=C3=B1a?= <foo@example.com>
msg347628 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2019-07-10 17:57
FYI, it would have been most helpful if you had posted your example in the issue text instead of as an attached file, as it explains the problem better than your text does :)

Here is a minimal reproducer:

>>> from email.message import EmailMessage
>>> from email.policy import strict
>>> m = EmailMessage(policy=strict)
>>> m['From'] = '"Foo Bar, España" <foo@example.com>'
>>> bytes(m)
b'From: Foo Bar, =?utf-8?q?Espa=C3=B1a?= <foo@example.com>\n\n'

This serialization of the header is, as you say, invalid.  Either the comma should be encoded, or the "Foo Bar," should be in quotes.
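The reproducer can also be written as a script that parses the serialized header back and counts the addresses it yields. This is only a minimal sketch, not part of the patch: the variable names are illustrative, and parsing with the default policy is an assumption. On an affected interpreter the bare comma should split the display name into two entries; on a fixed one a single address is expected.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default, strict

# Build the message exactly as in the reproducer.
m = EmailMessage(policy=strict)
m['From'] = '"Foo Bar, España" <foo@example.com>'
serialized = bytes(m)

# Parse the wire form back and inspect the resulting address list.
parsed = message_from_bytes(serialized, policy=default)
addresses = parsed['From'].addresses

print(serialized)
print(len(addresses), [a.addr_spec for a in addresses])
```

Checking the parsed result rather than the exact bytes keeps the check valid whether the serializer encodes the comma or quotes it.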
msg347634 - Author: B Siemerink (bsiem) * Date: 2019-07-10 18:43
Hello David, thank you for the suggestion.

Regarding your comment:
> Either the comma should be encoded, or the "Foo Bar," should be in quotes.

According to RFC5322 the display name cannot contain both a quoted part and an encoded word, so the only option is to encode the comma.

Please let me know if I can do anything else.
msg347637 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2019-07-10 20:11
The display name is a phrase, and a phrase is a sequence of words, and a word is either a quoted string or an atom.  So it is legal to mix quoted strings and encoded words in a display name.  I'd vote to do whichever one is easier to implement :)  (I haven't looked at your PR yet and unfortunately my time is limited :(
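To illustrate the grammar point, the sketch below feeds the parser a display name made of a quoted string followed by an encoded word and lists the addresses it finds. The header value is a hand-written example, not output from the library, and `header_factory` on the default policy is used here only as a convenient way to reach the parser.

```python
from email.policy import default

# A display name made of two words: a quoted string, then an encoded word.
raw = '"Foo Bar," =?utf-8?q?Espa=C3=B1a?= <foo@example.com>'

# header_factory builds a structured header object from a name and a value.
header = default.header_factory('From', raw)
addresses = header.addresses

for address in addresses:
    print(repr(address.display_name), address.addr_spec)
```

If the mix is accepted as a single phrase, the comma inside the quoted string does not act as an address separator and only one address is reported.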
msg347925 - Author: B Siemerink (bsiem) * Date: 2019-07-14 16:35
Yes, you are right! The fix is to encode the special characters.
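For comparison, the legacy helper `email.utils.formataddr` already follows the same two rules: an ASCII name containing a special character is quoted, while a non-ASCII name is encoded as a whole, comma included. This is a sketch of the intended end result only; it is not the code path changed by the patch, and the example names are taken from this issue.

```python
from email.utils import formataddr

addr = 'foo@example.com'

# ASCII-only name with a special character: formataddr quotes it.
quoted = formataddr(('Foo Bar, France', addr))

# Non-ASCII name: formataddr encodes the whole name as one encoded word,
# so the comma ends up inside the encoded word as well.
encoded = formataddr(('Foo Bar, España', addr), charset='utf-8')

print(quoted)
print(encoded)
```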
msg350128 - Author: miss-islington (miss-islington) Date: 2019-08-21 23:00
New changeset df0c21ff46c5c37b6913828ef8c7651f523432f8 by Miss Islington (bot) (bsiem) in branch 'master':
bpo-37482: Fix email address name with encoded words and special chars (GH-14561)
https://github.com/python/cpython/commit/df0c21ff46c5c37b6913828ef8c7651f523432f8
msg350130 - Author: miss-islington (miss-islington) Date: 2019-08-21 23:21
New changeset c5bba853d5e7836f6d4340e18721d3fb3a6ee0f7 by Miss Islington (bot) in branch '3.7':
bpo-37482: Fix email address name with encoded words and special chars (GH-14561)
https://github.com/python/cpython/commit/c5bba853d5e7836f6d4340e18721d3fb3a6ee0f7
msg350707 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-08-29 04:47
New changeset bd127b1b7dd50c76c4419d9c87c12901527d19da by Ned Deily (bsiem) in branch '3.8':
[3.8] bpo-37482: Fix email address name with encoded words and special chars (GH-14561) (GH-15380)
https://github.com/python/cpython/commit/bd127b1b7dd50c76c4419d9c87c12901527d19da
msg350711 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-08-29 04:58
I manually merged the stalled 3.8 backport to make 3.8.0b4.  Can this issue now be closed?
msg350736 - Author: B Siemerink (bsiem) * Date: 2019-08-29 06:56
Thank you all!
History
Date User Action Args
2019-08-29 06:56:13 bsiem set status: open -> closed; resolution: fixed; messages: + msg350736; stage: patch review -> resolved
2019-08-29 04:58:57 ned.deily set messages: + msg350711
2019-08-29 04:47:19 ned.deily set nosy: + ned.deily; messages: + msg350707
2019-08-22 06:58:48 bsiem set pull_requests: + pull_request15090
2019-08-21 23:21:52 miss-islington set messages: + msg350130
2019-08-21 23:00:53 miss-islington set pull_requests: + pull_request15083
2019-08-21 23:00:42 miss-islington set nosy: + miss-islington; messages: + msg350128
2019-07-14 16:35:42 bsiem set messages: + msg347925
2019-07-10 20:11:31 r.david.murray set messages: + msg347637
2019-07-10 18:43:38 bsiem set messages: + msg347634
2019-07-10 17:57:37 r.david.murray set messages: + msg347628
2019-07-02 18:55:38 bsiem set title: Email header fails with both encoded words and special chars -> Email address display name fails with both encoded words and special chars
2019-07-02 18:51:56 python-dev set keywords: + patch; stage: patch review; pull_requests: + pull_request14378
2019-07-02 15:22:56 xtreak set nosy: + maxking
2019-07-02 11:53:55 SilentGhost set versions: - Python 3.5, Python 3.6
2019-07-02 11:51:15 bsiem create