
Author r.david.murray
Recipients barry, cnicodeme, jwilk, msapiro, r.david.murray
Date 2018-11-06.19:23:24
    >>> m = message_from_string("From: John Doe <>\n\n", policy=default)
    >>> m['From'].addresses
    (Address(display_name='', username='John Doe jdoe', domain=''),)

The new policies have more error recovery for non-RFC-compliant addresses than decode_header, but the two agree in this case.  What is happening here is that (1) an unquoted/unencoded '@' is not allowed in a display name, (2) if the address is not '<>' quoted, then everything before the '@' is the username, and (3) in the absence of a comma after the end of the FQDN (which is not allowed to contain blanks), any additional tokens are discarded.
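The three rules above can be observed with a small script. This is only a sketch: the addresses are hypothetical placeholders (the ones in the original report are not shown here), and `parse_from` is an illustrative helper, not part of the email package. No output is asserted for the malformed header, since the exact recovery may differ between Python versions.

```python
from email import message_from_string
from email.policy import default

# Hypothetical placeholder addresses, chosen only for illustration.
WELL_FORMED = "From: John Doe <jdoe@example.com>\n\n"
# An unquoted '@' appears before the '<>' quoted address.
MALFORMED = "From: John Doe jdoe@example.com <other@example.net>\n\n"


def parse_from(raw):
    """Return the tuple of Address objects parsed from the From header."""
    msg = message_from_string(raw, policy=default)
    return msg["From"].addresses


good = parse_from(WELL_FORMED)
bad = parse_from(MALFORMED)

# Print each parsed component so the two cases can be compared side by side.
for label, addrs in (("well-formed", good), ("malformed", bad)):
    for addr in addrs:
        print(label, repr(addr.display_name), repr(addr.username), repr(addr.domain))
```

For the well-formed header, the display name, username, and domain come back as separate fields. For the malformed one, compare the printed username with rule (2).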

One could argue that we could treat the blank after the FQDN as a "missing comma", and there would be some merit to that argument.  You could also argue that a "<>" quoted string should trump the occurrence of the '@' earlier in the token list.  However, the RFC822 grammar is designed to be parsed character by character, so that would not be a typical way for an RFC822 parser to do Postel-style error recovery.

So, I don't think there is a bug here, but I'd be curious what other email address parsing libraries do, and that could influence whether extensions to the "make a guess when the string doesn't conform to the RFC" code would be acceptable.
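As a starting point for that kind of comparison, the legacy `email.utils.parseaddr` function can be run on the same inputs. This is a sketch only: the addresses are hypothetical placeholders, `legacy_parse` is an illustrative helper, and the result for the malformed input is deliberately not stated because it may vary across Python versions.

```python
from email.utils import parseaddr

# Hypothetical placeholder inputs, chosen only for illustration.
CANDIDATES = [
    "John Doe <jdoe@example.com>",
    "John Doe jdoe@example.com <other@example.net>",
]


def legacy_parse(values):
    """Map each input string to the (realname, address) pair from parseaddr."""
    return {value: parseaddr(value) for value in values}


results = legacy_parse(CANDIDATES)
for value, (realname, addr) in results.items():
    print(f"{value!r} -> realname={realname!r}, addr={addr!r}")
```

The same list of inputs could be fed to other parsers to see whether any of them prefer the "<>" quoted address.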
Date                 User            Action  Args
2018-11-06 19:23:24  r.david.murray  set     recipients: + r.david.murray, barry, msapiro, jwilk, cnicodeme
2018-11-06 19:23:24  r.david.murray  set     messageid: <>
2018-11-06 19:23:24  r.david.murray  link    issue34155 messages
2018-11-06 19:23:24  r.david.murray  create