classification
Title: empty local-part in addr_spec displayed incorrectly
Type: behavior
Stage:
Components: email
Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: andreitroiebbc, barry, maxking, r.david.murray
Priority: normal
Keywords:

Created on 2019-09-20 14:34 by andreitroiebbc, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
example_parser.py andreitroiebbc, 2019-09-20 14:34
Messages (3)
msg352852 - Author: Andrei Troie (andreitroiebbc) Date: 2019-09-20 14:34
Given an (RFC-legal) email address with the local part consisting of a quoted empty string (e.g. 'Nobody <""@example.org>'), when I call the 'addr_spec' property, the result no longer includes the quoted empty string (so, in this case, addr_spec would return '@example.org').
msg353003 - Author: Andrei Troie (andreitroiebbc) Date: 2019-09-23 10:13
As far as I understand it, this is due to the following code in email.headerregistry.Address.addr_spec (in 3.8 and below):

if len(nameset) > len(nameset-parser.DOT_ATOM_ENDS):
    lp = parser.quote_string(self.username)

or, in the current version on master:

lp = self.username
if not parser.DOT_ATOM_ENDS.isdisjoint(lp):
    lp = parser.quote_string(lp)

Neither of these tests works for the empty string: an empty string has no characters, so it is disjoint from every set, and the local part is therefore never quoted.
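To illustrate why the empty string slips through, here is a minimal sketch of the test in isolation. SPECIALS is only a stand-in for parser.DOT_ATOM_ENDS (its exact contents are an assumption), and both helper functions are hypothetical names, not part of the email package. The second helper shows one possible way to handle the empty local part; it is not necessarily how the library should be fixed.

```python
# Stand-in for parser.DOT_ATOM_ENDS; the exact character set is an assumption.
SPECIALS = frozenset('()<>@,:;\\"[] ')


def needs_quoting_current(local_part):
    """Mirror the quoted test: quote only if a special character is present."""
    return not SPECIALS.isdisjoint(local_part)


def needs_quoting_with_empty_check(local_part):
    """Hypothetical variant: also quote when the local part is empty."""
    return local_part == '' or not SPECIALS.isdisjoint(local_part)


print(needs_quoting_current('john smith'))       # True: contains a space
print(needs_quoting_current(''))                 # False: nothing to intersect
print(needs_quoting_with_empty_check(''))        # True
print(needs_quoting_with_empty_check('nobody'))  # False
```

An empty iterable is disjoint from any set, so the first helper can never return True for ''.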
msg353983 - Author: Abhilash Raj (maxking) (Python committer) Date: 2019-10-05 01:18
It is actually parsed correctly and serialized back when you try to convert it to a string representation:

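from email.parser import BytesFeedParser
import email.policy

def main():
    eml_string = 'From: Nobody <""@example.org>'
    parser = BytesFeedParser(policy=email.policy.default)
    parser.feed(eml_string.encode())
    msg = parser.close()
    # addr_spec of the first parsed address
    print(msg.get('From').addresses[0].addr_spec)
    # internal parse tree of the header (private attribute)
    print(repr(msg.get('From')._parse_tree))
    # the whole message serialized back to a string
    print(msg.as_string())

if __name__ == '__main__':
    main()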

Running this gives me:


@example.org
AddressList([Address([Mailbox([NameAddr([DisplayName([Atom([ValueTerminal('Nobody'), CFWSList([WhiteSpaceTerminal(' ')])])]), AngleAddr([ValueTerminal('<'), AddrSpec([LocalPart([QuotedString([BareQuotedString([ValueTerminal('')])])]), ValueTerminal('@'), Domain([DotAtom([DotAtomText([ValueTerminal('example'), ValueTerminal('.'), ValueTerminal('org')])])])]), ValueTerminal('>')])])])])])
From: Nobody <""@example.org>


Notice the empty quoted string in the parse tree: AddrSpec([LocalPart([QuotedString([BareQuotedString([ValueTerminal('')])])])

print() converts the addr-spec into a string, which omits the quotes. This is true for any non-empty string too:


hello@example.org
AddressList([Address([Mailbox([NameAddr([DisplayName([Atom([ValueTerminal('Nobody'), CFWSList([WhiteSpaceTerminal(' ')])])]), AngleAddr([ValueTerminal('<'), AddrSpec([LocalPart([QuotedString([BareQuotedString([ValueTerminal('hello')])])]), ValueTerminal('@'), Domain([DotAtom([DotAtomText([ValueTerminal('example'), ValueTerminal('.'), ValueTerminal('org')])])])]), ValueTerminal('>')])])])])])
From: Nobody <"hello"@example.org>


If you prefer the string representation of the header's parsed value, you can try:

    print(msg.get('From').fold(policy=email.policy.default))

Which prints:

    From: Nobody <""@example.org>
History
Date User Action Args
2022-04-11 14:59:20 admin set github: 82413
2019-10-05 01:18:00 maxking set messages: + msg353983
2019-09-23 10:46:18 xtreak set nosy: + maxking
2019-09-23 10:13:03 andreitroiebbc set messages: + msg353003
2019-09-20 15:42:05 andreitroiebbc set versions: + Python 3.9
2019-09-20 14:34:10 andreitroiebbc create