According to RFC 5322, an email address like this isn't valid:
user@example.com <user@example.com>
(The display-name "user@example.com" contains "@", which isn't in the set of atext characters used to form an atom.)
How it's handled by the email package varies by policy:
>>> import email
>>> from email.policy import default
>>> email.message_from_bytes(b'To: user@example.com <user@example.com>')['to']
'user@example.com <user@example.com>'
>>> email.message_from_bytes(b'To: user@example.com <user@example.com>', policy=default)['to']
'user@example.com'
>>> email.message_from_bytes(b'To: user@example.com <user@example.com>', policy=default).defects
[]
The difference in behaviour between the compat32 and "default" policies may or may not be significant.
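As far as I can tell, the empty msg.defects list above only covers defects found by the message parser. Under the "default" policy the header value is parsed when it is fetched, and the resulting header object carries its own defects attribute. A minimal sketch of inspecting both, using the same input as above (the variable names are only for illustration, and which defects are reported may vary between Python versions):

```python
import email
from email.policy import default

raw = b'To: user@example.com <user@example.com>'
msg = email.message_from_bytes(raw, policy=default)

# Defects recorded by the message parser itself.
print("message defects:", msg.defects)

# Under the default policy, msg['to'] is a header object that carries
# its own tuple of defects found while parsing this header value.
to_header = msg['to']
header_defects = to_header.defects
for defect in header_defects:
    print(type(defect).__name__, "-", defect)
```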
However, when this is combined with a further invalid feature, namely a space after the ">", here's what happens:
>>> email.message_from_bytes(b'To: user@example.com <user@example.com> ')['to']
'user@example.com <user@example.com> '
>>> email.message_from_bytes(b'To: user@example.com <user@example.com> ', policy=default)['to']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 586, in __call__
    return self[name](name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 337, in parse
    kwds['parse_tree'] = address_list = cls.value_parser(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 328, in value_parser
    address_list, value = parser.get_address_list(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2368, in get_address_list
    token, value = get_invalid_mailbox(value, ',')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2166, in get_invalid_mailbox
    token, value = get_phrase(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1770, in get_phrase
    token, value = get_word(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1745, in get_word
    if value[0]=='"':
IndexError: string index out of range
>>> email.message_from_bytes(b'To: user@example.com <user@example.com> ', policy=default).defects
[]
I believe the preferred behaviour would be to add a defect to the message object during parsing, rather than raising an exception when the invalid header value is accessed.
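In the meantime, calling code can guard the header access itself. The following is a minimal sketch of one possible workaround, not a proposed fix: safe_header is a hypothetical helper (it is not part of the email package), and it assumes that falling back to the stored, unparsed value is acceptable. It relies on Message.raw_items(), which yields the stored headers without applying the policy's fetch-time parsing.

```python
import email
from email.errors import HeaderParseError
from email.policy import default


def safe_header(msg, name):
    """Return (value, error) for the first header called `name`.

    Hypothetical helper: if fetching the parsed header raises, fall
    back to the stored raw value and return the exception alongside it.
    """
    try:
        return msg[name], None
    except (IndexError, HeaderParseError) as exc:
        # raw_items() bypasses header_fetch_parse, so it does not
        # trigger the address parser that raised above.
        for key, raw_value in msg.raw_items():
            if key.lower() == name.lower():
                return raw_value, exc
        return None, exc


msg = email.message_from_bytes(
    b'To: user@example.com <user@example.com> ', policy=default)
value, error = safe_header(msg, 'to')
print(repr(str(value)), error)
```

On an interpreter where the access raises, error holds the exception and value is the raw header text; where it does not raise, error is None and value is the parsed header object.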