Email addresses whose display name starts with a dot ("."), or ends with a dot with no whitespace before the angle bracket, raise an exception when the header is accessed on a message object created with the "default" policy.
For example:
>>> import email
>>> from email.policy import default
>>> email.message_from_bytes(b'To: . Doe <jxd@example.com>')['to']
'. Doe <jxd@example.com>'
>>> email.message_from_bytes(b'To: . Doe <jxd@example.com>', policy=default)['to']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bhat/git/cpython/Lib/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/Users/bhat/git/cpython/Lib/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Users/bhat/git/cpython/Lib/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 586, in __call__
    return self[name](name, value)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in parse
    for mb in addr.all_mailboxes]))
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in <listcomp>
    for mb in addr.all_mailboxes]))
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 834, in display_name
    return self[0].display_name
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 768, in display_name
    return self[0].display_name
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 931, in display_name
    if res[0][0].token_type == 'cfws':
AttributeError: 'str' object has no attribute 'token_type'
>>>
>>> email.message_from_bytes(b'To: John X.<jxd@example.com>')['to']
'John X.<jxd@example.com>'
>>> email.message_from_bytes(b'To: John X.<jxd@example.com>', policy=default)['to']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/bhat/git/cpython/Lib/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/Users/bhat/git/cpython/Lib/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Users/bhat/git/cpython/Lib/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 586, in __call__
    return self[name](name, value)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in parse
    for mb in addr.all_mailboxes]))
  File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in <listcomp>
    for mb in addr.all_mailboxes]))
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 834, in display_name
    return self[0].display_name
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 768, in display_name
    return self[0].display_name
  File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 936, in display_name
    if res[-1][-1].token_type == 'cfws':
AttributeError: 'str' object has no attribute 'token_type'
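Since the same headers read fine without `policy=default`, one possible stopgap for affected code is to fall back to the compat32 policy when the structured parser fails. The sketch below is only an illustration, not a fix for the parser itself; `read_header` is a hypothetical helper name, and it assumes the failure surfaces as the `AttributeError` shown in the tracebacks above.

```python
import email
from email import policy


def read_header(raw: bytes, name: str):
    """Return (value, policy_used) for header `name` in `raw`.

    Hypothetical helper: tries the "default" policy first and falls
    back to compat32 if the structured header parser raises.
    """
    try:
        msg = email.message_from_bytes(raw, policy=policy.default)
        return str(msg[name]), "default"
    except AttributeError:
        # Assumption: the failure is the AttributeError reported above.
        # compat32 returns the header text without structured parsing.
        msg = email.message_from_bytes(raw, policy=policy.compat32)
        return str(msg[name]), "compat32"


for raw in (b"To: . Doe <jxd@example.com>", b"To: John X.<jxd@example.com>"):
    value, used = read_header(raw, "to")
    print(f"{used}: {value}")
```

On an affected interpreter both lines should report `compat32`; on a version where the parser has been fixed they would report `default`.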
|
I can also confirm this bug with Python 3.9.1 on Debian GNU/Linux (the 'testing' distribution, up to date as of 2020-12-21). Steps to reproduce:
1) Create a parser `p` with `p = email.parser.HeaderParser(policy=email.policy.default)`.
2) Parse a single problematic (as described below) message: `msg_headers = p.parsestr(msg_str)`
3) Try to get the To field with `msg_headers.get_all('to', [])`. This raises `AttributeError: 'str' object has no attribute 'token_type'` (with the same stack trace the OP shows).
Here is a minimal problematic message you can reproduce this with (i.e., just make `msg_str` have this string value):
```
From nobody Mon Dec 21 12:00:00 2020
From: sender@example.com
To: . <jrandom@example.com>
Date: Mon, 21 Dec 2020 12:00:00 -0000
Message-ID: <87ab5rvds7.fsf@example.com>
Subject: This is the Subject header.

Here is the body of the message.
```
Note that *any* number of dots for the recipient's name would also result in an error. The above example uses just ".", but it could be "..", "...", ".................", etc.
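The steps above, together with the claim about any number of dots, can be checked with a short script along these lines. It is a minimal sketch: `TEMPLATE` and `try_display_name` are names made up for the example, the message is trimmed to the headers that matter, and the `AttributeError` is caught so that the script also runs on interpreters where the bug is no longer present.

```python
import email.parser
import email.policy

# Trimmed-down version of the problematic message; {name} is the
# display name placed in front of the recipient address.
TEMPLATE = (
    "From: sender@example.com\n"
    "To: {name} <jrandom@example.com>\n"
    "Subject: This is the Subject header.\n"
    "\n"
    "Here is the body of the message.\n"
)


def try_display_name(name: str) -> str:
    """Parse the template with `name` and report what get_all() does."""
    p = email.parser.HeaderParser(policy=email.policy.default)
    msg_headers = p.parsestr(TEMPLATE.format(name=name))
    try:
        values = msg_headers.get_all("to", [])
    except AttributeError as exc:
        return f"AttributeError: {exc}"
    return "ok: " + ", ".join(str(v) for v in values)


# "Jane Random" is a control that contains no dots.
for name in (".", "..", "...", "Jane Random"):
    print(f"{name!r:14} -> {try_display_name(name)}")
```

On an affected version, every dots-only name should report the `AttributeError`, while the control parses normally.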
|