classification
Title: email "default" policy raises exception iterating over unparseable date headers
Type: behavior
Stage: resolved
Components: email
Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open
Resolution: duplicate
Dependencies:
Superseder: email.utils.parsedate_to_datetime() should return None when date cannot be parsed (View: 30681)
Assigned To:
Nosy List: barry, r.david.murray, rptb1
Priority: normal
Keywords:

Created on 2018-11-28 17:37 by rptb1, last changed 2018-11-28 18:17 by r.david.murray.

Messages (2)
msg330621 - Author: Richard Brooksby (rptb1) Date: 2018-11-28 17:37
It is not possible to loop over the headers of a message with an unparseable date field using the "default" policy.  This means that a poison email can break email processing.

I expect to be able to process an email with an unparseable date field using the "default" policy.

$ python3 --version
Python 3.6.7
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>> import email.policy
>>> email.message_from_string('Date: not a parseable date', policy=email.policy.default).items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/email/message.py", line 460, in items
    for k, v in self._headers]
  File "/usr/lib/python3.6/email/message.py", line 460, in <listcomp>
    for k, v in self._headers]
  File "/usr/lib/python3.6/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/usr/lib/python3.6/email/headerregistry.py", line 589, in __call__
    return self[name](name, value)
  File "/usr/lib/python3.6/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/usr/lib/python3.6/email/headerregistry.py", line 306, in parse
    value = utils.parsedate_to_datetime(value)
  File "/usr/lib/python3.6/email/utils.py", line 210, in parsedate_to_datetime
    *dtuple, tz = _parsedate_tz(data)
TypeError: 'NoneType' object is not iterable
>>> 
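
A possible caller-side guard, offered only as a sketch: the helper name `safe_items` is hypothetical and not part of the email package. It tries the "default" policy first and, if header parsing raises, re-parses the same text with `email.policy.compat32`, which returns header values as plain strings without running the header registry parsers.

```python
import email
import email.policy


def safe_items(raw_text):
    """Return (name, value) header pairs as strings.

    Hypothetical helper: falls back to the compat32 policy when the
    "default" policy raises while parsing a header value.
    """
    try:
        msg = email.message_from_string(raw_text, policy=email.policy.default)
        return [(name, str(value)) for name, value in msg.items()]
    except (TypeError, ValueError):
        # The traceback above shows TypeError; ValueError is caught as well
        # on the assumption that other releases may report the failure that way.
        msg = email.message_from_string(raw_text, policy=email.policy.compat32)
        return [(name, str(value)) for name, value in msg.items()]


if __name__ == "__main__":
    print(safe_items("Date: not a parseable date\nSubject: hello\n\nbody\n"))
```

The fallback loses the structured header objects (for example the `datetime` attribute of a date header), so it only suits code that needs the raw header text.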

Related: https://docs.python.org/3/library/email.headerregistry.html#email.headerregistry.DateHeader does not specify what happens to the datetime field if a date header cannot be parsed.

Related: https://docs.python.org/3/library/email.utils.html#email.utils.parsedate_to_datetime does not specify what happens if a date cannot be parsed.

Suggested tests: random fuzz testing of the contents of all email headers, especially those with parsers in the header registry.
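
A minimal sketch of what such a fuzz test might look like, assuming a small hand-picked list of header names and a simple random alphabet. The function name `fuzz_headers`, the header list, and the alphabet are illustrative choices, not an existing test in the CPython suite.

```python
import email
import email.policy
import random
import string

# Illustrative selection of headers that have parsers in the header registry.
HEADER_NAMES = ["Date", "From", "To", "Subject", "Message-ID", "Content-Type"]


def fuzz_headers(iterations=200, seed=0):
    """Feed random header values through the "default" policy.

    Returns a list of (name, value, repr(exception)) for every input
    that raised while the headers were being iterated.
    """
    rng = random.Random(seed)  # seeded so that failures are reproducible
    alphabet = string.ascii_letters + string.digits + " ,:;<>@()+-/\"'"
    failures = []
    for _ in range(iterations):
        name = rng.choice(HEADER_NAMES)
        length = rng.randint(1, 40)
        value = "".join(rng.choice(alphabet) for _ in range(length))
        raw = f"{name}: {value}\n\n"
        try:
            msg = email.message_from_string(raw, policy=email.policy.default)
            list(msg.items())  # forces every header value to be parsed
        except Exception as exc:
            failures.append((name, value, repr(exc)))
    return failures


if __name__ == "__main__":
    for name, value, error in fuzz_headers():
        print(f"{name}: {value!r} -> {error}")
```

Any entry in the returned list is an input that breaks header iteration; which inputs fail, if any, depends on the Python version in use.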
msg330627 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2018-11-28 18:17
This is effectively a duplicate of #30681, which has a solution, although it is not yet in final form per the last couple of comments on the issue and the PR.
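
For illustration, the behavior named in the superseder's title (returning None when the date cannot be parsed) can be approximated on the caller's side. This is a sketch under that assumption, not the patch from #30681; the wrapper name `parse_date_or_none` is hypothetical.

```python
from email import utils


def parse_date_or_none(value):
    """Return a datetime for a date header value, or None if unparseable.

    Hypothetical wrapper around email.utils.parsedate_to_datetime().
    """
    try:
        return utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # TypeError is what the report above shows; ValueError is caught too
        # on the assumption that other releases may signal the failure that way.
        return None


if __name__ == "__main__":
    print(parse_date_or_none("not a parseable date"))
    print(parse_date_or_none("Wed, 28 Nov 2018 17:37:00 +0000"))
```

This only guards direct calls to the parser; it does not change what the "default" policy does when it parses a Date header internally.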
History
Date User Action Args
2018-11-28 18:17:59 r.david.murray set versions: + Python 3.8
resolution: duplicate
messages: + msg330627
superseder: email.utils.parsedate_to_datetime() should return None when date cannot be parsed
stage: resolved
2018-11-28 17:48:15 rptb1 set versions: + Python 3.7
2018-11-28 17:45:26 rptb1 set versions: + Python 3.6, - Python 3.7
2018-11-28 17:37:09 rptb1 create