This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: parse_message_id in email module is very buggy / crashy
Type: behavior Stage: resolved
Components: email Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: barry, maxking, miss-islington, r.david.murray, xnox, xtreak
Priority: normal Keywords: patch

Created on 2019-11-05 22:48 by xnox, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 17504 merged maxking, 2019-12-08 06:29
PR 17515 merged miss-islington, 2019-12-09 01:37
Messages (6)
msg356072 - (view) Author: Dimitri John Ledkov (xnox) * Date: 2019-11-05 22:48
email module has recently got parse_message_id which is more strict now, then before.

However, it's not programmed as defensively as expected. Given bogus message-id, it crashes with unbound local variable, or like accessing a non-existing index.

So hyperkitty had a Message-ID "X"*260 in the testsuite that used to pass with 3.7, but fails with 3.8.

======================================================================
ERROR: test_long_message_id (hyperkitty.tests.lib.test_incoming.TestAddToList)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./hyperkitty/tests/lib/test_incoming.py", line 295, in test_long_message_id
    msg["Message-ID"] = "X" * 260
  File "/usr/lib/python3.8/email/message.py", line 409, in __setitem__
    self._headers.append(self.policy.header_store_parse(name, val))
  File "/usr/lib/python3.8/email/policy.py", line 148, in header_store_parse
    return (name, self.header_factory(name, value))
  File "/usr/lib/python3.8/email/headerregistry.py", line 602, in __call__
    return self[name](name, value)
  File "/usr/lib/python3.8/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/usr/lib/python3.8/email/headerregistry.py", line 530, in parse
    kwds['parse_tree'] = parse_tree = cls.value_parser(value)
  File "/usr/lib/python3.8/email/_header_value_parser.py", line 2116, in parse_message_id
    message_id.append(token)
UnboundLocalError: local variable 'token' referenced before assignment

Similarly another user, surkova reports that value[0] in get_msg_id function is buggy too (doesn't check that value has a member)

First reported https://github.com/python/cpython/pull/13397#discussion_r341968031

Ideally, I'd like the function to raise a documented Exception for invalid Message-id, but not fail with what look like regular programming bugs in the email module. Expectation is that email module is either more permissive or is coded more defence-in-depth with more checking in place.
msg356109 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-11-06 09:46
This seems to be the same as issue38698.
msg356123 - (view) Author: Dimitri John Ledkov (xnox) * Date: 2019-11-06 12:22
Yes, issue38698 covers the UnboundLocalError,

but doesn't cover inside get_msg_id there is also this gem:

    def get_msg_id(value):
        msg_id = MsgID()
        if value[0] in CFWS_LEADER:

It should test value before accessing value[0] like it is done in other places, ie.:

    if value and value[0] in CFWS_LEADER:

or indent the whole block to iterate over value with:

    while value:
        ...

which also tests that value has [0] index.

I guess I want to repurpose this issue for the value[0] indexerror in get_msg_id().
msg358045 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-12-09 01:37
New changeset 3ae4ea1931361dd2743e464790e739d9285501bf by Abhilash Raj in branch 'master':
bpo-38708: email: Fix a potential IndexError when parsing Message-ID (GH-17504)
https://github.com/python/cpython/commit/3ae4ea1931361dd2743e464790e739d9285501bf
msg358047 - (view) Author: miss-islington (miss-islington) Date: 2019-12-09 02:12
New changeset 2abd3a8f580e6c7b1ce88b2ae9f9a783f4aea5d3 by Miss Islington (bot) in branch '3.8':
bpo-38708: email: Fix a potential IndexError when parsing Message-ID (GH-17504)
https://github.com/python/cpython/commit/2abd3a8f580e6c7b1ce88b2ae9f9a783f4aea5d3
msg358048 - (view) Author: Abhilash Raj (maxking) * (Python committer) Date: 2019-12-09 02:35
Closing this as fixed.
History
Date User Action Args
2022-04-11 14:59:22adminsetgithub: 82889
2019-12-09 02:35:20maxkingsetstatus: open -> closed
resolution: fixed
messages: + msg358048

stage: patch review -> resolved
2019-12-09 02:12:53miss-islingtonsetnosy: + miss-islington
messages: + msg358047
2019-12-09 01:37:47miss-islingtonsetpull_requests: + pull_request16992
2019-12-09 01:37:38maxkingsetmessages: + msg358045
2019-12-08 06:29:36maxkingsetkeywords: + patch
stage: patch review
pull_requests: + pull_request16981
2019-11-06 12:47:26xtreaksetnosy: + barry, r.david.murray, maxking
type: behavior
components: + email
2019-11-06 12:22:07xnoxsetmessages: + msg356123
2019-11-06 09:46:56xtreaksetnosy: + xtreak
messages: + msg356109
2019-11-05 22:48:04xnoxcreate