Author epicfaace
Recipients barry, epicfaace, maxking, mytran, r.david.murray
Date 2019-08-12.23:09:39
Message-id <1565651379.61.0.982482113101.issue37764@roundup.psfhosted.org>
In-reply-to
Content
Oh, both the Travis links I sent actually ended up reproducing the bug.

I've made a PR that fixes it, along with an even smaller test case:

get_unstructured('=?utf-8?q?somevalue?=aa')

It looks like this happens because "aa" is mistaken for an encoded word escape in https://github.com/python/cpython/blob/fd5a82a7685d1599aab12e722a383cb0a2adfd8a/Lib/email/_header_value_parser.py#L1042. As a result, get_encoded_word fails, which sends get_unstructured into an infinite loop.

With my PR, the parser leaves "=?utf-8?q?somevalue?=aa" as the literal text "=?utf-8?q?somevalue?=aa" instead of decoding it. However, the existing test cases require that "=?utf-8?q?somevalue?=nowhitespace" be parsed as "somevaluenowhitespace". I'm not too familiar with RFC 2047, but why are "aa" and "nowhitespace" treated differently? Should they be?
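
To illustrate why the two suffixes might end up on different code paths, here is a rough sketch. It assumes the check at the linked line only tests whether the two characters following "?=" are hex digits; it is a simplification, not the actual CPython code, and the helper name looks_like_escape is made up for illustration (it is not part of the email package).

```python
from string import hexdigits


def looks_like_escape(remainder: str) -> bool:
    """Hypothetical helper: True if the text after '?=' starts with two hex digits.

    This mirrors the assumed shape of the check, i.e. the text is
    treated as if it were a quoted-printable style escape (=XX).
    """
    return (
        len(remainder) > 1
        and remainder[0] in hexdigits
        and remainder[1] in hexdigits
    )


for value in (
    "=?utf-8?q?somevalue?=aa",
    "=?utf-8?q?somevalue?=nowhitespace",
):
    # Drop the leading "=?" and take whatever follows the first "?=".
    _, _, remainder = value[2:].partition("?=")
    print(repr(remainder), looks_like_escape(remainder))
```

Under that assumption, "aa" passes the check because both characters are valid hex digits, while "nowhitespace" does not because "n" is not a hex digit. That would explain the different treatment, though not whether it is the right behaviour under RFC 2047.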
History
Date                 User       Action  Args
2019-08-12 23:09:39  epicfaace  set     recipients: + epicfaace, barry, r.david.murray, maxking, mytran
2019-08-12 23:09:39  epicfaace  set     messageid: <1565651379.61.0.982482113101.issue37764@roundup.psfhosted.org>
2019-08-12 23:09:39  epicfaace  link    issue37764 messages
2019-08-12 23:09:39  epicfaace  create