
Author maxking
Recipients barry, epicfaace, maxking, mytran, r.david.murray
Date 2019-08-16.05:55:46
You have correctly identified that "=aa" is detected as an encoded word escape and causes get_encoded_word to fail.

However, "=?utf-8?q?somevalue?=aa" should ideally be parsed as "somevalueaa" and not left as "=?utf-8?q?somevalue?=aa". This is because "=?utf-8?q?somevalue?=" is a valid encoded word; it is just not followed by whitespace.

modified   Lib/email/
@@ -1037,7 +1037,10 @@ def get_encoded_word(value):
         raise errors.HeaderParseError(
             "expected encoded word but found {}".format(value))
     remstr = ''.join(remainder)
-    if len(remstr) > 1 and remstr[0] in hexdigits and remstr[1] in hexdigits:
+    if (len(remstr) > 1 and
+        remstr[0] in hexdigits and
+        remstr[1] in hexdigits and
+        tok.count('?') < 2):
         # The ? after the CTE was followed by an encoded word escape (=XX).
         rest, *remainder = remstr.split('?=', 1)

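The misdetection can be avoided by checking whether `?` already occurs twice in `tok`; if it does, the encoded word is complete and the remainder is not treated as an escape.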
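To make the effect of the extra condition concrete, here is a minimal standalone sketch. `split_encoded_word_token` is a hypothetical helper written for illustration only; it is not the stdlib `get_encoded_word`, it just mirrors the token-splitting step with the `tok.count('?') < 2` check from the diff above, and it assumes the token is obtained by splitting on the first `?=`.

```python
from string import hexdigits


def split_encoded_word_token(value):
    """Hypothetical stand-in for the splitting step, not the stdlib function.

    Returns (tok, rest_of_value), where tok is the text between the
    leading '=?' and the '?=' that closes the encoded word.
    """
    if not value.startswith('=?'):
        raise ValueError("expected encoded word but found {}".format(value))
    tok, *remainder = value[2:].split('?=', 1)
    if tok == value[2:]:
        # No closing '?=' at all.
        raise ValueError("expected encoded word but found {}".format(value))
    remstr = ''.join(remainder)
    if (len(remstr) > 1 and
            remstr[0] in hexdigits and
            remstr[1] in hexdigits and
            tok.count('?') < 2):
        # Only charset and CTE seen so far, so the '?=' we split on was
        # really '?' followed by an '=XX' escape; keep scanning.
        rest, *remainder = remstr.split('?=', 1)
        tok = tok + '?=' + rest
    return tok, ''.join(remainder)


# Complete encoded word followed directly by text: "aa" stays outside.
print(split_encoded_word_token("=?utf-8?q?somevalue?=aa"))
# Payload that starts with an escape: the first '?=' is not the terminator.
print(split_encoded_word_token("=?utf-8?q?=C3=A9?= x"))
```

In the first call `tok` already contains two `?` characters (charset, CTE, and payload are all present), so the trailing "aa" is left as ordinary text. In the second call `tok` is only "utf-8?q" at the point of the check, so the escape branch still runs as before.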

The second bug, which needs a better test case, is that an invalid encoded_word sends the parser into an infinite loop, which you correctly fixed in your PR. However, the test case you used is more appropriate for the first issue.

You can fix both issues in the PR; for that, you need to add a test case for the second issue and a fix for the first issue.

Looking into the PR now.
Date                 User     Action  Args
2019-08-16 05:55:46  maxking  set     recipients: + maxking, barry, r.david.murray, epicfaace, mytran
2019-08-16 05:55:46  maxking  set     messageid: <>
2019-08-16 05:55:46  maxking  link    issue37764 messages
2019-08-16 05:55:46  maxking  create