Message 347607 - Python tracker


Author	aldwinaldwin
Recipients	aldwinaldwin, barry, maxking, r.david.murray, yunlee
Date	2019-07-10.08:00:16
Message-id	<1562745616.54.0.31583456645.issue37532@roundup.psfhosted.org>
In-reply-to

Content
Changing everything to utf-8 breaks a lot of tests, so how about this less invasive solution?

diff --git a/Lib/email/header.py b/Lib/email/header.py
index 4ab0032bc6..1e71eeae7f 100644
--- a/Lib/email/header.py
+++ b/Lib/email/header.py
@@ -136,7 +136,14 @@ def decode_header(header):
     last_word = last_charset = None
     for word, charset in decoded_words:
         if isinstance(word, str):
-            word = bytes(word, 'raw-unicode-escape')
+            word_tmp = bytes(word, 'raw-unicode-escape')
+            input_charset = charset or 'us-ascii'
+            try:
+                _ = word_tmp.decode(input_charset, errors='strict')
+                word = word_tmp
+            except UnicodeDecodeError:
+                word = str(word).encode('utf-8')
+                charset = 'utf-8'
         if last_word is None:
             last_word = word
             last_charset = charset
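
To try the patched branch in isolation, here is a minimal sketch that lifts it into a standalone function. The name `encode_unencoded_word` is hypothetical and not part of `email.header`; it only mirrors the logic in the diff above, so treat it as an illustration rather than the final fix.

```python
def encode_unencoded_word(word, charset=None):
    """Hypothetical helper mirroring the patched `isinstance(word, str)` branch.

    Returns a (bytes, charset) pair, like the entries decode_header() yields.
    """
    # Same first step as the existing code: code points up to U+00FF become
    # single bytes, higher ones become ASCII escapes such as \u4e2d.
    word_tmp = bytes(word, 'raw-unicode-escape')
    input_charset = charset or 'us-ascii'
    try:
        # Only check that the bytes are valid in the declared charset.
        word_tmp.decode(input_charset, errors='strict')
    except UnicodeDecodeError:
        # Fallback from the patch: re-encode as utf-8 and label it as such.
        return word.encode('utf-8'), 'utf-8'
    return word_tmp, charset


print(encode_unencoded_word("Subject"))
print(encode_unencoded_word("caf\u00e9"))
print(encode_unencoded_word("\u4e2d"))
```

Plain ASCII passes through unchanged, and a word containing `é` falls back to utf-8. One thing that shows up when running it: a code point above U+00FF is written by `raw-unicode-escape` as an ASCII `\uXXXX` escape, so it passes the strict check and the fallback is not triggered for it.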
