Issue 14291: Regression in Python3 of email handling of unicode strings in headers


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/58499

classification

Title:	Regression in Python3 of email handling of unicode strings in headers
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.2, Python 3.3

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	r.david.murray	Nosy List:	aikinci, python-dev, r.david.murray
Priority:	high	Keywords:	easy, patch

Created on 2012-03-13 19:55 by r.david.murray, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
Issue14291.patch	aikinci, 2012-03-14 00:51		review

Messages (4)
msg155656	Author: R. David Murray (r.david.murray) *	Date: 2012-03-13 19:55
In Python2, this works:

>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'From nobody Tue Mar 13 15:44:59 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
'From nobody Tue Mar 13 15:48:11 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: =?utf-8?q?=C3=89_test?=\n\nabc'

That is, unicode strings automatically get turned into encoded words.

In Python3 this no longer works:

>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'Content-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 154, in __str__
    return self.as_string()
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 168, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 99, in flatten
    self._write(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 152, in _write
    self._write_headers(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 186, in _write_headers
    header_name=h)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 205, in __init__
    self.append(s, charset, errors)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 286, in append
    s.encode(output_charset, errors)
UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 0: ordinal not in range(128)

Presumably the problem is that the Python2 code tests for 'string' and, if the value isn't a string, handles it by CTE encoding it. In Python3 everything is a string.
Probably what should happen is the encoding error should be caught, and the CTE encoding done at that point, based on the model of how Python2 handled unicode strings.
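
The catch-and-fall-back idea described above can be sketched roughly as follows. This is only an illustration, not the committed patch (which changes the email package itself): `encode_header_value` is a hypothetical helper invented for this example, and it relies only on the public `email.header.Header` API. The exact encoded-word form (q or base64) is left to the library, so the sketch checks the result by decoding it again.

```python
from email.header import Header, decode_header, make_header


def encode_header_value(value, header_name=None):
    """Hypothetical helper: return `value` unchanged if it is pure ASCII,
    otherwise return it as an RFC 2047 encoded word using utf-8."""
    try:
        # Mirrors the step that raised UnicodeEncodeError in the traceback.
        value.encode('ascii')
    except UnicodeEncodeError:
        # Fall back to CTE encoding with utf-8, as Python2 did for unicode.
        return Header(value, charset='utf-8', header_name=header_name).encode()
    return value


encoded = encode_header_value('É test', header_name='Subject')
print(encoded)

# Decode the encoded word again to confirm nothing was lost.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```

An ASCII-only value such as 'abc' passes through untouched, so existing headers are not affected by the fallback.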
msg155700	Author: Ali Ikinci (aikinci)	Date: 2012-03-14 00:51
David and I have worked together on a fix and a test for this. Thanks, David.
msg155728	Author: Roundup Robot (python-dev)	Date: 2012-03-14 07:03
New changeset fd4b4650856f by R David Murray in branch '3.2':
#14291: if a header has non-ascii unicode, default to CTE using utf-8
http://hg.python.org/cpython/rev/fd4b4650856f

New changeset f5dcb2d58893 by R David Murray in branch 'default':
Merge #14291: if a header has non-ascii unicode, default to CTE using utf-8
http://hg.python.org/cpython/rev/f5dcb2d58893
msg155729	Author: R. David Murray (r.david.murray) *	Date: 2012-03-14 07:04
Fix committed. Thanks Ali.
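
On an interpreter that includes this fix, the session from the original report can be replayed as a short script. This is a minimal sketch that assumes a Python 3 version carrying the changesets above. It does not rely on a particular encoded-word form; instead it parses the flattened message and decodes the Subject header again.

```python
from email import message_from_string
from email.header import decode_header, make_header
from email.mime.text import MIMEText

m = MIMEText('abc')
m['Subject'] = 'É test'

# Before the fix this call raised UnicodeEncodeError.
flattened = str(m)

# The flattened message should be pure ASCII; this raises if it is not.
flattened.encode('ascii')

# Round trip: parse the text and decode the encoded word in Subject.
parsed = message_from_string(flattened)
subject = str(make_header(decode_header(parsed['Subject'])))
print(subject)
```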

History
Date	User	Action	Args
2022-04-11 14:57:27	admin	set	github: 58499
2012-03-14 07:04:58	r.david.murray	set	status: open -> closed resolution: fixed messages: + msg155729 stage: needs patch -> resolved
2012-03-14 07:03:46	python-dev	set	nosy: + python-dev messages: + msg155728
2012-03-14 00:51:38	aikinci	set	files: + Issue14291.patch keywords: + patch messages: + msg155700
2012-03-13 19:55:33	r.david.murray	create