Issue 8132: email.header.decode_header makes mistakes

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/52379

classification

Title:	email.header.decode_header makes mistakes
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 2.6

process

Created on 2010-03-13 15:29 by grmtz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg101004 - (view)	Author: grumetz (grmtz)	Date: 2010-03-13 15:29
Examples:

s = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !'
decode_header(s) ---> [('=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !', None)]

which seems bad... but

ss = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?= Arobase !'
decode_header(ss) ---> [('Accus\xc3\xa9 de r\xc3\xa9ception (affich\xc3\xa9) - ', 'utf-8'), ('Arobase !', None)]

which seems good...
msg101039 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2010-03-14 02:49
Per the RFC, this is the correct behavior. An encoded word must begin and end either at the field boundary or with whitespace. So ...?=Arobase, with no whitespace between the = and Arobase, makes your first example into an invalid encoded word, and thus it is returned as if it were plain ASCII. One could argue that email could be smarter and interpret this string as an encoded word anyway, following the Postel principle (be generous in what you accept), but it currently does not do so, and not doing so is not a bug. email6 will handle such non-RFC compliant examples better, if all goes well.
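
The delimiting rule described above can be illustrated with a minimal sketch. This is not the email package's implementation: `STRICT_ENCODED_WORD` and `strict_decode` are hypothetical names introduced here, the pattern handles only base64 ("B") encoded words, and it simply treats anything not delimited by whitespace or a string boundary as plain text. The real `decode_header` may behave differently depending on the Python version, so the sketch does not call it.

```python
import base64
import re

# Hypothetical strict pattern (an assumption for illustration): a base64
# encoded word that is preceded by start-of-string or whitespace and
# followed by whitespace or end-of-string.
STRICT_ENCODED_WORD = re.compile(
    r'(?<!\S)=\?(?P<charset>[^?\s]+)\?[bB]\?(?P<payload>[^?\s]*)\?=(?!\S)'
)


def strict_decode(header):
    """Return (text, charset) pairs, decoding only properly delimited words."""
    parts = []
    pos = 0
    for match in STRICT_ENCODED_WORD.finditer(header):
        # Text before the encoded word is passed through undecoded.
        plain = header[pos:match.start()].strip()
        if plain:
            parts.append((plain, None))
        charset = match.group('charset')
        text = base64.b64decode(match.group('payload')).decode(charset)
        parts.append((text, charset.lower()))
        pos = match.end()
    # Whatever follows the last encoded word is also plain text.
    rest = header[pos:].strip()
    if rest:
        parts.append((rest, None))
    return parts


# The two headers from the report: without and with whitespace after "?=".
s = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !'
ss = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?= Arobase !'

print(strict_decode(s))   # the whole string comes back undecoded
print(strict_decode(ss))  # the encoded word is decoded, the rest is plain
```

Under this strict reading, the first header yields a single undecoded part because "?=" is immediately followed by "Arobase", while the second yields the decoded text plus the plain remainder. A more lenient parser, along the lines of the Postel principle mentioned above, would drop the trailing `(?!\S)` check.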

History
Date	User	Action	Args
2022-04-11 14:56:58	admin	set	github: 52379
2010-03-14 02:49:15	r.david.murray	set	status: open -> closed priority: normal nosy: + r.david.murray messages: + msg101039 resolution: not a bug stage: resolved
2010-03-13 15:29:46	grmtz	create