
classification
Title: email.header.decode_header makes mistakes
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: grmtz, r.david.murray
Priority: normal
Keywords:

Created on 2010-03-13 15:29 by grmtz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg101004 - Author: grumetz (grmtz) Date: 2010-03-13 15:29
Examples:

from email.header import decode_header

s = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !'
decode_header(s) --->
[('=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !', None)]
which seems bad...
but
ss = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?= Arobase !'
decode_header(ss) --->
[('Accus\xc3\xa9 de r\xc3\xa9ception (affich\xc3\xa9) - ', 'utf-8'), ('Arobase !', None)]
which seems good...
msg101039 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2010-03-14 02:49
Per the RFC (RFC 2047), this is the correct behavior. An encoded word *must* begin and end either at a field boundary or at whitespace. So ...?=Arobase, with no whitespace between the = and Arobase, makes your first example an invalid encoded word, and thus it is returned as if it were plain ASCII.

One could argue that email could be smarter and interpret this string as an encoded word anyway, following the Postel principle (be generous in what you accept), but it currently does not do so, and not doing so is not a bug.
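
To make the difference between the two readings concrete, here is a minimal sketch. It is not the email package's implementation: the patterns STRICT and LENIENT and the helper decode_b_words are hypothetical names introduced only for illustration, and only the B (base64) encoding is handled. The strict pattern requires whitespace or a field boundary on both sides of the encoded word; the lenient one does not.

```python
import base64
import re

# Hypothetical patterns for illustration only; these are NOT the ones
# used by the email package. Only the B (base64) encoding is handled.
_CORE = r'=\?(?P<charset>[^?\s]+)\?[bB]\?(?P<payload>[^?\s]*)\?='

# Strict reading: the encoded word must start at the field boundary or
# after whitespace, and end at the field boundary or before whitespace.
STRICT = re.compile(r'(?:^|(?<=\s))' + _CORE + r'(?=\s|$)')

# Lenient ("Postel") reading: accept an encoded word wherever it appears.
LENIENT = re.compile(_CORE)


def decode_b_words(header, strict=True):
    """Return (charset, text) for each B-encoded word found in header."""
    pattern = STRICT if strict else LENIENT
    return [
        (m.group('charset').lower(),
         base64.b64decode(m.group('payload')).decode(m.group('charset')))
        for m in pattern.finditer(header)
    ]


no_space = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?=Arobase !'
with_space = '=?UTF-8?B?QWNjdXPDqSBkZSByw6ljZXB0aW9uIChhZmZpY2jDqSkgLSA=?= Arobase !'

print(decode_b_words(no_space))                # []
print(decode_b_words(with_space))              # [('utf-8', 'Accusé de réception (affiché) - ')]
print(decode_b_words(no_space, strict=False))  # [('utf-8', 'Accusé de réception (affiché) - ')]
```

Under the strict pattern the first example yields no encoded word at all, which matches the behavior reported above; dropping the trailing-whitespace requirement is the only change needed for the lenient reading in this sketch.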

email6 will handle such non-RFC-compliant examples better, if all goes well.
History
Date                 User            Action  Args
2022-04-11 14:56:58  admin           set     github: 52379
2010-03-14 02:49:15  r.david.murray  set     status: open -> closed; priority: normal; nosy: + r.david.murray; messages: + msg101039; resolution: not a bug; stage: resolved
2010-03-13 15:29:46  grmtz           create