classification
Title: decode_header() fails on multiline headers
Type: behavior
Stage: resolved
Components: email, Library (Lib)
Versions: Python 3.3
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: decode_header does not follow RFC 2047 (issue 1079)
Assigned To:
Nosy List: barry, cschnee, python-dev, r.david.murray
Priority: normal
Keywords:

Created on 2008-04-19 12:41 by cschnee, last changed 2012-06-03 16:28 by r.david.murray. This issue is now closed.

Messages (3)
msg65630 - Author: Christoph Schneeberger (cschnee) Date: 2008-04-19 12:41
email.Header.decode_header() does not handle multiline (folded) header
lines correctly.
Revision 54371 of header.py (1) changed the behaviour: previously,
multiline headers were parsed correctly, but 54371 introduced a new
regex part that renders such headers invalid, so they are no longer
parsed as expected.
Given the following header line (it doesn't matter whether it is parsed
from a mail or read from a string), which IMHO is a valid RFC 2047
header line:

from email.Header import decode_header
decode_header('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <T.Mueller@xxx.com>')

This results in the following with header.py (54371):
[('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <T.Mueller@xxx.com>', None)]

and with header.py (54370):
[('"M\xfcller T"', 'windows-1252'), (' <T.Mueller@xxx.com>', None)]

Actually, both results seem to be parsed incorrectly, but the 54370
result looks saner (IMO the leading space should be removed).
Once the CRLF sequence is removed from the header, it works fine and
everything looks as expected:
>>> decode_header('=?windows-1252?Q?=22M=FCller_T=22?= <T.Mueller@xxx.com>')
[('"M\xfcller T"', 'windows-1252'), ('<T.Mueller@xxx.com>', None)]
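
For anyone hitting this on an affected version, the observation above (removing the CRLF makes decoding work) suggests unfolding the header before decoding it. The following is only a minimal sketch of that workaround, not the fix that went into the library. It assumes Python 3's email.header module (the report uses the Python 2 name email.Header), and the helper names unfold() and decode_to_text() are made up for illustration.

```python
import re
from email.header import decode_header


def unfold(value):
    """Remove header folding: drop a line break that is followed by
    a space or tab, keeping the whitespace itself (hypothetical helper)."""
    return re.sub(r'\r?\n(?=[ \t])', '', value)


def decode_to_text(value):
    """Unfold a raw header value, then decode its RFC 2047 encoded
    words into a single str (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(unfold(value)):
        # decode_header() may return bytes or str chunks depending on
        # the input, so only bytes chunks are decoded here.
        if isinstance(chunk, bytes):
            # Assumption: chunks without a charset are plain ASCII.
            chunk = chunk.decode(charset or 'ascii', errors='replace')
        parts.append(chunk)
    return ''.join(parts)


# The folded header from this report.
folded = '=?windows-1252?Q?=22M=FCller_T=22?=\r\n <T.Mueller@xxx.com>'
text = decode_to_text(folded)
```

Whether the space between the display name and the address survives depends on how the installed decode_header() treats whitespace after an encoded word, so callers that care should normalise it themselves.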

This problem might or might not be related to 
- issue 1372770 
- issue 1467619

(1) http://svn.python.org/view?rev=54371&view=rev
msg162220 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-06-03 16:27
New changeset 0808cb8c60fd by R David Murray in branch 'default':
#2658: Add test for issue fixed by fix for #1079.
http://hg.python.org/cpython/rev/0808cb8c60fd
msg162222 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2012-06-03 16:28
This is fixed by the fix for issue 1079.  I've added the test to the test suite.
History
Date | User | Action | Args
2012-06-03 16:28:42 | r.david.murray | set | status: open -> closed; versions: - Python 3.1, Python 2.7, Python 3.2; superseder: decode_header does not follow RFC 2047; messages: + msg162222; resolution: duplicate; stage: resolved
2012-06-03 16:27:21 | python-dev | set | nosy: + python-dev; messages: + msg162220
2012-05-16 01:37:52 | r.david.murray | set | assignee: r.david.murray ->; components: + email
2011-03-13 22:41:51 | r.david.murray | set | versions: + Python 3.1, Python 2.7, Python 3.3
2010-08-05 00:06:31 | terry.reedy | set | versions: + Python 3.2, - Python 2.5, Python 2.4
2010-05-05 13:51:58 | barry | set | assignee: barry -> r.david.murray; nosy: + r.david.murray
2008-04-19 15:33:03 | benjamin.peterson | set | assignee: barry; nosy: + barry
2008-04-19 12:41:33 | cschnee | create |