classification
Title: email/header.py doesn't handle Base64 headers that have been insufficiently padded.
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: barry, jasonjwwilliams, r.david.murray, terry.reedy, tony_nelson
Priority: normal Keywords: patch

Created on 2008-06-21 23:58 by jasonjwwilliams, last changed 2010-08-04 00:07 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
header_B_padding.patch tony_nelson, 2009-04-02 22:09 decode_header tolerate B bad padding
Messages (6)
msg68553 - (view) Author: Jason Williams (jasonjwwilliams) Date: 2008-06-21 23:58
email/header.py:decode_header() Line 95: dec = email.base64mime.decode(encoded)

Messages whose Subject or From headers are Base64 encoded and 
insufficiently padded raise a HeaderParseError. The actual padding 
error is being generated in binascii.a2b_base64 and bubbles up as a 
HeaderParseError in header.py. 

decode_header() should detect the padding error (Base64 text length is 
not a multiple of 4) and automatically add padding before handing off 
to a2b_base64. The problem usually occurs with spam.

Example problem header:
Subject: =?iso-8859-1?B?UHJldmVudCBGb3JlY2xvc3VyZSAmIFNhdmUgWW91ciBIb21lIA=?=


Properly Padded:
Subject: =?iso-8859-1?B?UHJldmVudCBGb3JlY2xvc3VyZSAmIFNhdmUgWW91ciBIb21lIA==?=
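
A minimal sketch of the kind of repair being requested, using the Base64 payload from the example header above. The helper name pad_base64 is hypothetical and is not part of email/header.py; the sketch only illustrates topping the text up to a multiple of 4 characters before decoding.

```python
import base64
import binascii


def pad_base64(encoded):
    """Append the '=' characters needed to reach a multiple of 4.

    Hypothetical helper for illustration only. A length of 1 mod 4
    cannot be repaired this way, because a lone trailing character
    does not carry a complete octet.
    """
    missing = -len(encoded) % 4
    return encoded + "=" * missing


# Payload of the problem header: 51 characters, one '=' short.
bad = "UHJldmVudCBGb3JlY2xvc3VyZSAmIFNhdmUgWW91ciBIb21lIA="

try:
    base64.b64decode(bad)
except binascii.Error as exc:
    # The under-padded text is normally rejected here.
    print("rejected without repair:", exc)

repaired = pad_base64(bad)
print(repr(base64.b64decode(repaired).decode("iso-8859-1")))
```

Padding by length alone keeps the repair cheap and leaves text that is corrupt in other ways to fail in the decoder as before.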
msg85271 - (view) Author: Tony Nelson (tony_nelson) Date: 2009-04-02 22:09
Postel's law suggests that, as bad padding can be repaired,
decode_header ought to do so.  The patch does that, adds a test for it,
and alters another test to still properly fail on really bad encoded data.

The test doesn't check a single character encoded string, as such does
not specify a complete octet and I felt that base64 decoders might
reasonably differ on what to do then.
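
To illustrate the single character case: each Base64 character carries 6 bits, so one character alone cannot fill an 8-bit octet. The sketch below is plain arithmetic, and the function names are hypothetical rather than taken from the patch.

```python
def complete_octets(encoded):
    """Number of whole octets carried by the data characters.

    Each Base64 character contributes 6 bits; padding ('=') adds none.
    """
    data_chars = len(encoded.rstrip("="))
    return data_chars * 6 // 8


def padding_can_repair(encoded):
    """True unless the data ends in a lone character (length 1 mod 4)."""
    return len(encoded.rstrip("=")) % 4 != 1


for sample in ("d", "dm", "dmk", "dmlk"):
    print(sample, complete_octets(sample), padding_can_repair(sample))
```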

The issue exists in Python 2.6.1 (where I made it) and trunk.
msg112662 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-03 20:00
In the absence of a specific doc citation saying otherwise, this strikes me as a feature request to be considered for the 3.2 email update.

Patch includes a test.
msg112666 - (view) Author: Jason Williams (jasonjwwilliams) Date: 2010-08-03 20:27
I'd argue that since the recipient has little control over incorrect header padding, the traditional approach with e-mail is to fix up bad encoding...and this is a bug.
msg112667 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-03 20:29
The quote of Postel's Law is in the RFCs, actually.  So I think we can choose to consider this a bug.  There is an effort/benefit tradeoff when deciding to handle dirty data, but this one is simple enough.  Unless someone can think of a reason why the slight change in behavior might break existing code (other than by letting more spam through :(
msg112728 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-04 00:07
Committed to py3k in r83690, 3.1 in r83694, and 2.7 in r83695.

Thanks Jason.
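
For anyone checking the result, a short sketch that feeds the encoded word from the original report to decode_header. It assumes a Python 3 interpreter that includes the fix; on older versions the call would be expected to raise HeaderParseError instead.

```python
from email.header import decode_header

# Encoded word from the original report, one '=' short of valid padding.
header = "=?iso-8859-1?B?UHJldmVudCBGb3JlY2xvc3VyZSAmIFNhdmUgWW91ciBIb21lIA=?="

# decode_header returns a list of (decoded bytes, charset) pairs.
for payload, charset in decode_header(header):
    print(repr(payload.decode(charset)))
```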
History
Date User Action Args
2010-08-04 00:07:19 r.david.murray set status: open -> closed; versions: + Python 3.1, Python 2.7; type: enhancement -> behavior; messages: + msg112728; resolution: fixed; stage: patch review -> resolved
2010-08-03 20:29:02 r.david.murray set assignee: r.david.murray; messages: + msg112667
2010-08-03 20:27:24 jasonjwwilliams set messages: + msg112666
2010-08-03 20:00:49 terry.reedy set versions: + Python 3.2, - Python 2.5; nosy: + terry.reedy, r.david.murray; messages: + msg112662; type: behavior -> enhancement; stage: patch review
2009-04-02 22:12:39 tony_nelson set nosy: + barry
2009-04-02 22:09:19 tony_nelson set files: + header_B_padding.patch; nosy: + tony_nelson; messages: + msg85271; keywords: + patch
2008-06-21 23:58:57 jasonjwwilliams create