Message 28181 - Python tracker


Author	mgoutell
Date	2006-04-10.10:33:54

Content

The decode_header function in email.Header strips spaces
from the non-encoded parts of a header.

See the following source:
# -*- coding: iso-8859-1 -*-
from email.Header import Header, decode_header
h = Header('Essai ', None)
h.append('éè', 'iso-8859-1')
print h
print decode_header(h)

This prints:
Essai =?iso-8859-1?q?=E9=E8?=
[('Essai', None), ('\xe9\xe8', 'iso-8859-1')]

This should print:
Essai =?iso-8859-1?q?=E9=E8?=
[('Essai ', None), ('\xe9\xe8', 'iso-8859-1')]
        ^ This space disappears

This appears in Python 2.3, but the source code of the
function didn't change in 2.4, so the same problem
should still exist there. Bug "[ 1372770 ] email.Header
should preserve original FWS" may be related to this one,
although I'm not sure it is exactly the same.

This patch (not extensively tested though) seems to
solve this problem:

--- /usr/lib/python2.3/email/Header.py  2005-09-05 00:20:03.000000000 +0200
+++ Header.py   2006-04-10 12:27:27.000000000 +0200
@@ -90,7 +90,7 @@
             continue
         parts = ecre.split(line)
         while parts:
-            unenc = parts.pop(0).strip()
+            unenc = parts.pop(0).rstrip()
             if unenc:
                 # Should we continue a long line?
                 if decoded and decoded[-1][1] is None:
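
To see what the trimming call does to the chunk in front of the
encoded word, here is a minimal Python 3 sketch. ECRE is a simplified
stand-in for the module's ecre pattern and unencoded_chunks is a
hypothetical helper; neither is taken from Header.py. In this sketch
rstrip() still drops the trailing space of 'Essai ', while lstrip()
keeps it, so the one-line change is worth checking against the example
in this report before relying on it.

```python
import re

# Simplified stand-in for the module's `ecre` pattern (an assumption for this
# sketch, not the real definition): matches =?charset?encoding?payload?=
ECRE = re.compile(r'=\?([^?]*?)\?([qb])\?(.*?)\?=', re.IGNORECASE)


def unencoded_chunks(line, trim):
    """Return the non-encoded chunks of `line` after applying `trim` to each."""
    parts = ECRE.split(line)
    # With three capture groups, split() yields
    # [text, charset, encoding, payload, text, ...],
    # so every fourth item is a plain-text chunk.
    chunks = [trim(part) for part in parts[::4]]
    # Mirror the `if unenc:` check: empty chunks are skipped.
    return [chunk for chunk in chunks if chunk]


line = 'Essai =?iso-8859-1?q?=E9=E8?='
for trim in (str.strip, str.rstrip, str.lstrip):
    print(trim.__name__, unencoded_chunks(line, trim))
# strip ['Essai']
# rstrip ['Essai']
# lstrip ['Essai ']
```

The sketch only models the split-and-trim step; it does not decode the
payload or merge continued lines the way decode_header does.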

History
Date	User	Action	Args
2007-08-23 14:39:16	admin	link	issue1467619 messages
2007-08-23 14:39:16	admin	create