Author runtux
Recipients barry, ggenellina, ishimoto, jafo, kael, leromarinvit, r.david.murray, runtux, tkikuchi, tlynn, tony_nelson
Date 2012-01-02.16:09:52
Message-id <1325520593.45.0.455559184503.issue1079@psf.upfronthosting.co.za>
Content
Maybe it would be a good start to include the examples at the end of RFC 2047 in the regression tests? These examples at least support the case that a ')' may immediately follow an encoded string:

encoded form                                displayed as
(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)     (ab)

When trying this in Python 2.7:

>>> from email.header import decode_header
>>> decode_header ('(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)')
[('(', None), ('a', 'iso-8859-1'), ('=?ISO-8859-1?Q?b?=)', None)]

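This fails: the second encoded string is not decoded and comes back as raw text together with the closing parenthesis. So I consider this a bug.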
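A minimal sketch of what such a regression table could look like. The pairs are transcribed from the examples in section 8 of RFC 2047 and should be checked against the RFC before use. The names `RFC2047_EXAMPLES` and `failing_examples` are made up for illustration and are not part of the email package's test suite.

```python
# (encoded form, displayed as), transcribed from RFC 2047 section 8.
RFC2047_EXAMPLES = [
    ("(=?ISO-8859-1?Q?a?=)", "(a)"),
    ("(=?ISO-8859-1?Q?a?= b)", "(a b)"),
    ("(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)", "(ab)"),
    ("(=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=)", "(ab)"),
    # Folded header: line break plus indentation between the two words.
    ("(=?ISO-8859-1?Q?a?=\n    =?ISO-8859-1?Q?b?=)", "(ab)"),
    ("(=?ISO-8859-1?Q?a_b?=)", "(a b)"),
    ("(=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=)", "(a b)"),
]


def failing_examples(display):
    """Return (encoded, expected, actual) for each example display() gets wrong.

    display is any callable that maps an encoded header string to the
    text a mail reader would show for it.
    """
    failures = []
    for encoded, expected in RFC2047_EXAMPLES:
        actual = display(encoded)
        if actual != expected:
            failures.append((encoded, expected, actual))
    return failures
```

To run it against the standard library, one option is to pass something like `lambda s: str(make_header(decode_header(s)))` (with `unicode` instead of `str` on Python 2) and inspect the returned list. Which examples fail depends on the Python version.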

Note that although RFC 2047 is vague about how to interpret two encoded strings that follow each other without whitespace, these *are* seen in the wild and *are* interpreted correctly by the mailers I've tested: mutt, Thunderbird, Exchange in various versions, and even Lotus Notes seems to get this right. So I guess Python should be "liberal in what you accept" and parse something like
'(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)'
into
[ ('(', None)
, ('a', 'iso-8859-1')
, ('b', 'iso-8859-1')
, (')', None)
]
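To make the proposed behaviour concrete, here is a rough sketch of a liberal parser. It is not how `decode_header` is implemented. The function `lenient_decode_header`, its helper, and the regular expression are assumptions made up for this illustration. It finds encoded strings anywhere in the input, with or without surrounding whitespace, and drops whitespace that only separates two encoded strings. Unlike Python 2.7's `decode_header`, it returns decoded text rather than byte strings.

```python
import base64
import binascii
import re

# =?charset?encoding?encoded-text?= matched anywhere in the input,
# not only at whitespace boundaries (the "liberal" part of this sketch).
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def _decode_payload(encoding, text):
    """Decode the encoded-text part of one encoded string to bytes."""
    if encoding.lower() == "b":
        return base64.b64decode(text)
    # header=True makes a2b_qp treat '_' as a space, as the Q encoding requires.
    return binascii.a2b_qp(text.encode("ascii"), header=True)


def lenient_decode_header(header):
    """Split header into (text, charset) pairs; charset is None for plain text."""
    parts = []
    pos = 0
    seen_encoded = False
    for match in ENCODED_WORD.finditer(header):
        literal = header[pos:match.start()]
        # Whitespace directly between two encoded strings is only a
        # separator and is dropped; any other literal text is kept.
        if literal and not (seen_encoded and literal.isspace()):
            parts.append((literal, None))
        charset, encoding, text = match.groups()
        decoded = _decode_payload(encoding, text).decode(charset)
        parts.append((decoded, charset.lower()))
        pos = match.end()
        seen_encoded = True
    if header[pos:]:
        parts.append((header[pos:], None))
    return parts
```

With this sketch, both `'(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)'` and the whitespace-separated form from the RFC give the four-element list shown above. Unknown charsets and malformed payloads are not handled and simply raise.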