Message 150461 - Python tracker

Author	runtux
Recipients	barry, ggenellina, ishimoto, jafo, kael, leromarinvit, r.david.murray, runtux, tkikuchi, tlynn, tony_nelson
Date	2012-01-02.16:09:52
Message-id	<1325520593.45.0.455559184503.issue1079@psf.upfronthosting.co.za>
In-reply-to

Content
Maybe it would be a good start to include the examples at the end of RFC 2047 in the regression tests? These examples at least support the case that a ')' may immediately follow an encoded string:

encoded form                                displayed as
(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)     (ab)

When trying this in Python 2.7:

>>> from email.header import decode_header
>>> decode_header('(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)')
[('(', None), ('a', 'iso-8859-1'), ('=?ISO-8859-1?Q?b?=)', None)]

This fails: the second encoded string, which is directly followed by ')', is left undecoded. So I consider this a bug.

Note that although RFC 2047 is vague about how to interpret two encoded strings that follow each other without whitespace, these *are* seen in the wild and *are* interpreted correctly by the mailers I've tested: mutt, Thunderbird, Exchange in various versions, and even Lotus Notes seems to get this right. So I guess Python should be "liberal in what you accept" and parse something like
'(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)'
into
[ ('(', None)
, ('a', 'iso-8859-1')
, ('b', 'iso-8859-1')
, (')', None)
]
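
As a rough illustration of the liberal parsing proposed above, here is a minimal sketch in Python 3. It is not the email package's implementation: liberal_decode_header and ENCODED_WORD are hypothetical names, and returning decoded text with a lowercased charset is an assumption made to mirror the list above.

```python
import base64
import binascii
import re

# Hypothetical pattern for an RFC 2047 encoded word: =?charset?encoding?payload?=
# It deliberately does not require surrounding whitespace.
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=')


def liberal_decode_header(header):
    """Split a header into (text, charset) pairs, accepting encoded words
    that touch each other or are directly followed by other characters."""
    parts = []
    pos = 0
    prev_encoded = False
    for match in ENCODED_WORD.finditer(header):
        literal = header[pos:match.start()]
        # Whitespace that only separates two encoded words is dropped,
        # as in the RFC 2047 examples.
        if literal and not (prev_encoded and literal.isspace()):
            parts.append((literal, None))
        charset, encoding, payload = match.groups()
        if encoding.lower() == 'q':
            # header=True makes '_' decode to a space.
            raw = binascii.a2b_qp(payload.encode('ascii'), header=True)
        else:
            raw = base64.b64decode(payload)
        parts.append((raw.decode(charset), charset.lower()))
        pos = match.end()
        prev_encoded = True
    tail = header[pos:]
    if tail:
        parts.append((tail, None))
    return parts


print(liberal_decode_header('(=?ISO-8859-1?Q?a?==?ISO-8859-1?Q?b?=)'))
```

The same function also handles the RFC 2047 example with a space between the two encoded strings, because the separating whitespace is skipped rather than emitted as a part of its own.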
