Message 85448 - Python tracker

Author	tony_nelson
Recipients	barry, ggenellina, jafo, kael, r.david.murray, tlynn, tony_nelson
Date	2009-04-04.23:17:21
Message-id	<1238887043.59.0.394644173784.issue1079@psf.upfronthosting.co.za>

Content

The email package does not follow the RFCs in anything to do with header
parsing or decoding.  This is a known deficiency.  So no, I am not
thinking of atoms at all -- and neither is email.header.decode_header()! :-(

Until email.header actually parses headers into atoms and then decodes
atoms, it doesn't matter what parsed atoms would look like.  Currently,
email.header.decode_header() just stumbles through raw text, and doesn't
know if it is looking at atoms or not, or usually even what header the
text came from.
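
A small illustration of the point about missing context, offered as a sketch rather than a statement about the package internals: decode_header() is handed only a string, so nothing in the call can tell it which header the text belongs to. The sample value is the one used in the standard library documentation.

```python
import inspect
from email.header import decode_header

# The only parameter is the raw header text; the header's name is never passed in.
params = list(inspect.signature(decode_header).parameters)
print(params)  # ['header']

# The raw text is scanned for encoded-words and returned as (bytes, charset) pairs.
parts = decode_header("=?iso-8859-1?q?p=F6stal?=")
print(parts)  # [(b'p\xf6stal', 'iso-8859-1')]
```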

To interpret the RFC correctly, email.header.decode_header() needs
either a parser and the name of the header it is decoding, or
already-parsed header data.  I think the latter is being considered for a
redesign of the email package for 3.1 or 3.2 (3 months to a year or so
away, and not for 2.x at all).  Until then, it is better to decode every
likely encoded-word than to skip encoded-words that, for example, have a
parenthesis on one side or the other.
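
The "decode every likely encoded-word" approach can be sketched roughly as follows. This is not the email package's code or any proposed patch: the pattern ENCODED_WORD and the helper decode_every_encoded_word() are hypothetical names made up for illustration, and the sketch ignores details such as dropping whitespace between adjacent encoded-words.

```python
import base64
import quopri
import re

# Hypothetical pattern: =?charset?encoding?payload?= anywhere in the text,
# regardless of what characters sit on either side of it.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_every_encoded_word(raw):
    """Decode anything that looks like an encoded-word; leave the rest alone."""

    def _decode(match):
        charset, encoding, payload = match.groups()
        try:
            if encoding.lower() == "b":
                data = base64.b64decode(payload)
            else:
                # header=True makes "_" decode to a space, as Q encoding requires.
                data = quopri.decodestring(payload.encode("ascii"), header=True)
            return data.decode(charset)
        except (LookupError, ValueError):
            # Unknown charset or malformed payload: keep the original text.
            return match.group(0)

    return ENCODED_WORD.sub(_decode, raw)


print(decode_every_encoded_word("(=?iso-8859-1?q?p=F6stal?=)"))  # (pöstal)
```

The parentheses do not stop the encoded-word from being decoded here, which is the permissive behaviour argued for above. The cost is that text which merely looks like an encoded-word gets decoded too.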
