Message 133935 - Python tracker


Author	torsten.becker
Recipients	r.david.murray, sdaoden, torsten.becker
Date	2011-04-17.19:06:29
Message-id	<1303067191.51.0.627601440961.issue11783@psf.upfronthosting.co.za>

Content
Hi, here is my revised patch with email.utils.getaddresses() also decoding IDNs.

I decided to integrate IDN decoding into AddrlistClass.getaddress() instead of AddrlistClass.getaddrlist(), since getaddress() is one level lower: had the decoding lived in getaddrlist(), the conversion would not happen if somebody ever called getaddress() directly.

I also fixed a glitch in the docs: "versionchanged" seems to need two colons to end up in the generated HTML.


As a follow-up, wouldn't it be helpful if email.Message did the conversions directly?  That way, when you parse a mail into a Message and access the "To" field, you would get a list of properly decoded tuples.

For example, the following test currently still fails because the quoted header value is decoded by neither email.feedparser.FeedParser nor email.Message:

    def test_email_decodes_idns_and_unicode(self):
        text = '''\
To: =?utf-8?b?SMOkbnMgV8O8cnN0?= <hans@xn--dm-fka.ain>

Hello World!'''
        msg = Parser().parsestr(text)
        self.assertEqual(utils.getaddresses(msg.get_all('To')),
            [('H\xe4ns W\xfcrst', 'hans@d\xf6m.ain')])

Am I using the package wrongly here, or is this actually missing?  email.header.decode_header seems to be able to do this already, but it is not used.  Would it be safe to integrate this into the email.message._sanitize_header helper?
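
To illustrate the gap, here is a minimal sketch of doing both conversions by hand on the parsed result, using only stdlib calls that exist today (email.header.decode_header, email.header.make_header and the built-in 'idna' codec). The helper name decode_address is hypothetical and is not part of the patch or of the email package; it only shows what the test above would like the package to do on its own.

```python
from email.header import decode_header, make_header
from email.parser import Parser
from email.utils import getaddresses


def decode_address(realname, addr):
    """Hypothetical helper: decode an RFC 2047 name and an IDNA domain."""
    # decode_header() splits the value into (bytes, charset) chunks;
    # make_header() joins them back into a Header that str() renders as text.
    name = str(make_header(decode_header(realname)))
    local, at, domain = addr.rpartition('@')
    if at:
        # The built-in 'idna' codec turns 'xn--...' labels back into Unicode.
        domain = domain.encode('ascii').decode('idna')
    return name, local + at + domain


text = '''\
To: =?utf-8?b?SMOkbnMgV8O8cnN0?= <hans@xn--dm-fka.ain>

Hello World!'''

msg = Parser().parsestr(text)
# Decode each (realname, address) pair after the address list is parsed.
decoded = [decode_address(name, addr)
           for name, addr in getaddresses(msg.get_all('To'))]
print(decoded)  # [('Häns Würst', 'hans@döm.ain')]
```

Decoding after getaddresses() rather than before keeps any special characters hidden inside the encoded name (commas, angle brackets) from confusing the address parser.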

History
Date	User	Action	Args
2011-04-17 19:06:32	torsten.becker	set	recipients: + torsten.becker, r.david.murray, sdaoden
2011-04-17 19:06:31	torsten.becker	set	messageid: <1303067191.51.0.627601440961.issue11783@psf.upfronthosting.co.za>
2011-04-17 19:06:30	torsten.becker	link	issue11783 messages
2011-04-17 19:06:30	torsten.becker	create