Author r.david.murray
Recipients barry, bgamari, l0nwlf, maxua, r.david.murray, tony_nelson
Date 2010-06-03.16:31:20
Message-id <1275582683.23.0.384513249134.issue4487@psf.upfronthosting.co.za>
In-reply-to
Content
For various reasons the email module has its own table of character sets.  The most effective approach might be for the email module to look a character set name up in the codecs module to find its canonical name, and then look that name up in its own table (i.e., remove the aliases table from email completely and depend on codecs to resolve the canonical name).  Unfortunately the codecs module does not recognize all of the aliases used by email, nor is there any guarantee that the two modules will agree on the proper canonical name.
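
To make the canonical-name mismatch concrete, here is a small sketch (not part of the patch) that compares what codecs reports for a name with what email's ALIASES table maps it to.  The helper name codec_canonical is made up for illustration; codecs.lookup() and email.charset.ALIASES are the real stdlib objects.

```python
import codecs
from email.charset import ALIASES


def codec_canonical(name):
    """Return the codecs module's canonical name for `name`, or None if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# Print both views side by side for a few names.
# For 'ascii', codecs answers 'ascii' while email's alias table says 'us-ascii',
# so the two modules do not agree on the canonical spelling.
for alias in ("ascii", "latin-1", "utf8"):
    print(alias, codec_canonical(alias), ALIASES.get(alias))
```

This is why simply replacing email's alias table with a codecs lookup would change the names email emits.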

The attached patch instead uses the codecs module as a fallback if the charset name does not appear in the email package's ALIASES or CHARSETS tables.  It therefore makes both utf8 and utf_8 work, as well as all the other variants the codecs module accepts.  The unit test only checks 'utf8', since if that one works all the others should too.
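
The fallback order described above can be sketched roughly as follows.  This is an illustration of the idea, not the code in the attached patch: the function name resolve_charset is hypothetical, and the two tables are tiny stand-ins for email's real ALIASES and CHARSETS.

```python
import codecs

# Illustrative stand-ins for email's tables (assumed subset, not the real contents).
ALIASES = {"latin-1": "iso-8859-1", "ascii": "us-ascii"}
CHARSETS = {"iso-8859-1", "us-ascii", "utf-8"}


def resolve_charset(name):
    """Resolve a charset name, preferring email's own tables over codecs."""
    name = name.lower()
    # 1. email's alias table wins, so its canonical spellings are preserved.
    if name in ALIASES:
        return ALIASES[name]
    # 2. Names email already knows are used as-is.
    if name in CHARSETS:
        return name
    # 3. Otherwise fall back to whatever the codecs module calls it.
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Unknown to both: leave the name untouched.
        return name


print(resolve_charset("utf8"))
print(resolve_charset("utf_8"))
print(resolve_charset("ascii"))
```

Checking email's tables first keeps existing behaviour for names such as 'ascii', while names known only to codecs, such as 'utf8', get resolved instead of being passed through unrecognized.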

I'm tentatively reclassifying this as a bug rather than a feature request, since I think it is a reasonable expectation that email would support at least the same set of encoding names that the rest of Python does.
History
Date                 User            Action  Args
2010-06-03 16:31:23  r.david.murray  set     recipients: + r.david.murray, barry, tony_nelson, maxua, bgamari, l0nwlf
2010-06-03 16:31:23  r.david.murray  set     messageid: <1275582683.23.0.384513249134.issue4487@psf.upfronthosting.co.za>
2010-06-03 16:31:21  r.david.murray  link    issue4487 messages
2010-06-03 16:31:20  r.david.murray  create