classification
Title: Unicode email address helper
Type: feature request
Stage: patch review
Components: Library (Lib), Unicode
Versions: Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: loewis, zenzen (2)
Priority: normal
Keywords: patch

Created on 2004-05-31 23:36 by zenzen, last changed 2009-02-14 13:56 by ajaksu2.

Files
File name Uploaded Description
EmailAddress.py zenzen, 2004-09-07 05:34
test_EmailAddress.py zenzen, 2004-09-07 05:38
Messages (4)
msg46105 - Author: Stuart Bishop (zenzen) Date: 2004-05-31 23:36
Converting email addresses between Unicode and ASCII is non-trivial,
as three different encodings are used (RFC 2047, IDNA and ASCII).
Here is an EmailAddress class I use, along with a test suite, which I
feel should be integrated into the email package. I'm quite happy to
implement a different interface if the 'unicode subclass' design is
unsuitable, although I got no feedback from the Email-SIG, so they
are either happy with it or asleep ;)

msg46106 - Author: Martin v. Löwis (loewis) Date: 2004-08-25 13:18
I think it is inappropriate to create a new API for this.
Instead, one of the many functions that already deal with
address parsing needs to be enhanced. For example,
email.Util.formataddr should learn to format unicode
strings, too. Likewise, parseaddr could grow a parameter
do_unicode, or a second function parseaddr_unicode could be
added. There is IMO no need for a new class.
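
For illustration only, the function-based approach described above might look like the following minimal sketch. It is written for Python 3, which the thread predates, and it is not the patch under review. The names formataddr_unicode and _domain_to_ascii are hypothetical, and parseaddr_unicode is merely the name floated above, not an existing function. The sketch assumes a single address with an ASCII local part, and it relies on the standard library's 'idna' codec for the domain.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def _domain_to_ascii(addr):
    """IDNA-encode the domain of addr; the local part is left untouched."""
    local, sep, domain = addr.rpartition('@')
    if not sep:
        return addr  # no '@' found, nothing to convert
    return local + '@' + domain.encode('idna').decode('ascii')


def formataddr_unicode(pair):
    """Hypothetical helper: build an ASCII-only 'Name <addr>' string."""
    realname, addr = pair
    # email.utils.formataddr RFC 2047-encodes a non-ASCII display name
    # and raises UnicodeEncodeError if the address is still non-ASCII.
    return formataddr((realname, _domain_to_ascii(addr)))


def parseaddr_unicode(value):
    """Hypothetical helper: parse an ASCII header value back to Unicode."""
    realname, addr = parseaddr(value)
    if realname:
        # Undo any RFC 2047 encoded words in the display name.
        realname = str(make_header(decode_header(realname)))
    local, sep, domain = addr.rpartition('@')
    if sep:
        # Undo the IDNA encoding of the domain.
        addr = local + '@' + domain.encode('ascii').decode('idna')
    return realname, addr
```

Under these assumptions, formataddr_unicode(('Rene\u00e9 Acut\u00e9', 'renee@ol\u00e9.de')) yields a pure-ASCII string, and parseaddr_unicode recovers the original pair from it.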

In addition, this patch lacks documentation and test cases.

msg46107 - Author: Martin v. Löwis (loewis) Date: 2004-08-25 13:19
Oops, test cases are there - only documentation is lacking.

msg46108 - Author: Stuart Bishop (zenzen) Date: 2004-09-07 05:30
I think that adding options to the existing APIs simply makes the Unicode
support feel tacked on (as it would be). It is also error-prone: if you
are following best practice and using Unicode everywhere, you have to
remember to explicitly pass the 'do_unicode=True' parameter to this
one particular function.

I think the alternative approach would be to use a codec, similar to how 
Unicode DNS domains are handled 
('foo@example.com'.decode('emailaddress')).
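
As a rough sketch of what such a codec could look like (an assumption for illustration, not code from the attached files), the following registers a codec named 'emailaddress' with Python 3's codecs module. It handles only the bare address, not the display name. The helper names _encode, _decode and _search are hypothetical. Note that in Python 3 the decode call is made on a bytes object rather than on a str as in the Python 2 example above.

```python
import codecs


def _encode(text, errors='strict'):
    """Unicode address -> ASCII bytes (ASCII local part, IDNA domain)."""
    local, sep, domain = text.rpartition('@')
    if not sep:
        raise UnicodeError('not an email address: %r' % (text,))
    data = local.encode('ascii', errors) + b'@' + domain.encode('idna')
    return data, len(text)


def _decode(data, errors='strict'):
    """ASCII bytes -> Unicode address (IDNA-decoded domain)."""
    raw = bytes(data)
    local, sep, domain = raw.rpartition(b'@')
    if not sep:
        raise UnicodeError('not an email address: %r' % (raw,))
    text = local.decode('ascii', errors) + '@' + domain.decode('idna')
    return text, len(raw)


def _search(name):
    # Codec search functions return a CodecInfo for names they know
    # and None for everything else.
    if name == 'emailaddress':
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name='emailaddress')
    return None


codecs.register(_search)
```

With the codec registered, b'foo@example.com'.decode('emailaddress') returns the address as a Unicode string, and 'renee@ol\u00e9.de'.encode('emailaddress') returns its ASCII form.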

I still prefer the OO approach though, as it allows the programmer to 
treat email addresses as a standard Unicode string with a few extra 
features, such as the __cmp__ method I've since added to 
EmailAddress.py and the test suite:

>>> e = EmailAddress(u'renee@ol\u00e9.de', u'Rene\u00e9 Acut\u00e9')
>>> e == str(e)
True
>>> e == unicode(e)
True
>>> e == str(EmailAddress(e.upper()))
True
>>> e == unicode(EmailAddress(e.upper()))
True
History
Date                 User     Action  Args
2009-03-30 22:56:23  ajaksu2  link    issue1685453 dependencies
2009-02-14 13:56:03  ajaksu2  set     stage: patch review
                                      type: feature request
                                      components: + Unicode
                                      versions: + Python 2.7, - Python 2.4
2004-05-31 23:36:08  zenzen   create