classification
Title: rfc822.parseaddr is broken, breaks sendmail call in smtplib
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: barry, jfinkels, r.david.murray, sdossey
Priority: normal Keywords: patch

Created on 2004-10-19 19:51 by sdossey, last changed 2010-10-02 16:28 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
issue1050268.testcase.patch jfinkels, 2010-10-01 05:41 Test case for escaped quotes in email addresses.
parseaddr_quote.diff r.david.murray, 2010-10-02 04:06
Messages (5)
msg22785 - Author: Scott Dossey (sdossey) Date: 2004-10-19 19:51

The following email address is legal according to RFC:

<"\"quoted string\" somebody"@somedomain.com">

I've got a python mail handling back end server that
handles mail coming in from Postfix.  Postfix properly
accepts mail of this type, but when it goes to relay
this through my Python server it fails.

The problem is inside smtplib.py inside "quoteaddr". 
Here's a source code snippet:

def quoteaddr(addr):
    """Quote a subset of the email addresses defined by RFC 821.

    Should be able to handle anything rfc822.parseaddr can handle.
    """
    m = (None, None)
    try:
        m = rfc822.parseaddr(addr)[1]
    except AttributeError:
        pass
    if m == (None, None):  # Indicates parse failure or AttributeError
        # something weird here.. punt -ddm
        return "<%s>" % addr

Basically quoteaddr ends up wrapping whatever parseaddr
returns to it in brackets and sending that out on the
wire for the RCPT TO command.
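
As a rough illustration of the wrapping step described above, here is a minimal sketch against the Python 3 standard library, where rfc822.parseaddr lives on as email.utils.parseaddr. The function name envelope_address is hypothetical and this is not the smtplib source; it only mirrors the shape of the snippet.

```python
from email.utils import parseaddr


def envelope_address(addr):
    """Hypothetical stand-in for smtplib's quoteaddr (not the real source).

    Parse the header-style address, keep only the addr-spec, and wrap
    it in angle brackets as used in the RCPT TO command.
    """
    _display_name, addr_spec = parseaddr(addr)
    if not addr_spec:
        # Parse failure: punt and wrap the raw input, as the snippet does.
        return "<%s>" % addr
    return "<%s>" % addr_spec


print(envelope_address("Some Body <somebody@somedomain.com>"))
```

Whatever parseaddr returns in its second slot goes out on the wire unchanged, so any quoting it drops is lost before the receiving SMTP server sees the address.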

However, if I call rfc822.parseaddr it does bizarre
things to email addresses.

For instance the following calls all yield the same
thing (some not surprisingly):

rfc822.parseaddr('""test" test"@test.com')
rfc822.parseaddr('"\"test\" test"@test.com')
rfc822.parseaddr('"\\"test\\" test"@test.com')
rfc822.parseaddr('"\\\"test\\\" test"@test/com')

the above all yield:
('', '""test" test"@test.com')
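
Part of the agreement between these calls comes from plain Python string-literal escaping, independent of parseaddr. A quick check, with variable names chosen only for illustration:

```python
# The four literals from the report, as Python itself reads them.
a = '""test" test"@test.com'
b = '"\"test\" test"@test.com'      # \" inside '...' is just "
c = '"\\"test\\" test"@test.com'    # \\ is a single backslash
d = '"\\\"test\\\" test"@test/com'  # \\ then \" gives backslash, quote

print(a == b)                    # the first two are the same string
print(c == d.replace("/", "."))  # the last two differ only in the domain
print(a == c)                    # these are genuinely different inputs
print(c)                         # shows the address with Python quoting removed
```

So this group contains only two distinct local parts: one with no backslashes, and one with a backslash before each inner quote. The surprising part of the report is that parseaddr returned the same result for both.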

rfc822.parseaddr('"\\\\"test\\\\" test"@test/com')
yields the VERY surprising result:
('', '"\\"test\\\\" test"@test.com')

I submitted this as a new bug report even though there
are two similar bugs regarding parseaddr because it is
a slightly separate issue.

-Scott Dossey <seveirein /at/ yahoo.com> 
msg117778 - Author: Jeffrey Finkelstein (jfinkels) * Date: 2010-10-01 05:41
I can confirm this bug. Attached is the test case.
msg117818 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-01 17:31
It does appear as though parseaddr is dropping quoting information from the returned parsed address.  Fixing this is likely to create backward compatibility issues, and I'm very curious to know why parseaddr drops the quoting info.

Note that I do not observe the change from test/com to test.com, so I'm assuming that was a typo and ignoring that part (or it represents a bug that is already fixed).

The "weird" example is actually consistent with the rest of parseaddr's behavior, if you understand that behavior as turning quoted pairs inside quoted strings into their literal value, but leaving the quotes around the quoted string(s) in place.

Consider the example:

  parseaddr('"\\\\"test\\\\" test"@test.com')

If we remove the Python quoting from this input string we have:

  "\\"test\\" test"@test.com

Interpreting this according to RFC rules we have a quoted string "\\" containing a quoted pair (\\).  The quoted pair resolves to a single \.  Then we have the unquoted text
 
   test\\

This parseaddr copies literally (I'm not sure if that is strictly RFC compliant, but given that we are supposed to be liberal in what we accept, it is as reasonable a thing to do as any).  Finally we have another quoted string

   " test"

So putting those pieces together according to the rules above, we end up with:

   "\"test\\" test"@test.com

which is the observed output once you remove the Python quoting.  So, parseaddr is working as designed.  The question is, what is the design decision behind resolving the quoted pairs but leaving the quotes?
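
The rule walked through above can be modelled in a few lines. This is only an illustrative sketch of the behavior as described in this message, not the actual parseaddr code, and the function name resolve_pairs_keep_quotes is hypothetical.

```python
def resolve_pairs_keep_quotes(addr):
    """Model of the described behavior (hypothetical helper).

    Inside a quoted string, a quoted pair (backslash + char) becomes the
    bare char; the surrounding double quotes stay. Text outside quoted
    strings is copied literally, backslashes included.
    """
    out = []
    i, n = 0, len(addr)
    while i < n:
        if addr[i] == '"':
            i += 1  # step past the opening quote
            content = []
            while i < n and addr[i] != '"':
                if addr[i] == "\\" and i + 1 < n:
                    i += 1  # drop the backslash, keep the escaped char
                content.append(addr[i])
                i += 1
            i += 1  # step past the closing quote
            out.append('"' + "".join(content) + '"')
        else:
            out.append(addr[i])
            i += 1
    return "".join(out)


# The "weird" example: Python literal for  "\\"test\\" test"@test.com
raw = '"\\\\"test\\\\" test"@test.com'
print(raw)
print(resolve_pairs_keep_quotes(raw))
```

Under this model the two inputs '""test" test"@test.com' and '"\\"test\\" test"@test.com' also map to the same output, which matches the results in the original report.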
msg117860 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-02 04:06
After working my way through the code I no longer think that parseaddr is working as designed.  I think that this is a bug, and that there is a missing call to quote in getaddrspec.  Attached is a revised set of unit tests and a fix.  The full python test suite passes with this fix in place, but note that initially I made a mistake in the patch and running test_email passed...that is, before the attached tests there were no tests of parseaddr in the email test suite.

I don't know if this patch is safe for backport, but I'm inclined that way.  It is hard to see how 3rd party code could be compensating for this bug, since it loses quoting information that doesn't appear to be algorithmically recoverable.
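
To illustrate what a re-quoting step does, here is a small sketch using email.utils.quote and email.utils.parseaddr from the Python 3 standard library. It assumes an interpreter that already includes this fix; the attached patch itself is not reproduced here.

```python
from email.utils import parseaddr, quote

# Content of a quoted string after its quoted pairs have been resolved.
resolved = '"test" test'

# quote() escapes backslashes and double quotes so the text can be
# placed back inside a quoted string.
requoted = '"%s"' % quote(resolved)
print(requoted)

# Two different inputs that the unfixed parser collapsed to one output.
candidates = ['""test" test"@test.com', '"\\"test\\" test"@test.com']

# With the fix in place, each address should come back unchanged.
for addr in candidates:
    print(parseaddr(addr)[1] == addr)
```

Because both candidates used to produce the same parsed address, a caller holding only that result had no way to tell which one was originally supplied.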
msg117884 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-02 16:28
Fix committed to py3k in r85179, 3.1 in r85170, and 2.7 in r85181.  I modified the unit tests, deleting the ones that were redundant because they were just two different python spellings of the same input string, and adding a comment about the third test case's quoting pattern.
History
Date User Action Args
2010-10-02 16:28:26 r.david.murray set status: open -> closed; resolution: fixed; messages: + msg117884; stage: test needed -> resolved
2010-10-02 04:06:34 r.david.murray set files: + parseaddr_quote.diff; messages: + msg117860
2010-10-01 17:31:55 r.david.murray set messages: + msg117818
2010-10-01 05:41:32 jfinkels set files: + issue1050268.testcase.patch; nosy: + jfinkels; messages: + msg117778; keywords: + patch
2010-08-19 17:47:56 BreamoreBoy set versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6
2010-05-05 13:49:02 barry set assignee: barry -> r.david.murray; nosy: + r.david.murray
2009-02-14 18:19:34 ajaksu2 set title: rfc822.parseaddr is broken, breaks sendmail call in smtplib -> rfc822.parseaddr is broken, breaks sendmail call in smtplib; stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.4
2004-10-19 19:51:33 sdossey create