classification
Title: email.utils.getaddresses behavior contradicts RFC2822
Type: behavior
Stage: resolved
Components: email, Library (Lib)
Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Ivan.Egorov, barry, r.david.murray, v+python
Priority: normal
Keywords: patch

Created on 2011-01-28 22:30 by Ivan.Egorov, last changed 2012-05-28 19:38 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
email.utils.getaddresses.patch Ivan.Egorov, 2011-01-28 22:30 Proposed patch
Messages (2)
msg127357 - Author: Ivan Egorov (Ivan.Egorov) Date: 2011-01-28 22:30
email.utils.getaddresses behaves incorrectly in the following folding cases (the outer single quotes are not part of the value):
'"A\r\n (B)" <c@d.org>'
'(A\r\n C) <d@e.org>'

The misbehavior occurs in at least 2.6, 2.7 and branches/py3k.

Both of these strings are RFC 2822 compliant, but the current getaddresses() implementation mishandles a 'quoted-string' or 'comment' that contains CRLF.

The relevant RFC sections are:
http://tools.ietf.org/html/rfc2822#section-3.4
http://tools.ietf.org/html/rfc2822#section-3.2.5

The attachment contains tests and a patch for this case.
msg161800 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2012-05-28 19:38
The pre-3.3 email package does not do any header unfolding. You can make this work by unfolding the header yourself before passing it to getaddresses:

  >>> email.utils.getaddresses([''.join(m['to'].splitlines())])
  [('A (B)', 'c@d.org'), ('', 'd@e.org')]

The new provisional policy that was just added to 3.3 (which will eventually become the standard interface) does do the unfolding before parsing the addresses, so it does not have this issue.  In 3.3 we now have this:

  >>> import email
  >>> from email.policy import SMTP
  >>> m = email.message_from_string("To: \"A\r\n (B)\" <c@d.org>, (A\r\n C) <d@e.org>\r\nSubject: test\r\n\r\nbody", policy=SMTP)
  >>> m['to'].addresses
  (Address(display_name='A (B)', username='c', domain='d.org'), Address(display_name='', username='d', domain='e.org'))
History
Date User Action Args
2013-01-23 13:24:49  r.david.murray  link  issue17017 superseder
2012-05-28 19:38:25  r.david.murray  set  status: open -> closed
    resolution: fixed
    messages: + msg161800
    stage: resolved
2012-05-16 02:01:39  r.david.murray  set  assignee: r.david.murray ->
    nosy: + barry
    components: + email
    versions: - Python 3.1
2011-03-14 03:56:07  r.david.murray  set  nosy: v+python, r.david.murray, Ivan.Egorov
    versions: + Python 3.1, Python 2.7, Python 3.2
2011-02-04 18:51:32  v+python  set  nosy: + v+python
2011-01-29 01:58:28  r.david.murray  set  assignee: r.david.murray
    nosy: + r.david.murray
2011-01-28 22:30:02  Ivan.Egorov  create