classification
Title: IndexError thrown on email.message.Message.get
Type: behavior
Stage: resolved
Components: email
Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, maxking, r.david.murray, uckelman, xiang.zhang
Priority: normal
Keywords: patch

Created on 2017-02-01 13:53 by uckelman, last changed 2019-08-17 03:01 by maxking. This issue is now closed.

Files
File name Uploaded Description
29412.patch uckelman, 2017-02-07 16:04 patch
Pull Requests
URL Status Linked
PR 6907 closed TThurk360, 2018-05-16 14:22
PR 14387 merged maxking, 2019-06-26 02:36
PR 14411 merged miss-islington, 2019-06-26 20:13
PR 14412 merged miss-islington, 2019-06-26 20:13
Messages (13)
msg286631 - Author: Joel Uckelman (uckelman) Date: 2017-02-01 13:53
Test case:

  import email
  import email.policy
 
  txt = '''From: juckelman@strozfriedberg.co.uk
  To: (Recipient list suppressed)
  Date: Thu, 22 Aug 2013 04:13:02 +0000
  Subject: ADSF-1082
 
  Hey!
  '''
 
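  # compat32 policy (what message_from_string uses by default): no error
  msg = email.message_from_string(txt)
  msg.get('to')
  # email.policy.default: the header value is parsed on access
  msg = email.message_from_string(txt, policy=email.policy.default)
  msg.get('to')    # throws IndexError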

The second msg.get() throws an IndexError:

  Traceback (most recent call last):
    File "test.py", line 16, in <module>
      print(msg.get('to'))    # throws IndexError
    File "/usr/lib64/python3.5/email/message.py", line 472, in get
      return self.policy.header_fetch_parse(k, v)
    File "/usr/lib64/python3.5/email/policy.py", line 153, in header_fetch_parse
      return self.header_factory(name, ''.join(value.splitlines()))
    File "/usr/lib64/python3.5/email/headerregistry.py", line 586, in __call__
      return self[name](name, value)
    File "/usr/lib64/python3.5/email/headerregistry.py", line 197, in __new__
      cls.parse(value, kwds)
    File "/usr/lib64/python3.5/email/headerregistry.py", line 337, in parse
      kwds['parse_tree'] = address_list = cls.value_parser(value)
    File "/usr/lib64/python3.5/email/headerregistry.py", line 328, in value_parser
      address_list, value = parser.get_address_list(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 2336, in get_address_list
      token, value = get_address(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 2313, in get_address
      token, value = get_group(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 2269, in get_group
      token, value = get_display_name(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 2095, in get_display_name
      token, value = get_phrase(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 1770, in get_phrase
      token, value = get_word(value)
    File "/usr/lib64/python3.5/email/_header_value_parser.py", line 1745, in get_word
      if value[0]=='"':
  IndexError: string index out of range

The docs say that email.policy.default has raise_on_defect set to False, hence parse errors ought to be reported via EmailMessage.defects, not by throwing an exception.
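
As a rough illustration of the expectation described above, the flag can be read directly off the policy objects, and defects found while parsing the message structure are collected on the message. This is only a sketch using public `email` attributes. It deliberately does not access the `To` header, because that access is what raises on affected versions.

```python
import email
import email.policy

# The default policy records defects; the strict policy raises them instead.
print(email.policy.default.raise_on_defect)  # False
print(email.policy.strict.raise_on_defect)   # True

txt = "To: (Recipient list suppressed)\nSubject: ADSF-1082\n\nHey!\n"
msg = email.message_from_string(txt, policy=email.policy.default)

# Defects found while parsing the message structure are collected here
# rather than raised. The To header is not touched in this sketch.
message_defects = list(msg.defects)
print(message_defects)
```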
msg286633 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-02-01 14:09
Does the patch from issue 27931 fix your problem as well?  I haven't looked closely enough to see if I think it should, I'm just hoping :)
msg286635 - Author: Joel Uckelman (uckelman) Date: 2017-02-01 14:17
No dice. I get the same exception with issue27931_v2.patch. I briefly looked at the other two, and don't expect those will help, either.
msg286845 - Author: Xiang Zhang (xiang.zhang) (Python committer) Date: 2017-02-03 06:45
This does not seem to be related to #27931.

The problem is that the value of the To header consists only of CFWS (comments and folding whitespace). Once the CFWS is consumed, the value is effectively empty, yet under the default policy `get_word` still evaluates `value[0] == '"'`.
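
The failure mode can be sketched in isolation. The two functions below are hypothetical stand-ins and not the CPython source: one mirrors the unguarded `value[0] == '"'` check, the other shows one possible guard, assuming the remainder passed in is the empty string left over from a CFWS-only value.

```python
def first_char_is_quote(value: str) -> bool:
    """Unguarded check, mirroring `value[0] == '"'`."""
    return value[0] == '"'


def first_char_is_quote_guarded(value: str) -> bool:
    """Same check, but an empty remainder is handled explicitly."""
    return bool(value) and value[0] == '"'


remainder = ""  # what is left once a CFWS-only value has been consumed

try:
    first_char_is_quote(remainder)
    raised = False
except IndexError:
    # Indexing an empty string is what produces the reported error.
    raised = True

print(raised)
print(first_char_is_quote_guarded(remainder))
```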
msg286867 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-02-03 13:49
I'm really short on time to even review patches these days, but I'll see if I can pry any loose if someone wants to propose a patch.
msg287232 - Author: Joel Uckelman (uckelman) Date: 2017-02-07 13:10
I'm working on a patch now.
msg287242 - Author: Joel Uckelman (uckelman) Date: 2017-02-07 16:04
Here's a patch, complete with tests.

If the value is all CFWS, then get_cfws(value)[1], which is what's left after the CFWS is extracted, is the empty string, which is why value[0] throws in this case.
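
The observation about get_cfws can be checked directly. Note that `email._header_value_parser` is a private CPython module, so this is only a sketch: its name and behavior are not a stable API and may differ between versions.

```python
# Private CPython module: not a stable API, shown here for illustration only.
from email import _header_value_parser as parser

value = "(Recipient list suppressed)"

# get_cfws returns a (token, remainder) pair; for a CFWS-only value the
# remainder is what a later value[0] lookup would be applied to.
cfws, remainder = parser.get_cfws(value)

print(repr(remainder))
print(len(remainder))
```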
msg345631 - Author: Abhilash Raj (maxking) (Python committer) Date: 2019-06-14 19:02
I can't reproduce this problem with the latest master branch; it was perhaps fixed by some other PR.

This is also a dupe of bpo-31445.

@barry, @david: I think this issue can be closed.
msg345632 - Author: Abhilash Raj (maxking) (Python committer) Date: 2019-06-14 19:04
For the record, this is how I tested using the master branch:

>>> msg = email.message_from_string('  To: (Recipient list suppressed)')
>>> msg['To']
>>> import email.policy
>>> msg = email.message_from_string('  To: (Recipient list suppressed)', policy=email.policy.default)
>>> msg
<email.message.EmailMessage object at 0x7f377512b370>
>>> msg['To']
>>> msg.get('to')
msg345633 - Author: Abhilash Raj (maxking) (Python committer) Date: 2019-06-14 19:07
Never mind, I was wrong. I was able to reproduce it:

>>> msg = email.message_from_string('To: (Recipient list suppressed)', policy=email.policy.default))
  File "<stdin>", line 1
SyntaxError: unmatched ')'
>>> msg = email.message_from_string('To: (Recipient list suppressed)', policy=email.policy.default)
>>> msg.get('to')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/maxking/Documents/cpython/Lib/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/home/maxking/Documents/cpython/Lib/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
  File "/home/maxking/Documents/cpython/Lib/email/headerregistry.py", line 589, in __call__
    return self[name](name, value)
  File "/home/maxking/Documents/cpython/Lib/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/home/maxking/Documents/cpython/Lib/email/headerregistry.py", line 340, in parse
    kwds['parse_tree'] = address_list = cls.value_parser(value)
  File "/home/maxking/Documents/cpython/Lib/email/headerregistry.py", line 331, in value_parser
    address_list, value = parser.get_address_list(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1951, in get_address_list
    token, value = get_address(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1928, in get_address
    token, value = get_group(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1884, in get_group
    token, value = get_display_name(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1710, in get_display_name
    token, value = get_phrase(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1385, in get_phrase
    token, value = get_word(value)
  File "/home/maxking/Documents/cpython/Lib/email/_header_value_parser.py", line 1360, in get_word
    if value[0]=='"':
IndexError: string index out of range
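
For code that has to run on an interpreter without the fix, one possible stopgap is to catch the IndexError around the header access. `safe_get` below is a hypothetical helper invented for this sketch, not part of the `email` package, and the returned value will differ between patched and unpatched versions.

```python
import email
import email.policy


def safe_get(msg, name, default=None):
    """Hypothetical helper: return a header value, or `default` if parsing
    the header raises IndexError (as on interpreters without the fix)."""
    try:
        return msg.get(name, default)
    except IndexError:
        return default


msg = email.message_from_string(
    "To: (Recipient list suppressed)\n\nHey!\n",
    policy=email.policy.default,
)

# Must not propagate IndexError, whichever interpreter runs it.
to_value = safe_get(msg, "to")
print(repr(to_value))
```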
msg346676 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2019-06-26 20:13
New changeset 7213df7bbfd85378c6e42e1ac63144d5974bdcf6 by Barry Warsaw (Abhilash Raj) in branch 'master':
bpo-29412: Fix indexError when parsing a header value ending unexpectedly (GH-14387)
https://github.com/python/cpython/commit/7213df7bbfd85378c6e42e1ac63144d5974bdcf6
msg346689 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2019-06-26 22:05
New changeset b950cdb4beabeb093fa3ccc35f53d51cc0193aba by Barry Warsaw (Miss Islington (bot)) in branch '3.7':
bpo-29412: Fix indexError when parsing a header value ending unexpectedly (GH-14387) (GH-14412)
https://github.com/python/cpython/commit/b950cdb4beabeb093fa3ccc35f53d51cc0193aba
msg346690 - Author: Barry A. Warsaw (barry) (Python committer) Date: 2019-06-26 22:05
New changeset 82654a037211a3466a294d53952926fc87f8403d by Barry Warsaw (Miss Islington (bot)) in branch '3.8':
bpo-29412: Fix indexError when parsing a header value ending unexpectedly (GH-14387) (GH-14411)
https://github.com/python/cpython/commit/82654a037211a3466a294d53952926fc87f8403d
History
Date User Action Args
2019-08-17 03:01:56 maxking set status: open -> closed
    stage: patch review -> resolved
    versions: + Python 3.8, Python 3.9, - Python 3.5, Python 3.6
2019-06-26 22:05:39 barry set messages: + msg346690
2019-06-26 22:05:11 barry set messages: + msg346689
2019-06-26 20:13:22 miss-islington set pull_requests: + pull_request14224
2019-06-26 20:13:14 miss-islington set pull_requests: + pull_request14223
2019-06-26 20:13:07 barry set messages: + msg346676
2019-06-26 02:36:44 maxking set pull_requests: + pull_request14201
2019-06-14 19:07:13 maxking set messages: + msg345633
2019-06-14 19:04:12 maxking set messages: + msg345632
2019-06-14 19:02:10 maxking set nosy: + maxking
    messages: + msg345631
2018-05-16 14:29:07 TThurk360 set pull_requests: - pull_request6509
2018-05-16 14:22:53 TThurk360 set pull_requests: + pull_request6575
2018-05-14 19:41:00 TThurk360 set stage: patch review
    pull_requests: + pull_request6509
2017-02-07 16:04:35 uckelman set files: + 29412.patch
    keywords: + patch
    messages: + msg287242
2017-02-07 13:10:26 uckelman set messages: + msg287232
2017-02-03 13:49:25 r.david.murray set messages: + msg286867
    versions: + Python 3.6, Python 3.7
2017-02-03 06:45:08 xiang.zhang set nosy: + xiang.zhang
    messages: + msg286845
2017-02-01 14:44:15 uckelman set title: IndexError thrown on email.message.EmailMessage.get -> IndexError thrown on email.message.Message.get
2017-02-01 14:17:25 uckelman set messages: + msg286635
2017-02-01 14:09:35 r.david.murray set messages: + msg286633
2017-02-01 13:53:52 uckelman set title: IndexError thrown on email.message.M -> IndexError thrown on email.message.EmailMessage.get
2017-02-01 13:53:21 uckelman create