classification
Title: String index out of range in get_group(), email/_header_value_parser.py
Type: behavior Stage: resolved
Components: email Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Cacadril, barry, corona10, miss-islington, r.david.murray, steve.dower
Priority: normal Keywords: easy, patch

Created on 2018-05-13 00:50 by Cacadril, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description
email.eml Cacadril, 2018-05-20 12:12 email with a group of recipients but missing the final ";"
Pull Requests
URL Status Linked
PR 7484 merged corona10, 2018-06-07 15:34
PR 8522 merged miss-islington, 2018-07-28 12:55
PR 8524 merged miss-islington, 2018-07-28 12:56
Messages (9)
msg316440 - Author: Enrique Perez-Terron (Cacadril) Date: 2018-05-13 00:50
When an address group is missing its final ';', 'value' is an empty string by the time get_group() checks for the terminator, so value[0] raises IndexError. I suggest the following patch:

$ diff -u _save_header_value_parser.py _header_value_parser.py
--- _save_header_value_parser.py        2018-03-14 01:07:54.000000000 +0100
+++ _header_value_parser.py     2018-05-13 02:17:13.830053600 +0200
@@ -1876,7 +1876,7 @@
     if not value:
         group.defects.append(errors.InvalidHeaderDefect(
             "end of header in group"))
-    if value[0] != ';':
+    elif value[0] != ';':
         raise errors.HeaderParseError(
             "expected ';' at end of group but found {}".format(value))
     group.append(ValueTerminal(';', 'group-terminator'))
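
The effect of changing `if` to `elif` can be seen in isolation. The following is a minimal sketch, not the CPython code: `check_group_end` is a hypothetical helper, and plain strings and `ValueError` stand in for the parser's defect and error classes.

```python
def check_group_end(value):
    """Hypothetical stand-in for the terminator check in get_group()."""
    defects = []
    if not value:
        # The header ended before the ';': record a defect instead of failing.
        defects.append("end of header in group")
    elif value[0] != ';':
        # Indexing is only reached once value is known to be non-empty.
        raise ValueError(
            "expected ';' at end of group but found {}".format(value))
    return defects


print(check_group_end(''))   # empty remainder: a defect is recorded
print(check_group_end(';'))  # well-formed terminator: no defects
```

With a plain `if` in place of `elif`, the first call would fall through to `value[0]` on an empty string, which is the IndexError reported here.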
msg316538 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2018-05-14 16:53
The fix is most likely correct, so we need a PR for this that includes tests.
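
As a rough idea of what such a test might check, here is a sketch that calls `get_group()` directly on a group with no closing ';'. It assumes an interpreter that already contains the fix, and it imports the private module `email._header_value_parser`, whose interface is not guaranteed to stay stable. The function name `check_unterminated_group` and the sample header value are illustrative only, not taken from the merged PR.

```python
from email import errors
from email import _header_value_parser as parser  # private module, may change


def check_unterminated_group():
    # Group syntax from RFC 2822 section A.1.3, with the final ';' removed.
    source = ("A Group:Chris Jones <c@a.test>,joe@where.test,"
              "John <jdoe@one.test>")
    group, rest = parser.get_group(source)
    # The whole input should be consumed ...
    assert rest == ""
    # ... and the missing terminator reported as a defect, not an exception.
    assert any(isinstance(d, errors.InvalidHeaderDefect)
               for d in group.all_defects)
    return group


group = check_unterminated_group()
print(group.all_defects)
```

On an interpreter without the fix, the call to `get_group()` would be expected to raise IndexError instead.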
msg317165 - Author: Enrique Perez-Terron (Cacadril) Date: 2018-05-20 12:12
Unsure how to issue a "PR" (Problem Report?) with a test case.

Here is my best effort:

Create a file "email.eml" in the current directory, as attached.
(The contents were lifted from RFC2822 section A.1.3, but I deleted the ";" at the end of the "To" header. The file has CRLF line endings.)

Then run the following test program (it appears that I can only attach one file to this issue).

$ cat test-bug.py
from email.policy import default
import email

with open('email.eml', 'rb') as f:
    msg = email.message_from_binary_file(f, policy=default)
    toheader = msg['To']
    for addr in toheader.addresses:
        print(addr)

#----------------------------------------------------
# Output without the fix:

$ python3.6.5 test-bug.py
Traceback (most recent call last):
  File "test-bug.py", line 6, in <module>
    toheader = msg['To']
  File "C:\Program Files\Python36\lib\email\message.py", line 391, in __getitem__
    return self.get(name)
  File "C:\Program Files\Python36\lib\email\message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "C:\Program Files\Python36\lib\email\policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "C:\Program Files\Python36\lib\email\headerregistry.py", line 589, in __call__
    return self[name](name, value)
  File "C:\Program Files\Python36\lib\email\headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "C:\Program Files\Python36\lib\email\headerregistry.py", line 340, in parse
    kwds['parse_tree'] = address_list = cls.value_parser(value)
  File "C:\Program Files\Python36\lib\email\headerregistry.py", line 331, in value_parser
    address_list, value = parser.get_address_list(value)
  File "C:\Program Files\Python36\lib\email\_header_value_parser.py", line 1931, in get_address_list
    token, value = get_address(value)
  File "C:\Program Files\Python36\lib\email\_header_value_parser.py", line 1908, in get_address
    token, value = get_group(value)
  File "C:\Program Files\Python36\lib\email\_header_value_parser.py", line 1879, in get_group
    if value[0] != ';':
IndexError: string index out of range

#-----------------------------------------------------
# Output with the fix:

$ test-bug.py
Chris Jones <c@a.test>
joe@where.test
John <jdoe@one.test>
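
For readers without the attachment, the same check can be sketched without a file on disk. This is an assumption-laden variant of the script above: the message text is a shortened stand-in for email.eml (only the unterminated "To" header matters), and `parse_to_header` is a hypothetical helper name. It assumes an interpreter that includes the fix.

```python
import email
from email.policy import default

# Shortened stand-in for email.eml: the group in "To" has no closing ';'.
RAW = (
    "From: Pete <pete@silly.test>\n"
    "To: A Group:Chris Jones <c@a.test>,joe@where.test,John <jdoe@one.test>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def parse_to_header(raw):
    """Return the addresses and defects found in the 'To' header."""
    msg = email.message_from_string(raw, policy=default)
    to_header = msg['To']  # the header is parsed here, on access
    return to_header.addresses, to_header.defects


addresses, defects = parse_to_header(RAW)
for addr in addresses:
    print(addr)
for defect in defects:
    print(type(defect).__name__, defect)
```

Besides the addresses, the header's `defects` attribute should record the missing terminator, so callers can still tell that the input was malformed.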
msg318049 - Author: Dong-hee Na (corona10) (Python committer) Date: 2018-05-29 15:56
@Cacadril

Please send a pull request (PR) through the CPython GitHub repository (https://github.com/python/cpython).

It will be a great experience :)
msg319007 - Author: Dong-hee Na (corona10) (Python committer) Date: 2018-06-08 02:14
@r.david.murray

Please take a look at PR 7484 :)
msg319368 - Author: Dong-hee Na (corona10) (Python committer) Date: 2018-06-12 08:18
@r.david.murray
Can I get a review for PR 7484?
msg322554 - Author: Steve Dower (steve.dower) (Python committer) Date: 2018-07-28 12:55
New changeset 8fe9eed937cb69b5e26ac6e36a90b5360eb11277 by Steve Dower (Dong-hee Na) in branch 'master':
bpo-33476: Fix _header_value_parser when address group is missing final ';' (GH-7484)
https://github.com/python/cpython/commit/8fe9eed937cb69b5e26ac6e36a90b5360eb11277
msg322567 - Author: miss-islington (miss-islington) Date: 2018-07-28 15:41
New changeset 2be0124b820729eacc1288950b824e336bd3a4a6 by Miss Islington (bot) in branch '3.7':
bpo-33476: Fix _header_value_parser when address group is missing final ';' (GH-7484)
https://github.com/python/cpython/commit/2be0124b820729eacc1288950b824e336bd3a4a6
msg322568 - Author: miss-islington (miss-islington) Date: 2018-07-28 15:59
New changeset f17e001746e0f697e9bd49ac3748f2543b0a0d47 by Miss Islington (bot) in branch '3.6':
bpo-33476: Fix _header_value_parser when address group is missing final ';' (GH-7484)
https://github.com/python/cpython/commit/f17e001746e0f697e9bd49ac3748f2543b0a0d47
History
Date User Action Args
2022-04-11 14:59:00 admin set github: 77657
2018-07-28 16:45:54 steve.dower set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2018-07-28 15:59:21 miss-islington set messages: + msg322568
2018-07-28 15:41:29 miss-islington set nosy: + miss-islington; messages: + msg322567
2018-07-28 12:56:20 miss-islington set pull_requests: + pull_request8042
2018-07-28 12:55:21 miss-islington set pull_requests: + pull_request8040
2018-07-28 12:55:14 steve.dower set nosy: + steve.dower; messages: + msg322554
2018-06-12 08:18:19 corona10 set messages: + msg319368
2018-06-08 02:14:47 corona10 set messages: + msg319007
2018-06-07 15:34:51 corona10 set keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request7109
2018-05-29 15:56:15 corona10 set nosy: + corona10; messages: + msg318049
2018-05-20 12:12:18 Cacadril set files: + email.eml; messages: + msg317165
2018-05-14 16:53:34 r.david.murray set type: crash -> behavior
2018-05-14 16:53:25 r.david.murray set keywords: + easy; stage: test needed; messages: + msg316538; versions: + Python 3.7, Python 3.8
2018-05-13 00:50:02 Cacadril create