classification
Title: Exception in BytesGenerator.flatten
Type:
Stage:
Components: email
Versions: Python 3.6, Python 3.4, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Pedro Lacerda, barry, berker.peksag, frispete, r.david.murray
Priority: normal
Keywords:

Created on 2016-06-07 19:52 by frispete, last changed 2016-06-18 14:29 by Pedro Lacerda.

Files
File name Uploaded Description Edit
flatten-exception.mail frispete, 2016-06-07 19:52
email_flatten.py frispete, 2016-06-16 15:31
flatten-no-exception.mail Pedro Lacerda, 2016-06-18 14:28
flatten-no-exception.mail Pedro Lacerda, 2016-06-18 14:29
Messages (7)
msg267736 - (view) Author: Hans-Peter Jansen (frispete) * Date: 2016-06-07 19:52
The attached mail, parsed with email.message_from_binary_file, results in:

Traceback (most recent call last):
  File "./mail_filter.py", line 616, in <module>
    ret = main.run()
  File "./mail_filter.py", line 605, in run
    self.process(fp)
  File "./mail_filter.py", line 589, in process
    self.save_message(msg, self._fname + '.out')
  File "./mail_filter.py", line 103, in save_message
    ofd.write(msg.as_bytes())
  File "/usr/lib64/python3.4/email/message.py", line 179, in as_bytes
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib64/python3.4/email/generator.py", line 115, in flatten
    self._write(msg)
  File "/usr/lib64/python3.4/email/generator.py", line 195, in _write
    self._write_headers(msg)
  File "/usr/lib64/python3.4/email/generator.py", line 422, in _write_headers
    self._fp.write(self.policy.fold_binary(h, v))
  File "/usr/lib64/python3.4/email/policy.py", line 190, in fold_binary
    folded = self._fold(name, value, refold_binary=self.cte_type=='7bit')
  File "/usr/lib64/python3.4/email/policy.py", line 204, in _fold
    return self.header_factory(name, ''.join(lines)).fold(policy=self)
  File "/usr/lib64/python3.4/email/headerregistry.py", line 255, in fold
    return header.fold(policy=policy)
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 300, in fold
    self._fold(folded)
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 1228, in _fold
    rest._fold(folded)
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 338, in _fold
    if folded.append_if_fits(part, tstr):
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 149, in append_if_fits
    token._fold(self)
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 324, in _fold
    for part in self.parts:
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 254, in parts
    if token.startswith_fws():
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 267, in startswith_fws
    return self[0].startswith_fws()
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 267, in startswith_fws
    return self[0].startswith_fws()
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 267, in startswith_fws
    return self[0].startswith_fws()
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 267, in startswith_fws
    return self[0].startswith_fws()
  File "/usr/lib64/python3.4/email/_header_value_parser.py", line 267, in startswith_fws
    return self[0].startswith_fws()
IndexError: list index out of range

when flattened with BytesGenerator.
msg268645 - (view) Author: Pedro Lacerda (Pedro Lacerda) * Date: 2016-06-16 06:52
I was unable to reproduce this bug using the following snippet:

    import email, sys
    from email.generator import BytesGenerator
    from email.mime.text import MIMEText

    fp = open('flatten-exception.mail', 'rb')
    email.message_from_binary_file(fp)
    bs = BytesGenerator(sys.stdout.buffer)
    bs.flatten(MIMEText('msg', 'plain', 'utf-8'))
msg268658 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-06-16 11:08
Thanks for helping to triage this, Pedro. I think there is a typo in your example: ``email.message_from_binary_file(fp)`` needs to be passed to ``bs.flatten()``.

With the following script I'm also unable to reproduce the issue in Python 3.4+:

    import email
    import email.generator
    import sys

    with open('flatten-exception.mail', 'rb') as f:
        msg = email.message_from_binary_file(f)
        gen = email.generator.BytesGenerator(sys.stdout.buffer)
        gen.flatten(msg)

Hans-Peter, could you share a reproducer with us? Thanks!
msg268673 - (view) Author: Hans-Peter Jansen (frispete) * Date: 2016-06-16 15:31
Sorry guys for not providing this earlier.

It turned out that the suboptimal behaviour is related to an unfortunate policy choice: email.policy.SMTP.
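
For readers without the attachment, here is a minimal sketch of how the policy dependence could be checked. It is not the attached email_flatten.py: the helper name ``try_flatten`` and the ``sample`` message are made up for illustration, and the sample is a well-formed message that is not expected to trigger the exception. To test the real case, read the bytes of flatten-exception.mail into ``raw`` instead.

```python
import email
import email.policy
from email.generator import BytesGenerator
from io import BytesIO


def try_flatten(raw, policy):
    """Parse raw bytes under `policy` and flatten with the same policy.

    Returns (True, flattened_bytes) on success, or (False, exception)
    if flattening raises. Hypothetical helper, not part of the stdlib.
    """
    msg = email.message_from_bytes(raw, policy=policy)
    buf = BytesIO()
    try:
        BytesGenerator(buf, policy=policy).flatten(msg)
    except Exception as exc:  # report any failure instead of crashing
        return False, exc
    return True, buf.getvalue()


# Stand-in message (assumption); replace with the bytes of the attached mail.
sample = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: hello\r\n"
    b"\r\n"
    b"body\r\n"
)

# Compare the default (compat32) policy with email.policy.SMTP.
for name, policy in (("compat32", email.policy.compat32),
                     ("SMTP", email.policy.SMTP)):
    ok, result = try_flatten(sample, policy)
    print(name, "ok" if ok else "failed: %r" % (result,))
```

Passing the policy to both the parser and the generator keeps the two sides consistent, which is what ``msg.as_bytes()`` in the traceback does implicitly by using the message's own policy.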
msg268759 - (view) Author: Pedro Lacerda (Pedro Lacerda) * Date: 2016-06-18 04:18
Seems that ``token.has_fws`` evaluates to True in the following condition

    if token.has_fws:

causing ``token._fold(self)`` to be called where it isn't needed, which raises the exception. Hope it helps!
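
The tail of the traceback (repeated ``return self[0].startswith_fws()`` ending in ``IndexError``) is consistent with an empty token list somewhere in the nesting. Below is a toy model of that failure mode. The classes are simplified stand-ins written for this comment, not the real ones from _header_value_parser.py, and which header actually produces the empty list is not established here.

```python
class Terminal(str):
    """Toy leaf token: a plain string."""

    def startswith_fws(self):
        # True if the first character is a space or tab.
        return self[:1] in (" ", "\t")


class TokenList(list):
    """Toy container token that delegates to its first child."""

    def startswith_fws(self):
        # Same shape as the line shown in the traceback: no guard
        # against an empty list, so self[0] can raise IndexError.
        return self[0].startswith_fws()


# A non-empty nesting resolves down to a Terminal and works.
print(TokenList([TokenList([Terminal(" folded")])]).startswith_fws())

# An empty list anywhere along the first-child chain raises.
try:
    TokenList([TokenList([TokenList()])]).startswith_fws()
except IndexError as exc:
    print("IndexError:", exc)
```

If this is what happens, either the parser should not produce the empty list or the fold code needs to tolerate one.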

By the way, why was _header_value_parser.py removed from the repository?
https://github.com/python/cpython/blob/master/Lib/email/_header_value_parser.py#L144
msg268807 - (view) Author: Pedro Lacerda (Pedro Lacerda) * Date: 2016-06-18 14:28
Now the file is back! If any previous header has a newline before the value, the error does not happen. But even when no exception is raised, the output isn't as expected.
msg268810 - (view) Author: Pedro Lacerda (Pedro Lacerda) * Date: 2016-06-18 14:29
Now the file is back! If any previous header has a newline before the value, the error does not happen. But even when no exception is raised, the output isn't as expected.
History
Date User Action Args
2016-06-18 14:29:48 Pedro Lacerda set files: + flatten-no-exception.mail, messages: + msg268810
2016-06-18 14:29:01 Pedro Lacerda set files: + flatten-no-exception.mail, messages: + msg268807
2016-06-18 04:18:24 Pedro Lacerda set messages: + msg268759
2016-06-16 15:31:46 frispete set files: + email_flatten.py, messages: + msg268673
2016-06-16 11:08:59 berker.peksag set nosy: + berker.peksag, messages: + msg268658
2016-06-16 06:52:58 Pedro Lacerda set nosy: + Pedro Lacerda, messages: + msg268645
2016-06-07 19:52:43 frispete create