classification
Title: message.as_bytes() produces recursion depth exceeded
Type: behavior Stage:
Components: email Versions: Python 3.4, Python 3.5
process
Status: pending Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: barry, iritkatriel, janmalte, labrat, pas, r.david.murray, serhiy.storchaka
Priority: normal Keywords:

Created on 2014-10-21 12:29 by pas, last changed 2021-11-26 16:44 by iritkatriel.

Files
File name Uploaded Description
py34-email.message.as_bytes-recursion-depth-exceeded.txt pas, 2014-10-21 12:29 traceback of RuntimeError
msg.mbox labrat, 2014-11-07 09:17 Example failing message
msg.mbox serhiy.storchaka, 2015-10-19 20:04
Messages (11)
msg229762 - Author: Pas (pas) Date: 2014-10-21 12:29
Please see the attached traceback (or this http://pastebin.com/WYinRGie for fancy colors).

It depends on message size. We're trying to send multipart MIME messages (with a PDF attached that has an image embedded).

After editing flask_mail.py to use the fallback ( message().as_string().encode(self.charset or 'utf-8') ) things work again.
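The fallback described above can be sketched roughly as follows. This is only an illustration, not flask_mail's actual code: the helper name message_to_bytes and its charset parameter are hypothetical. It tries as_bytes() first and falls back to as_string().encode() when the recursion limit is hit (RecursionError is a RuntimeError subclass on newer Pythons).

```python
import email


def message_to_bytes(message, charset="utf-8"):
    """Hypothetical helper: serialize a message, falling back to as_string().

    Assumption: the failure surfaces as RuntimeError (3.4) or its subclass
    RecursionError (3.5+), as in the attached traceback.
    """
    try:
        return message.as_bytes()
    except RuntimeError:
        # Fallback path mentioned in the report.
        return message.as_string().encode(charset)


msg = email.message_from_string("Subject: hi\n\nbody\n")
data = message_to_bytes(msg)
print(type(data).__name__)
```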

If anyone could help confirm whether this is a bug, or help me understand how I'm misusing the library, I'd be grateful. Thanks!
msg229764 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-21 14:10
It looks like a bug, but I'm not sure why as_bytes would trigger it but not as_string.

Can you supply a copy of the message that fails?  The as_string version (assuming the content was all ascii) should be enough to reproduce the issue, since it appears to be happening in the header folding step.
msg230772 - Author: W. Trevor King (labrat) * Date: 2014-11-07 09:17
Here's an example from the notmuch list.  You can trigger the exception in Python 3.4 with:

  >>> import email.policy
  >>> import mailbox
  >>> mbox = mailbox.mbox('msg.mbox', factory=None, create=False)
  >>> message = mbox[0]
  >>> message.as_bytes(policy=email.policy.SMTP)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/wking/src/notmuch/ssoma_mda.py", line 319, in deliver
      message_bytes = message.as_bytes(policy=_email_policy.SMTP)
    File "/usr/lib64/python3.4/email/message.py", line 179, in as_bytes
      g.flatten(self, unixfrom=unixfrom)
    File "/usr/lib64/python3.4/email/generator.py", line 112, in flatten
      self._write(msg)
    File "/usr/lib64/python3.4/email/generator.py", line 192, in _write
      self._write_headers(msg)
    …
    File "/usr/lib64/python3.4/email/_header_value_parser.py", line 195, in <genexpr>
      return ''.join(str(x) for x in self)
  RuntimeError: maximum recursion depth exceeded while getting the str of an object

Interestingly, it serializes fine using the default policy:

  >>> message.as_bytes()
  b'Return-Path: …-----\n'
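A possible explanation for the difference, offered as an assumption rather than a confirmed diagnosis: messages created without an explicit policy use email.policy.compat32, which folds headers through the older email.header code instead of email._header_value_parser. The small sketch below only shows that the two policies take effect differently on the same message (the sample message is made up for illustration).

```python
import email
import email.policy

# A made-up message; parsed without an explicit policy.
msg = email.message_from_string("Subject: hi\n\nbody\n")

# Messages parsed this way carry the compat32 policy.
print(msg.policy is email.policy.compat32)

default_bytes = msg.as_bytes()                        # uses msg.policy (compat32)
smtp_bytes = msg.as_bytes(policy=email.policy.SMTP)   # overrides the policy

# The SMTP policy uses CRLF line separators; compat32 uses LF.
print(b"\r\n" in default_bytes, b"\r\n" in smtp_bytes)
```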
msg230775 - Author: W. Trevor King (labrat) * Date: 2014-11-07 09:57
The troublesome header formatting is:

  >>> import email.policy
  >>> email.policy.SMTP.fold_binary('Cc', 'notmuch\n\t<public-public-notmuch-gxuj+Tv9EO5zyzON3hdc1g-wOFGN7rlS/M9smdsby/KFg@plane.gmane.org>,\n\tpublic-notmuch-gxuj+Tv9EO5zyzON3hdc1g@plane.gmane.org,\n\tRainer M Krug <public-R.M.Krug-Re5JQEeQqe8AvxtiuMwx3w@plane.gmane.org>,\n\tJeremy Nickurak\n\t<public-public-not-much-kexSNQTsIoD754YsiR0rpA-wOFGN7rlS/M9smdsby/KFg@plane.gmane.org>')
  Traceback (most recent call last):
    …
  RuntimeError: maximum recursion depth exceeded while getting the str of an object

Trimming that down a bit, a minimal trigger seems to be:

  >>> email.policy.SMTP.fold_binary('Cc', 'a\n\taaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,\n\ta')
  Traceback…

Removing almost anything from that value gives a working fold.
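To check whether a given Python version is affected, the trigger can be wrapped in a guard like the sketch below. The helper name try_fold is hypothetical, and the run of 77 'a' characters is an approximation of the value in the report, not an exact copy. On versions where the folding code has since been rewritten, this is expected to simply return the folded bytes.

```python
import email.policy


def try_fold(name, value, policy=email.policy.SMTP):
    """Hypothetical helper: return the folded header bytes, or None if the
    recursion limit is hit (RuntimeError on 3.4, RecursionError on 3.5+)."""
    try:
        return policy.fold_binary(name, value)
    except RuntimeError:
        return None


# Approximation of the minimal trigger: short atom, FWS, long unbreakable run.
value = "a\n\t" + "a" * 77 + ",\n\ta"
result = try_fold("Cc", value)
print("recursion limit hit" if result is None else "folded without error")
```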
msg230787 - Author: W. Trevor King (labrat) * Date: 2014-11-07 10:31
In email._header_value_parser._Folded.append_if_fits, if I shift:

            if token.has_fws:
                ws = token.pop_leading_fws()
                if ws is not None:
                    self.stickyspace += str(ws)
                    stickyspace_len += len(ws)
                token._fold(self)
                return True

to:

            if token.has_fws:
                ws = token.pop_leading_fws()
                if ws is not None:
                    self.stickyspace += str(ws)
                    stickyspace_len += len(ws)
                    token._fold(self)
                    return True

I can avoid the recursion.

The problem seems to be that the "a aaaa…aaa" token/part contains folding white space, but doesn't *start* with folding whitespace.  Maybe the folding should try to split on existing FWS, instead of just trying to pop leading FWS?
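The idea of splitting on existing FWS can be illustrated with a toy folder that works on plain strings. This is a sketch of the general technique only, not the email package's algorithm; the function name and the maxlen default are assumptions. Each chunk is consumed exactly once, so an unbreakable overlong chunk ends up on its own (overlong) line instead of causing recursion.

```python
import re


def fold_on_existing_whitespace(text, maxlen=78):
    """Toy folder: break lines only where whitespace already exists.

    Each chunk is optional leading whitespace plus a run of non-whitespace,
    so a continuation line keeps the whitespace it was split on.
    Trailing whitespace at the very end of the text is dropped.
    """
    chunks = re.findall(r"\s*\S+", text)
    lines = []
    current = ""
    for chunk in chunks:
        if current and len(current) + len(chunk) > maxlen:
            lines.append(current)   # close the current line
            current = chunk         # start a new line with this chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


text = "a " + "a" * 77 + ", a"
for line in fold_on_existing_whitespace(text):
    print(len(line), repr(line[:20]))
```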
msg230810 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-11-07 15:09
Something like that.  That folding algorithm is a bit...byzantine.  I need to sit down and completely rewrite it, I think.  But maybe I can fix this problem in the meantime, until I have time to do that.
msg252919 - Author: Jan Malte (janmalte) Date: 2015-10-13 08:46
Is there any news on this bug report?
msg253183 - Author: Jan Malte (janmalte) Date: 2015-10-19 15:31
For the same objects, as_string() works correctly.
msg253189 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-10-19 20:04
Here is a minimized example.

from email._header_value_parser import *
al = AddressList([Address([Mailbox([NameAddr([DisplayName([Atom([ValueTerminal('example', 'atext'), CFWSList([WhiteSpaceTerminal('\t', 'fws')])])]), AngleAddr([ValueTerminal('<', 'angle-addr-start'), AddrSpec([LocalPart([DotAtom([DotAtomText([ValueTerminal('very-very-very-very-very-very-very-very-very-very-very-very-long', 'atext')])])]), ValueTerminal('@', 'address-at-symbol'), Domain([DotAtom([DotAtomText([ValueTerminal('example', 'atext'), ValueTerminal('.', 'dot'), ValueTerminal('org', 'atext')])])])]), ValueTerminal('>', 'angle-addr-end')])])])]), ValueTerminal(',', 'list-separator'), Address([Mailbox([NameAddr([AngleAddr([CFWSList([WhiteSpaceTerminal('\t', 'fws')]), ValueTerminal('<', 'angle-addr-start'), AddrSpec([LocalPart([DotAtom([DotAtomText([ValueTerminal('very-very-very-very-very-very-very-very-very-very-very-very-long', 'atext')])])]), ValueTerminal('@', 'address-at-symbol'), Domain([DotAtom([DotAtomText([ValueTerminal('example', 'atext'), ValueTerminal('.', 'dot'), ValueTerminal('org', 'atext')])])])]), ValueTerminal('>', 'angle-addr-end')])])])])])
import email.policy
al.fold(policy=email.policy.default)
msg253192 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-10-19 21:05
An even more minimized, artificial example:

from email._header_value_parser import *
import email.policy
tl = TokenList([
    TokenList([
        ValueTerminal('x', 'atext'),
        WhiteSpaceTerminal(' ', 'fws'),
        ValueTerminal('x'*76, 'atext'),
    ]),
    ValueTerminal(',', 'list-separator')
])
tl.fold(policy=email.policy.default)

list(tl.parts)[0] == tl and tl.has_fws is True, so TokenList._fold() is called recursively with the same argument.
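The no-progress condition described above can be modelled with a toy class. ToyToken and fold below are hypothetical stand-ins, not the classes from email._header_value_parser: the token contains whitespace (has_fws is true) but does not start with it, so popping leading whitespace changes nothing and the recursive call sees the same argument again. A depth guard is added here only to make the loop visible without exhausting the interpreter's stack.

```python
class ToyToken:
    """Hypothetical stand-in for a parsed header token."""

    def __init__(self, text):
        self.text = text

    @property
    def has_fws(self):
        # True if whitespace appears anywhere, not only at the start.
        return any(ch in " \t" for ch in self.text)

    def pop_leading_fws(self):
        stripped = self.text.lstrip(" \t")
        ws = self.text[: len(self.text) - len(stripped)]
        self.text = stripped
        return ws or None


def fold(token, maxlen=78, _depth=0, _max_depth=20):
    """Toy fold: recurse after popping leading FWS, as the buggy path does."""
    if len(token.text) <= maxlen:
        return [token.text]
    if _depth >= _max_depth:
        raise RuntimeError("fold() made no progress on %r..." % token.text[:10])
    if token.has_fws:
        token.pop_leading_fws()   # no-op when the whitespace is not leading
        return fold(token, maxlen, _depth + 1, _max_depth)
    return [token.text]


try:
    fold(ToyToken("x " + "x" * 76 + ","))
except RuntimeError as exc:
    print(exc)
```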
msg407070 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-26 16:44
I am unable to reproduce this on 3.11.
History
Date User Action Args
2021-11-26 16:44:58 iritkatriel set status: open -> pending
nosy: + iritkatriel
messages: + msg407070

2015-10-19 21:05:22 serhiy.storchaka set messages: + msg253192
2015-10-19 20:04:23 serhiy.storchaka set files: + msg.mbox
nosy: + serhiy.storchaka
messages: + msg253189

2015-10-19 15:31:56 janmalte set messages: + msg253183
2015-10-13 08:46:17 janmalte set nosy: + janmalte
messages: + msg252919
2014-11-07 15:09:16 r.david.murray set messages: + msg230810
versions: + Python 3.5
2014-11-07 10:31:12 labrat set messages: + msg230787
2014-11-07 09:57:36 labrat set messages: + msg230775
2014-11-07 09:17:29 labrat set files: + msg.mbox
nosy: + labrat
messages: + msg230772

2014-10-21 14:10:58 r.david.murray set messages: + msg229764
2014-10-21 12:29:25 pas create