This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: email.generator.Generator memory consumption
Type: resource usage Stage: resolved
Components: email Versions: Python 3.3
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: barry, iritkatriel, r.david.murray, rpatterson, srikanths
Priority: normal Keywords:

Created on 2009-09-18 23:53 by rpatterson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg92853 - Author: Ross Patterson (rpatterson) Date: 2009-09-18 23:53
Due to repeated use of StringIO as a way to "look ahead" into subparts 
while checking that multipart boundaries are unique, memory consumption 
during email.generator.Generator.flatten() can be up to 3 times the 
original message size.
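
For anyone who wants to measure this on their own interpreter, here is a rough sketch using tracemalloc. It is not part of the original report: `build_message` and `peak_flatten_bytes` are hypothetical helper names, the message layout is made up, and the ratio you see will depend on the Python version, so no particular figure is claimed here.

```python
import io
import tracemalloc
from email.generator import Generator
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(lines_per_part, parts=3):
    """Build a multipart message with a few plain-text parts (illustrative layout)."""
    outer = MIMEMultipart()
    for _ in range(parts):
        outer.attach(MIMEText("some payload line\n" * lines_per_part))
    return outer


def peak_flatten_bytes(msg):
    """Return (peak traced bytes during flatten, length of the flattened text)."""
    out = io.StringIO()
    tracemalloc.start()
    try:
        Generator(out).flatten(msg)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak, len(out.getvalue())


if __name__ == "__main__":
    peak, size = peak_flatten_bytes(build_message(100_000))
    print("flattened size:", size, "peak traced bytes:", peak)
```

Comparing the peak against the flattened size gives a rough idea of how much extra buffering flatten() does for a given message.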

I implemented a subclass of email.generator.Generator that works around 
this using email.message.Message.walk() to check message headers and 
string (final) payloads for the boundary without duplicating their 
contents into a StringIO.
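
A minimal sketch of the kind of check described above, assuming a candidate boundary string is tested before it is assigned to the message. This is not the linked implementation; `boundary_in_use` is a hypothetical name, and the substring test is deliberately more conservative than matching whole delimiter lines.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def boundary_in_use(msg, boundary):
    """Return True if '--<boundary>' occurs in any single part's headers
    or in any single part's string payload.

    Each header value and each string payload is inspected on its own,
    so nothing is copied into an intermediate StringIO.
    """
    delimiter = "--" + boundary
    for part in msg.walk():
        # Header values may be Header objects, so convert them to str.
        for _name, value in part.items():
            if delimiter in str(value):
                return True
        payload = part.get_payload()
        # Multipart containers return a list here; their subparts are
        # visited by walk(), so only string payloads are checked.
        if isinstance(payload, str) and delimiter in payload:
            return True
    return False


# Usage: test a candidate boundary against a small message.
outer = MIMEMultipart()
outer.attach(MIMEText("hello\n--XYZ\nworld\n"))
candidate_is_taken = boundary_in_use(outer, "XYZ")
```

A caller would keep generating candidates until the check returns False, then pass the winner to set_boundary(). A false positive only costs one more candidate.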

It assumes that the boundary can only ever be duplicated within a single 
part's headers or within a single part's payload, when that payload is 
a string.  In other words, it assumes that the boundary will not be 
produced by some combination of the headers and string payloads of 
several parts and recursive subparts.

If this assumption is safe, then this implementation should work.  If 
this assumption is not safe, then perhaps a different boundary format 
can be used which will make this assumption safe?

You can find my implementation at
http://gitorious.org/rpatterson-imappipe/rpatterson-imappipe/blobs/master/rpatterson/imappipe/generator.py
msg113044 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-05 20:37
The email module has several problems. RDM is working on overhauling the email module for 3.2. Existing issues may not get individual attention.
msg113071 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-06 03:57
When I first looked at this issue, it appeared to me reading the code that the assumption was not considered safe.  I'm hoping I can implement something akin to your algorithm in email6 (which, unfortunately, won't make 3.2), but I'm going to have to give it a lot of thought to make sure I'm not making any unsafe assumptions.  So while I have this issue on my list, it won't be addressed in 3.2.
msg408383 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-12-12 13:14
Ross, the link to your code no longer works. Do you still have it, and if so could you paste/upload it here?
History
Date                 User            Action  Args
2022-04-11 14:56:53  admin           set     github: 51191
2022-01-10 23:17:08  iritkatriel     set     status: pending -> closed
                                             stage: resolved
2021-12-12 13:14:12  iritkatriel     set     status: open -> pending
                                             nosy: + iritkatriel
                                             messages: + msg408383
2012-05-24 03:18:38  r.david.murray  set     assignee: r.david.murray ->
                                             components: + email, - Library (Lib)
2011-07-14 12:11:18  srikanths       set     nosy: + srikanths
2010-11-12 06:14:43  terry.reedy     set     nosy: - terry.reedy
2010-08-06 03:57:30  r.david.murray  set     messages: + msg113071
                                             versions: + Python 3.3, - Python 2.7
2010-08-05 20:37:29  terry.reedy     set     nosy: + terry.reedy
                                             messages: + msg113044
                                             versions: + Python 2.7, - Python 2.6
2010-05-05 13:39:27  barry           set     assignee: barry -> r.david.murray
                                             nosy: + r.david.murray
2009-09-19 07:48:01  georg.brandl    set     assignee: barry
                                             nosy: + barry
2009-09-18 23:53:51  rpatterson      create