Title: Big speedup in email message parsing
Type: performance
Components: email, Library (Lib)
Versions: Python 3.5
Nosy List: ajaksu2, barry, holdenweb, lpd, r.david.murray, serhiy.storchaka
Created on 2005-07-23 22:07 by lpd, last changed 2022-04-11 14:56 by admin.

Author: L. Peter Deutsch (lpd) Date: 2005-07-23 22:07
Python 2.4.1, Red Hat Linux 7.3.

Speeds up message parsing on files with large
attachments by approximately 4x, mostly by replacing
regular expressions with direct string processing.
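
For readers without the patch at hand, here is a minimal sketch of the general idea, not the patch itself. It splits text into lines (endings kept) once with a regular expression and once with a built-in string method. The names NLCRE_CRACK, split_with_regex and split_direct are illustrative assumptions, and the two functions only agree when the input uses \r, \n or \r\n as its sole line breaks, because str.splitlines also recognises several other separators.

```python
import re

# Hypothetical pattern for illustration: match any of the three common
# line endings and keep them in the result via the capturing group.
NLCRE_CRACK = re.compile(r'(\r\n|\r|\n)')


def split_with_regex(data):
    """Split data into lines, line endings kept, using a regex."""
    parts = NLCRE_CRACK.split(data)
    # parts alternates: text, separator, text, separator, ..., text
    lines = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
    if parts[-1]:
        # Trailing text with no line ending yet.
        lines.append(parts[-1])
    return lines


def split_direct(data):
    """Same result via str.splitlines, with no regex involved.

    Assumes data contains only CR, LF or CRLF line breaks; splitlines
    also splits on other separators such as form feed.
    """
    return data.splitlines(True)
```

Timing both functions on a large base64 body (for example with timeit) is one way to check whether the direct version is actually faster on a given interpreter.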
Author: Steve Holden (holdenweb) Date: 2006-05-25 22:55
A first examination reveals no particular speedup on an
email with approximately 30 MB of attachments. Can the OP
perhaps provide some code and test data I could time to
verify the assertions of speedup? Otherwise I can't see much
point in applying the patch.
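
A timing harness along the following lines could serve for that kind of check. This is only a sketch under assumptions: build_message and time_parse are hypothetical helper names, the attachment is random bytes rather than real test data, and it uses the Python 3 email API rather than the 2.4 one the patch targets.

```python
import email
import os
import time
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(attachment_bytes):
    """Return the text of a multipart message with one binary attachment.

    The attachment is random data, base64-encoded by MIMEApplication.
    """
    msg = MIMEMultipart()
    msg['Subject'] = 'timing test'
    msg.attach(MIMEText('body text\n'))
    msg.attach(MIMEApplication(os.urandom(attachment_bytes)))
    return msg.as_string()


def time_parse(text, repeat=3):
    """Parse text several times; return (parsed message, best time in seconds)."""
    best = float('inf')
    parsed = None
    for _ in range(repeat):
        start = time.perf_counter()
        parsed = email.message_from_string(text)
        best = min(best, time.perf_counter() - start)
    return parsed, best
```

For example, time_parse(build_message(30 * 1024 * 1024)) would time parsing of a message with roughly 30 MB of attachment data, once with and once without the patch applied.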
Author: Barry A. Warsaw (barry) Date: 2006-05-28 01:12
Here's a slightly better version, cleaned up for style and
applicable to Python 2.5 (which is the only place I'd feel
comfortable applying it).  I've verified that this provides
about a 3x speedup, at least for some messages with really
big attachments.
Author: R. David Murray (r.david.murray) Date: 2010-12-27 17:17
Since this is a performance hack that is considerably invasive to the feedparser code (and needs updating), I'm deferring it to 3.3.
Author: Serhiy Storchaka (serhiy.storchaka) Date: 2013-03-14 20:46
A test fails with a stack overflow (recursion limit exceeded):

ERROR: test_pushCR_LF (email.test.test_email.TestIterators)
FeedParser BufferedSubFile.push() assumed it received complete
Traceback (most recent call last):
  File "/home/serhiy/py/cpython2.7/Lib/email/test/", line 2585, in test_pushCR_LF
  File "/home/serhiy/py/cpython2.7/Lib/email/", line 140, in push
    parts = _splitlines(data)
  File "/home/serhiy/py/cpython2.7/Lib/email/", line 170, in _splitlines
  File "/home/serhiy/py/cpython2.7/Lib/email/", line 170, in _splitlines
RuntimeError: maximum recursion depth exceeded
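
One way to avoid recursion here is to split iteratively and hold back a trailing CR until the next chunk shows whether an LF follows. The sketch below illustrates that idea only. LineBuffer is a hypothetical class, not the patched BufferedSubFile, and it assumes the input uses only CR, LF or CRLF as line breaks.

```python
class LineBuffer:
    """Collect complete lines from data pushed in arbitrary chunks."""

    def __init__(self):
        self._partial = ''   # text seen so far that is not yet a full line
        self.lines = []      # complete lines, line endings kept

    def push(self, data):
        data = self._partial + data
        self._partial = ''
        # A trailing CR may be the first half of a CRLF pair that is
        # split across two chunks, so hold it back for the next push.
        held = ''
        if data.endswith('\r'):
            held = '\r'
            data = data[:-1]
        parts = data.splitlines(True)
        # The last piece is incomplete unless it ends with a line break.
        if parts and not parts[-1].endswith(('\n', '\r')):
            self._partial = parts.pop()
        self._partial += held
        self.lines.extend(parts)

    def close(self):
        """Flush any remaining text as a final line."""
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ''
```

Because push() contains no recursive call, feeding the data one character at a time cannot exhaust the stack, whatever the input size.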
