classification
Title: Reduce parsing overhead in email.feedparser.BufferedSubFile
Type: performance
Stage: resolved
Components: email
Versions: Python 3.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, jcea, pitrou, python-dev, r.david.murray
Priority: normal
Keywords: patch

Created on 2012-06-28 21:19 by r.david.murray, last changed 2013-02-14 03:21 by python-dev. This issue is now closed.

Files
File name Uploaded Description
feedparser_performance.patch r.david.murray, 2012-06-28 21:19
Messages (4)
msg164295 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-06-28 21:19
The idea for the attached patch comes from the QNX development team.  In their measurements, replacing the re.split-plus-line-reassembly code in BufferedSubFile with str.splitlines provided a 30% reduction in email parsing time.  The code is also a lot more readable, which is a plus.

The patch is simple enough, and the improvement is large enough, that I'd like to apply this to all active branches.
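
To illustrate the change described above, here is a minimal sketch that contrasts the two approaches. It is not the attached patch and not the actual BufferedSubFile code: the names `_NEWLINE_RE`, `split_with_regex`, and `split_with_splitlines` are hypothetical, and the buffering of partial lines between pushes is left out. It only shows why a regex split needs a reassembly step while `str.splitlines(keepends=True)` does not.

```python
import re

# Hypothetical pattern for illustration: the capturing group makes
# re.split return the line terminators alongside the text.
_NEWLINE_RE = re.compile(r'(\r\n|\r|\n)')


def split_with_regex(data):
    """Split on newlines with re.split, then glue each line to its terminator."""
    parts = _NEWLINE_RE.split(data)
    # parts alternates text, terminator, text, terminator, ..., trailing text.
    trailing = parts.pop()
    lines = [parts[i] + parts[i + 1] for i in range(0, len(parts), 2)]
    if trailing:
        # Text after the last terminator (an unterminated final line).
        lines.append(trailing)
    return lines


def split_with_splitlines(data):
    """Let str.splitlines do the work, keeping the terminators attached."""
    return data.splitlines(keepends=True)


sample = "Subject: hi\r\nTo: a@b\n\nbody\rlast"
assert split_with_regex(sample) == split_with_splitlines(sample)
```

The two functions agree on input that uses only `\r\n`, `\r`, or `\n` as line endings. Note that `str.splitlines` also treats a few other characters (form feed, for example) as line boundaries, so the results can differ on data that contains them.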
msg164306 - Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-06-29 00:11
This is a performance enhancement. Out of the question for 3.2.

Python 3.3 is in beta now, and this would be considered a new feature, but I think it is pretty safe to apply. I am +1 to applying it.

Personally I would apply it to 2.7 too, but the current official position is "bug fixes only". We have rejected performance improvements for 2.7 in the past for this reason.
msg164330 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-29 14:41
Realistically, any performance improvement is 3.4-only now.
msg182071 - Author: Roundup Robot (python-dev) Date: 2013-02-14 03:21
New changeset 0f827775f7b7 by R David Murray in branch 'default':
#15220: simplify and speed up feedparser's line splitting.
http://hg.python.org/cpython/rev/0f827775f7b7
History
Date User Action Args
2013-02-14 03:21:04 python-dev set nosy: + python-dev; messages: + msg182071
2013-02-14 02:17:33 r.david.murray set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2012-06-29 14:41:35 pitrou set nosy: + pitrou; messages: + msg164330; versions: + Python 3.4, - Python 2.7, Python 3.3
2012-06-29 00:11:24 jcea set nosy: + jcea; messages: + msg164306; versions: - Python 3.2
2012-06-28 21:19:23 r.david.murray create