Message 224562 - Python tracker

Author	serhiy.storchaka
Recipients	barry, jader.fabiano, r.david.murray, serhiy.storchaka, tshepang
Date	2014-08-02.13:26:55
Message-id	<1406986015.56.0.523010824829.issue21448@psf.upfronthosting.co.za>
In-reply-to

Content
Parser reads the input file in small chunks (8192 characters) and feeds them to FeedParser, which pushes the data into BufferedSubFile. In BufferedSubFile.push(), chunks of incomplete data are accumulated in a buffer and repeatedly scanned for newlines. Every push() call takes time linear in the size of the accumulated buffer, so the total complexity for a single long line is quadratic in its length.

Here is a patch that fixes the problem with parsing long lines. Feel free to add comments if they are needed (there is an abundance of comments in the module).
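
To illustrate the complexity argument, here is a minimal sketch. It is not the attached patch and not the actual BufferedSubFile code. The class names NaiveLineBuffer and ChunkedLineBuffer and the helper feed() are hypothetical, and only "\n" is treated as a line ending to keep the example short. The first class re-scans the whole accumulated buffer on every push(); the second scans only the newly pushed chunk and joins the pending pieces once, when the line is completed.

```python
class NaiveLineBuffer:
    """Hypothetical illustration: re-scans the whole buffer on every push()."""

    def __init__(self):
        self._partial = ""
        self.lines = []

    def push(self, data):
        # Both the concatenation and the split touch every character
        # accumulated so far, even though the old part has no newline.
        buf = self._partial + data
        *complete, self._partial = buf.split("\n")
        self.lines.extend(line + "\n" for line in complete)

    def close(self):
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ""


class ChunkedLineBuffer:
    """Hypothetical illustration: scans only the newly pushed chunk."""

    def __init__(self):
        self._partial = []  # pieces of the current incomplete line
        self.lines = []

    def push(self, data):
        if "\n" not in data:
            # No line is completed; remember the piece without re-scanning.
            self._partial.append(data)
            return
        *complete, rest = data.split("\n")
        # The first completed line also includes the pending pieces.
        self._partial.append(complete[0])
        self.lines.append("".join(self._partial) + "\n")
        self.lines.extend(line + "\n" for line in complete[1:])
        self._partial = [rest] if rest else []

    def close(self):
        if self._partial:
            self.lines.append("".join(self._partial))
            self._partial = []


def feed(buffer, text, chunk_size=8192):
    """Push text in fixed-size chunks, as a file-reading loop would."""
    for start in range(0, len(text), chunk_size):
        buffer.push(text[start:start + chunk_size])
    buffer.close()
    return buffer.lines
```

Both classes produce the same lines; they differ only in how much work is done per push() while a long line is still incomplete.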

History
Date	User	Action	Args
2014-08-02 13:26:55	serhiy.storchaka	set	recipients: + serhiy.storchaka, barry, r.david.murray, tshepang, jader.fabiano
2014-08-02 13:26:55	serhiy.storchaka	set	messageid: <1406986015.56.0.523010824829.issue21448@psf.upfronthosting.co.za>
2014-08-02 13:26:55	serhiy.storchaka	link	issue21448 messages
2014-08-02 13:26:55	serhiy.storchaka	create