
Author umaxx
Recipients pitrou, umaxx
Date 2009-01-06.21:27:43
Message-id <1231277264.54.0.308595734385.issue1708@psf.upfronthosting.co.za>
Content
> Looking at the patch, the recorded seek points will probably be wrong if
> some newlines were translated (e.g. '\r\n' -> '\n') when reading the file.

ack, this could be a problem.
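
to make the concern concrete: with universal newlines, '\r\n' is translated to '\n' on read, so offsets computed from translated text drift one byte per line. a quick self-contained demonstration (hypothetical throwaway file, not from the patch):

```python
import os
import tempfile

# Write a small file with CRLF line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

# Text mode translates '\r\n' -> '\n', so the line looks 6 characters long...
with open(path, "r") as f:
    text_len = len(f.readline())   # "first\n"

# ...but on disk the line is 7 bytes, so a seek point computed from the
# translated text would land one byte short per preceding line.
with open(path, "rb") as f:
    byte_len = len(f.readline())   # b"first\r\n"

os.remove(path)
```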

> I'm also not sure what the use case for very big files is.

this is easy to answer: i used it, for example, to parse big (and still
growing) log files from mail servers: parse the whole file the first
time, then later start from line xyz+1 (xyz being the last line
recorded after the first parse) *without* parsing the whole file
again. this is especially useful for growing log files >1 GB.

just try to get line number 1234567 from a 2.3 GB log file with the
current linecache implementation :)
the main idea behind the patch is to cache the seek points to save a lot
of time on big files.
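
the seek-point idea can be sketched roughly like this (hypothetical helper names, not the actual patch code; opening in binary mode keeps the cached offsets valid and sidesteps the newline-translation problem above):

```python
def build_seek_points(path):
    """One full pass: return a list where entry i is the byte offset
    at which line i+1 starts (binary mode, so offsets are exact)."""
    offsets = [0]
    with open(path, "rb") as f:
        for line in f:
            offsets.append(offsets[-1] + len(line))
    offsets.pop()  # drop the offset past the last line
    return offsets

def getline(path, lineno, offsets):
    """Fetch line `lineno` (1-based) by seeking straight to its cached
    offset instead of re-reading everything before it."""
    if not 1 <= lineno <= len(offsets):
        return ""
    with open(path, "rb") as f:
        f.seek(offsets[lineno - 1])
        return f.readline().decode("utf-8", "replace")
```

for an append-only log, a later run only has to scan forward from the last cached offset to extend the list, instead of starting from byte 0 again.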

> linecache is primarily used for printing tracebacks, the API 
> isn't really general-purpose.

i know :)