Message 87817 - Python tracker


Author	pitrou
Recipients	benjamin.peterson, facundobatista, georg.brandl, ncoghlan, nessus42, pitrou, r.david.murray, rhettinger
Date	2009-05-15.13:07:54
Message-id	<1242392994.8399.9.camel@localhost>
In-reply-to	<1242388014.59.0.430003269127.issue1152248@psf.upfronthosting.co.za>

Content
> Note that the problem with the read()+split() approach is that you
> either have to read the whole file into memory (which this RFE is trying
> to avoid) or you have to do your own buffering and so forth to split
> records as you go. Since the latter is both difficult to get right and
> very similar to what the IO module already has to do for readlines(), it
> makes sense to include the extra complexity there.
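To make the "do your own buffering" option in the quote concrete, here is a minimal sketch of splitting records on an arbitrary separator without reading the whole file. The name `iter_records` and its `chunk_size` parameter are hypothetical, chosen for illustration; this is not an API of the io module, and it assumes a binary stream whose `read()` returns `b""` at end of file.

```python
import io


def iter_records(stream, sep, chunk_size=8192):
    """Yield records from a binary stream, split on the byte string `sep`.

    Hypothetical helper: reads `chunk_size` bytes at a time instead of
    loading the whole file, which is the manual buffering the quote refers to.
    """
    if not sep:
        raise ValueError("separator must be non-empty")
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(sep)
        # The last piece may be an incomplete record, or may end with the
        # first bytes of a separator that straddles two chunks, so keep it
        # in the buffer until more data arrives.
        buf = parts.pop()
        for record in parts:
            yield record
    # Whatever remains after EOF is a final record with no trailing separator.
    if buf:
        yield buf


if __name__ == "__main__":
    data = io.BytesIO(b"first\0second\0third")
    for record in iter_records(data, b"\0", chunk_size=4):
        print(record)
```

The part that is easy to get wrong is the separator straddling a chunk boundary, which is why the last piece is always held back. Note also that this sketch rescans the buffer on every chunk, so a very long record is handled slowly.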

I wonder how often this use case happens though. Usually you first split
on lines, and only then you split on another character or string (think
CSV files, HTTP headers, etc.).

When you don't split on lines, conversely, you probably have a binary
format, and binary formats have more efficient ways of chunking (for
example, a couple of bytes at the beginning indicating the length of the
chunk).
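
As an illustration of the length-prefixed chunking mentioned above, here is a rough sketch. The layout (a 2-byte big-endian unsigned length followed by the payload) and the name `iter_chunks` are assumptions made for the example, not a reference to any particular format. It also assumes a buffered stream, where `read(n)` returns fewer than `n` bytes only at end of file.

```python
import io
import struct


def iter_chunks(stream):
    """Yield payloads from a stream of length-prefixed chunks.

    Assumed layout (hypothetical): 2-byte big-endian length, then payload.
    """
    while True:
        header = stream.read(2)
        if not header:
            return  # clean end of stream
        if len(header) < 2:
            raise EOFError("truncated length header")
        (length,) = struct.unpack(">H", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated chunk payload")
        yield payload


if __name__ == "__main__":
    raw = struct.pack(">H", 5) + b"hello" + struct.pack(">H", 2) + b"hi"
    for payload in iter_chunks(io.BytesIO(raw)):
        print(payload)
```

No scanning for a separator is needed here: the reader knows exactly how many bytes to request, and the payload may contain any byte values.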

History
Date	User	Action	Args
2009-05-15 13:07:57	pitrou	set	recipients: + pitrou, georg.brandl, rhettinger, facundobatista, ncoghlan, benjamin.peterson, nessus42, r.david.murray
2009-05-15 13:07:55	pitrou	link	issue1152248 messages
2009-05-15 13:07:54	pitrou	create