
Author nessus42
Recipients benjamin.peterson, facundobatista, georg.brandl, ncoghlan, nessus42, pitrou, r.david.murray, rhettinger
Date 2009-05-15.17:46:21
Message-id <1242409585.02.0.949900911428.issue1152248@psf.upfronthosting.co.za>
In-reply-to
Content
Antoine Pitrou <report@bugs.python.org> wrote:

> Nick Coghlan <ncoghlan@gmail.com> added the comment:

> > Note that the problem with the read()+split() approach is that you
> > either have to read the whole file into memory (which this RFE is trying
> > to avoid) or you have to do your own buffering and so forth to split
> > records as you go. Since the latter is both difficult to get right and
> > very similar to what the IO module already has to do for readlines(), it
> > makes sense to include the extra complexity there.

> I wonder how often this use case happens though.

Every day for me.  The reason that I originally brought up this request
some years back on comp.lang.python was that I wanted to be able to use
Python easily like I use the xargs program.

E.g.,

   find -type f -regex 'myFancyRegex' -print0 | stuff-to-do-on-each-file.py

With "-print0" the line separator is changed to null, so that you can
deal with filenames that have newlines in them.

("find" and "xargs" traditionally have used newline to separate files,
but that fails in the face of filenames that have newlines in them, so
the -print0 argument to find and the "-0" argument to xargs were
thankfully eventually added as a fix for this issue.  Nulls are not
allowed in filenames.  At least not on Unix.)
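
As a rough illustration of the "do your own buffering" approach described in the quoted comment, here is a minimal sketch in present-day Python 3. The name iter_records and its parameters are hypothetical, not part of the io module or of any proposed API; the sketch only assumes a binary stream with a read() method.

```python
def iter_records(stream, sep=b"\0", chunk_size=8192):
    """Yield sep-delimited records from a binary stream, chunk by chunk.

    Hypothetical helper for illustration; not an existing io-module API.
    """
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last separator is a complete record;
        # the remainder stays in the buffer until more data arrives.
        *records, buf = buf.split(sep)
        yield from records
    if buf:
        # Final record that was not followed by a separator.
        yield buf
```

A script on the receiving end of "find ... -print0" could then loop over iter_records(sys.stdin.buffer) and handle one filename at a time without reading the whole input into memory, even when a filename contains a newline.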

> When you don't split on lines, conversely, you probably have a binary
> format,

That's not true for the daily use case I just mentioned.

|>ouglas

P.S. I wrote my own version of readlines, of course, as the archives of
comp.lang.python will show.  I just don't feel that everyone should be
required to do the same, when this is the sort of thing that sysadmins
and other Unix-savvy folks are wont to do on a daily basis.

P.P.S. Another use case is that I often end up with files that have
been transferred back and forth between Unix and Windows and
god-knows-what-else, and the newlines end up being some weird mixture of
carriage returns and line feeds (and sometimes some other stray
characters such as "=20" or somesuch) that many programs seem to have a
hard time recognizing as newlines.
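
For the mixed-newline case, one possible workaround is to split on an explicit pattern rather than rely on a single separator. The sketch below is only an illustration under that assumption: split_mixed_lines is a hypothetical name, it treats "\r\n", a lone "\r", and a lone "\n" each as one line break, and it makes no attempt to clean up stray characters such as "=20".

```python
import re

# Try "\r\n" first so a Windows line ending counts as one break, not two.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def split_mixed_lines(text):
    """Split text on any of \\r\\n, \\r, or \\n (hypothetical helper)."""
    return _NEWLINE.split(text)
```

The built-in str.splitlines() covers these three endings as well, but it also breaks on several other characters (form feed and vertical tab, for example), so an explicit pattern keeps the set of separators under the caller's control.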