Message223491
While we're at it, Douglas Alan's solution wouldn't be an ideal solution even if it were a builtin. A fileLineIter obviously doesn't support the stream API. Using one means you end up with two objects that share the same file but have separate buffers and out-of-sync file pointers. It's also a lot slower.
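To make the out-of-sync point concrete, here is a minimal sketch. The file_line_iter below is a hypothetical stand-in that reads in chunks; it is not Douglas Alan's actual code, and the in-memory BytesIO just stands in for a real file.

```python
import io

def file_line_iter(f, separator, bufsize=8192):
    # Hypothetical chunk-reading iterator, for illustration only.
    partial_line = f.read(0)  # '' or b'', matching the file's mode
    while True:
        chunk = f.read(bufsize)
        if not chunk:
            break
        pieces = (partial_line + chunk).split(separator)
        partial_line = pieces.pop()
        yield from pieces
    if partial_line:
        yield partial_line

f = io.BytesIO(b'one\0two\0three\0')
lines = file_line_iter(f, b'\0')

first = next(lines)   # the iterator has handed out only the first record...
position = f.tell()   # ...but the underlying file is already at the end
leftover = f.read()   # so a direct read on f sees nothing

print(first)     # b'one'
print(position)  # 14
print(leftover)  # b''
```

The records 'two' and 'three' are still sitting in the iterator's private buffer, so anything that reads from f directly will silently skip them.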
That being said, I think it may be useful enough to put in the stdlib—even more so if you pull the resplit-an-iterator-of-strings code out:
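def resplit(strings, separator):
    # Holds the trailing piece that may continue in the next string.
    partialLine = None
    for s in strings:
        if partialLine:
            partialLine += s
        else:
            partialLine = s
        # An empty string is treated as end of input.
        if not s:
            break
        lines = partialLine.split(separator)
        # The last piece may be incomplete, so keep it for the next round.
        partialLine = lines.pop()
        yield from lines
    # Emit whatever is left after the final separator, if anything.
    if partialLine:
        yield partialLine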
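As a quick check of how this behaves at chunk boundaries, here is a small sketch on in-memory strings. The chunk contents are made up for illustration, and resplit is repeated only so the snippet runs on its own.

```python
def resplit(strings, separator):
    partialLine = None
    for s in strings:
        if partialLine:
            partialLine += s
        else:
            partialLine = s
        if not s:
            break
        lines = partialLine.split(separator)
        partialLine = lines.pop()
        yield from lines
    if partialLine:
        yield partialLine

# Records that straddle chunk boundaries are stitched back together.
split_records = list(resplit(['alpha\0be', 'ta\0gam', 'ma'], '\0'))
print(split_records)  # ['alpha', 'beta', 'gamma']

# A multi-character separator that is itself cut in half still works,
# because the unfinished tail is carried over to the next chunk.
split_separator = list(resplit(['a\r', '\nb'], '\r\n'))
print(split_separator)  # ['a', 'b']

# A trailing separator does not produce an empty final record.
trailing = list(resplit(['a\0'], '\0'))
print(trailing)  # ['a']
```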
Now, you can do this:
from functools import partial

with open('rdm-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\0')
    lines = (line + '\n' for line in lines)

# Or, if you're just going to strip off the newlines anyway:
with open('file-0-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\0')

# Or, if you have a binary file:
with open('binary-example', 'rb') as f:
    chunks = iter(partial(f.read, 8192), b'')
    lines = resplit(chunks, b'\0')

# Or, if I understand ysj.ray's example:
with open('ysj.ray-example') as f:
    chunks = iter(partial(f.read, 8192), '')
    lines = resplit(chunks, '\r\n')
    records = resplit(lines, '\t')

# Or, if you have something that isn't a file at all:
lines = resplit((packet.body for packet in packets), '\n')
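One thing worth checking before chaining two resplit calls: resplit treats its input as one continuous stream, so the tail of one string is joined to the head of the next. The sketch below uses made-up in-memory data (not ysj.ray's actual file) to show that behaviour next to a plain per-line split, which may or may not be what that example wants. As before, resplit is repeated only so the snippet runs on its own.

```python
def resplit(strings, separator):
    partialLine = None
    for s in strings:
        if partialLine:
            partialLine += s
        else:
            partialLine = s
        if not s:
            break
        lines = partialLine.split(separator)
        partialLine = lines.pop()
        yield from lines
    if partialLine:
        yield partialLine

lines = ['a\tb', 'c\td']  # two already-split lines, two tab-separated fields each

# Chained resplit: the last field of one line runs into the first of the next.
chained = list(resplit(lines, '\t'))
print(chained)   # ['a', 'bc', 'd']

# Splitting each line on its own keeps the line boundaries.
per_line = [line.split('\t') for line in lines]
print(per_line)  # [['a', 'b'], ['c', 'd']]
```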
Date | User | Action | Args
2014-07-20 00:41:35 | abarnert | set | recipients: + abarnert, georg.brandl, rhettinger, facundobatista, amaury.forgeotdarc, ncoghlan, pitrou, benjamin.peterson, nessus42, eric.araujo, ralph.corderoy, r.david.murray, ysj.ray, Douglas.Alan, jcon
2014-07-20 00:41:35 | abarnert | set | messageid: <1405816895.54.0.705421525632.issue1152248@psf.upfronthosting.co.za>
2014-07-20 00:41:35 | abarnert | link | issue1152248 messages
2014-07-20 00:41:33 | abarnert | create |