Message96217
Hope this reply works right; the Python bug interface is a bit confusing
for this newbie, since it doesn't say "Reply" anywhere. Sorry if it goes FUBAR.
I tried the splitlines() version you suggested. It thrashed my machine
so badly that after about a minute I pressed alt+sysrq+f (which invokes
the kernel's oom_kill) so I didn't lose anything important. About half a
minute later the machine came back to life. In other words: the splitlines
version used way, way too much memory, far worse even than making a
cStringIO from the result of a GzipFile instance's .read().
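To make the memory comparison above concrete, here is a rough sketch of the three ways of consuming the lines. It assumes Python 3, where io.BytesIO stands in for cStringIO, and it measures Python-level allocations with tracemalloc rather than system RAM. The helper names (write_sample, peak_bytes, count_lines_*) are made up for illustration and are not taken from c.py or the gzip module.

```python
import gzip
import io
import os
import tempfile
import tracemalloc


def write_sample(n_lines):
    # Hypothetical helper: write a throwaway .gz file and return its path.
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    with gzip.open(path, "wb") as f:
        for i in range(n_lines):
            f.write(b"line %d of some sample text\n" % i)
    return path


def count_lines_splitlines(path):
    # Decompress everything, then build a list holding every line:
    # the whole payload and all the line objects are alive at once.
    with gzip.open(path, "rb") as f:
        return len(f.read().splitlines())


def count_lines_bytesio(path):
    # Decompress everything into one bytes object, then wrap it in an
    # in-memory file (the Python 3 stand-in for cStringIO).
    with gzip.open(path, "rb") as f:
        buf = io.BytesIO(f.read())
    return sum(1 for _ in buf)


def count_lines_streaming(path):
    # Iterate over the GzipFile directly, one line at a time.
    with gzip.open(path, "rb") as f:
        return sum(1 for _ in f)


def peak_bytes(func, path):
    # Peak Python-level allocation while func(path) runs, per tracemalloc.
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Calling peak_bytes() with each counting function on the same file gives a rough picture of how much each approach holds in memory at once; the absolute numbers will differ by Python version and input.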
It's not just a GzipFile.readline() issue either: c.py calls .read() and
tries to turn the result into a cStringIO, and that was the worst of my
three previous tests. I'm going to look at this purely from a consumer
angle and not even look at the gzip module source. From this angle, zcat
outperforms the gzip module by a factor of 10 when the module is used
with .readline(), and by a good deal more when I read the whole gzip file
as a string to turn into a cStringIO, which emulates as closely as
possible what happens when forking a zcat process. When I tried
splitlines() it was even worse. This is probably a RAM issue, but that
just brings us back to the question: should the gzip module eat so much
RAM when shelling out to zcat uses far less?
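For reference, the two consumer-side approaches being compared look roughly like this. This is only a sketch, assuming Python 3 and a zcat binary on PATH that accepts .gz files; the function names are made up for illustration and the original c.py may differ.

```python
import gzip
import shutil
import subprocess


def lines_via_zcat(path):
    # Fork zcat and read its stdout through a pipe. Decompression happens
    # in the child process, so this process never holds the whole file.
    if shutil.which("zcat") is None:
        raise RuntimeError("zcat not found on PATH")
    proc = subprocess.Popen(["zcat", path], stdout=subprocess.PIPE)
    try:
        for line in proc.stdout:
            yield line
    finally:
        proc.stdout.close()
        proc.wait()


def lines_via_gzip(path):
    # Same interface, but decompressing in-process with the gzip module.
    with gzip.open(path, "rb") as f:
        for line in f:
            yield line
```

Both generators yield the same byte lines, so either can be dropped into the same consuming loop and timed against the other.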
Date | User | Action | Args
2009-12-10 22:27:16 | asnakelover | set | recipients: + asnakelover, pitrou, brian.curtin
2009-12-10 22:27:15 | asnakelover | set | messageid: <1260484035.55.0.626629150168.issue7471@psf.upfronthosting.co.za>
2009-12-10 22:27:14 | asnakelover | link | issue7471 messages
2009-12-10 22:27:12 | asnakelover | create |