Message10008
Hi Martin,
Yikes; sorry about that. I've attached the file.
---
I did some more experimentation with xml.sax, and there does
appear to be a serious problem with object destruction, even
with Python 2.2.1c.
I'm working with a fairly large XML file from the TIGR
(The Institute for Genomic Research) FTP site. A sample
file would be something like:
ftp://ftp.tigr.org/pub/data/a_thaliana/ath1/PSEUDOCHROMOSOMES/chr1.xml
(60 MB)
While processing files like this, I noticed that my scripts
were leaking memory. I've isolated the problem to what looks
like a garbage-collection issue: my ContentHandler instances
are never getting recycled. Here's a simplified program:
###
import xml.sax
from cStringIO import StringIO

class FooParser(xml.sax.ContentHandler):
    def __init__(self):
        self.bigcontent = StringIO()
    def startElement(self, name, attrs):
        pass
    def endElement(self, name):
        pass
    def characters(self, chars):
        self.bigcontent.write(chars)

filename = '/home/arabidopsis/bacs/20020107/PSEUDOCHROMOSOME/chr1.xml'

i = 0
while 1:
    print "Iteration %d" % i
    xml.sax.parse(open(filename), FooParser())
    i = i + 1
###
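To check whether it's really the handler objects that pile up, here's a variant of the loop I'd use (untested sketch; it reuses FooParser and filename from the program above) that keeps a weak reference to every handler and counts how many survive a forced collection:
###
import gc
import weakref
import xml.sax

# FooParser and filename as defined in the program above.

refs = []                      # weak references to every handler created
i = 0
while 1:
    handler = FooParser()
    refs.append(weakref.ref(handler))
    xml.sax.parse(open(filename), handler)
    del handler
    gc.collect()               # force a full collection before counting
    alive = len([r for r in refs if r() is not None])
    print "Iteration %d: %d handlers still alive" % (i, alive)
    i = i + 1
###
If the "still alive" count keeps climbing even after gc.collect(), the handlers really are being kept alive somewhere.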
I've watched 'top', and the memory usage keeps growing.
Any suggestions? Thanks!
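One more variation that might be worth trying: driving the reader explicitly via xml.sax.make_parser() instead of going through xml.sax.parse(), dropping the handler reference and forcing a collection after every document (untested sketch, same FooParser and filename as above):
###
import gc
import xml.sax

# FooParser and filename as defined in the program above.
i = 0
while 1:
    print "Iteration %d" % i
    parser = xml.sax.make_parser()
    handler = FooParser()
    parser.setContentHandler(handler)
    f = open(filename)
    parser.parse(f)
    f.close()
    del parser, handler        # drop anything that might sit in a cycle
    gc.collect()
    i = i + 1
###
If memory still grows with that loop, the problem is presumably not a simple reference cycle between the parser and the handler.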