
Author dyoo
Date 2002-03-28.22:37:04

Hi Martin,

Yikes; sorry about that.  I've attached the file.

---


I did some more experimentation with xml.sax, and there does
appear to be a serious problem with object destruction, even
with Python 2.2.1c.

I'm working with a fairly large XML file located on the TIGR
(The Institute for Genomic Research) ftp site.  A sample
file would be something like:

ftp://ftp.tigr.org/pub/data/a_thaliana/ath1/PSEUDOCHROMOSOMES/chr1.xml

(60 MBs)

and I noticed that my scripts were leaking memory.  I've
isolated it to what looks like a garbage collection
problem: my ContentHandler instances never seem to be
reclaimed.  Here's a simplified program:

###
import xml.sax
import glob
from cStringIO import StringIO


class FooParser(xml.sax.ContentHandler):
    def __init__(self):
        self.bigcontent = StringIO()

    def startElement(self, name, attrs):
        pass

    def endElement(self, name):
        pass

    def characters(self, chars):
        self.bigcontent.write(chars)


filename = '/home/arabidopsis/bacs/20020107/PSEUDOCHROMOSOME/chr1.xml'
i = 0
while 1:
    print "Iteration %d" % i
    xml.sax.parse(open(filename), FooParser())
    i = i + 1
###

I've watched 'top', and the memory usage continues growing.
Any suggestions?  Thanks!
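One way to check whether the handler object is actually being reclaimed is a weakref probe: take a weak reference to the handler, parse, drop the strong reference, force a collection, and see if the weak reference has cleared. Below is a minimal sketch of that check (modern Python 3 syntax, with a tiny in-memory document standing in for the large TIGR file; the FooParser name mirrors the script above):

```python
import gc
import io
import weakref
import xml.sax
from io import StringIO


class FooParser(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.bigcontent = StringIO()

    def characters(self, chars):
        self.bigcontent.write(chars)


doc = b"<root><a>some text</a></root>"

handler = FooParser()
ref = weakref.ref(handler)       # stays live only as long as the handler does
xml.sax.parse(io.BytesIO(doc), handler)

del handler                      # drop our strong reference
gc.collect()                     # break any parser <-> handler cycles

print("handler collected:", ref() is None)
```

If `ref()` returns `None` after the collection, the handler was reclaimed and any remaining growth comes from elsewhere; if it still returns the object, something (the parser, a cycle the collector cannot break) is holding it alive.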