
Author scoder
Recipients docs@python, eric.araujo, ezio.melotti, fdrake, flox, loewis, pitrou, scoder
Date 2011-11-29.19:57:26
Content
Hmm, looks like I messed up the last example. I accidentally left in the formatting whitespace, thus growing the file to 6.2 MB. Removing that, I get this for the (now really) 4.5 MB XML file with lots of structure and very little data:

Memory usage: 11600
xml.etree.ElementTree.parse done in 3.374 seconds
Memory usage: 203420 (+191820)
xml.etree.cElementTree.parse done in 0.192 seconds
Memory usage: 36444 (+24844)
lxml.etree.parse done in 0.131 seconds
Memory usage: 62648 (+51048)
minidom tree read in 5.935 seconds
Memory usage: 527684 (+516084)
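
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a timing and memory harness. It is not the script that produced the numbers above. It assumes a small generated document (`make_sample_xml` is a hypothetical helper) instead of the 4.5 MB file, covers only the two stdlib parsers that are still available in current Python versions, and uses `tracemalloc` peak figures, which count Python-level allocations and are not the same thing as the process memory reported above.

```python
import io
import time
import tracemalloc
import xml.dom.minidom
import xml.etree.ElementTree as ET


def make_sample_xml(n_items=2000):
    """Build a hypothetical document with lots of structure and very little data."""
    parts = ["<root>"]
    for i in range(n_items):
        parts.append('<item id="%d"><a><b/></a><c/></item>' % i)
    parts.append("</root>")
    return "".join(parts).encode("utf-8")


def measure(label, parse, data):
    """Parse data once and return (elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    tree = parse(io.BytesIO(data))  # keep a reference so the tree stays alive
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del tree
    print("%s done in %.3f seconds, peak %d KiB" % (label, elapsed, peak // 1024))
    return elapsed, peak


data = make_sample_xml()
results = {
    "xml.etree.ElementTree.parse": measure(
        "xml.etree.ElementTree.parse", ET.parse, data),
    "xml.dom.minidom.parse": measure(
        "xml.dom.minidom.parse", xml.dom.minidom.parse, data),
}
```

The absolute numbers will differ from the ones above, since the input, the Python version and the way memory is measured are all different; only the relative picture is meant to be comparable.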

It's actually surprising how much of a difference trailing whitespace content makes in minidom (from 2 MB on disk to 300 MB in memory???), most likely due to the use of dedicated DOM text nodes in the tree.
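
The text node effect is easy to see on a toy document. The sketch below is only an illustration under assumed inputs (`COMPACT`, `INDENTED` and `count_minidom_nodes` are made-up names, not part of any library): it parses the same structure with and without formatting whitespace and counts what minidom puts into the tree. ElementTree, by comparison, keeps that whitespace as plain `.text`/`.tail` strings on the elements and creates no extra nodes for it.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# The same structure, once without and once with formatting whitespace.
COMPACT = "<root><item><a/><b/></item><item><a/><b/></item></root>"
INDENTED = """<root>
  <item>
    <a/>
    <b/>
  </item>
  <item>
    <a/>
    <b/>
  </item>
</root>"""


def count_minidom_nodes(xml_text):
    """Return (element_count, text_node_count) for a minidom tree."""
    doc = xml.dom.minidom.parseString(xml_text)
    elements = texts = 0
    stack = [doc.documentElement]
    while stack:
        node = stack.pop()
        if node.nodeType == node.ELEMENT_NODE:
            elements += 1
        elif node.nodeType == node.TEXT_NODE:
            texts += 1
        stack.extend(node.childNodes)
    return elements, texts


compact_counts = count_minidom_nodes(COMPACT)
indented_counts = count_minidom_nodes(INDENTED)
print("minidom compact  (elements, text nodes):", compact_counts)
print("minidom indented (elements, text nodes):", indented_counts)

# ElementTree has the same number of elements either way; the whitespace
# only shows up as string attributes on those elements.
et_root = ET.fromstring(INDENTED)
et_elements = sum(1 for _ in et_root.iter())
print("ElementTree indented elements:", et_elements)
print("ElementTree root.text is whitespace only:", et_root.text.strip() == "")
```

Each whitespace-only text node in minidom is a full Python object with its own attributes, so a document that is mostly structure ends up with roughly one extra object per gap between tags.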

PS: I think the "XML/performance" tags on this bug would hint at a separate ticket. This is really meant as a documentation bug.