Author MLModel
Recipients MLModel, georg.brandl
Date 2009-07-13.00:55:50
Message-id <1247446553.79.0.919218325186.issue6472@psf.upfronthosting.co.za>
In-reply-to
Content
I can't quite sort this out, because it's difficult to see what is
intended. The documentation of xml.etree.ElementTree (section 19.11 of
the Library documentation) uses terms like "iterator", "tree iterator",
"iterable", and "list" in vague and perhaps not quite accurate ways. I
can't tell from the documentation which functions and methods return a
list, which return a generator, which return an unspecified kind of
iterable, and so on.
Moreover, the results differ between ElementTree and cElementTree. In
particular, getiterator() returns a list in ElementTree and a generator
in cElementTree. This can make a substantial difference in performance
when iterating over a large number of nodes (on top of cElementTree's
parsing, which appears to be about 10x faster).
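
A minimal sketch of how the return type can be checked at runtime,
offered as an illustration rather than as the module's documented
contract. The sample document and the describe() helper are
hypothetical. getiterator() is absent from newer Python releases, so
the sketch assumes falling back to iter() is acceptable there.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
root = ET.fromstring(
    "<root><item>a</item><group><item>b</item></group></root>"
)


def describe(result):
    """Report whether a result is a list or a one-shot iterator."""
    if isinstance(result, list):
        return "list"
    if iter(result) is result:
        # Generators and other iterators return themselves from iter().
        return "iterator"
    return "other iterable"


# getiterator() is the method discussed above; newer Python releases
# no longer provide it, so fall back to iter() when it is missing.
walk = getattr(root, "getiterator", None) or root.iter
kind = describe(walk("item"))
print("matching elements come back as:", kind)

# Code that only loops over the result works with either return type.
texts = [elem.text for elem in walk("item")]
print(texts)
```

Code that indexes the result or takes its length depends on getting a
list, which is why the distinction matters to callers.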

I think someone should go over the page and sort this out and make it
clear what the user can expect. (I don't think it's fair to
overgeneralize to things like "iterables" if the module is really meant
to be making a commitment to a list or a generator.) I also think that
the differences between what methods return in the Python and C
versions of the module should be highlighted.

I stumbled on this while trying to parse and extract individual bits of
information out of large XML files. I realize full well that there are
better ways to do this (SAX, e.g.) and better ways to search than just
iterating over all the tags of the type I'm interested in, but I should
still know what to expect from ElementTree, especially because it is so
wonderful!
History
Date User Action Args
2009-07-13 00:55:53  MLModel  set  recipients: + MLModel, georg.brandl
2009-07-13 00:55:53  MLModel  set  messageid: <1247446553.79.0.919218325186.issue6472@psf.upfronthosting.co.za>
2009-07-13 00:55:52  MLModel  link  issue6472 messages
2009-07-13 00:55:50  MLModel  create