This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: pulldom.PullDOM doesn't populate DOM tree
Type: behavior
Stage:
Components: XML
Versions: Python 3.2, Python 2.7

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: AchimGaedke, eric.araujo, martin.panter
Priority: normal
Keywords:

Created on 2011-12-08 00:41 by AchimGaedke, last changed 2022-04-11 14:57 by admin.

Files
File name        Uploaded                       Description
pulldom_test.py  AchimGaedke, 2011-12-08 00:41  test program with XML data embedded
pulldom_test.py  AchimGaedke, 2011-12-08 01:06  test for Python 3
Messages (5)
msg149006 - Author: Achim Gaedke (AchimGaedke) Date: 2011-12-08 00:41
Hi!

I tried to use the more general xml.dom interface, no longer defaulting to minidom straight away (or using etree).

The DOM tree built with PullDOM seems to be incomplete.
Here is the output of the attached program.

$ python pulldom_test.py
with pulldom
(<xml.dom.minidom.Document instance at 0xb722962c>, [])
with minidom
(<xml.dom.minidom.Document instance at 0xb72459ac>, [<DOM Text node "u'\n'">, <DOM Element: a at 0xb71cb74c>, <DOM Text node "u'\n'">, <DOM Element: b at 0xb71cb7ac>, <DOM Text node "u'\n'">, <DOM Element: c at 0xb71cb80c>, <DOM Text node "u'\n'">])

I'd expect the same result for both ways of populating the minidom DOM implementation.
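
A minimal sketch of the kind of comparison described above. The attached pulldom_test.py is not reproduced in this thread, so the XML sample and the variable names here are assumptions, not the original test program.

```python
import xml.sax
from xml.dom import Node, minidom, pulldom

# Assumed sample data; the attachment's embedded XML is not shown in the thread.
XML_DATA = b"<root>\n<a/>\n<b/>\n<c/>\n</root>"

# Use PullDOM directly as a SAX content handler.
handler = pulldom.PullDOM()
xml.sax.parseString(XML_DATA, handler)
pull_children = handler.document.documentElement.childNodes

# Let minidom build the whole tree for comparison.
mini_children = minidom.parseString(XML_DATA).documentElement.childNodes
mini_elements = [n.tagName for n in mini_children
                 if n.nodeType == Node.ELEMENT_NODE]

print("with pulldom", list(pull_children))
print("with minidom", mini_elements)
```

With this input, the root element produced through PullDOM has no children, while minidom attaches the a, b and c elements.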

Thanks & Cheers, Achim
msg149011 - Author: Achim Gaedke (AchimGaedke) Date: 2011-12-08 01:06
Sorry, the output given before was generated with Python 2.7.

Changing the line
xml.sax.parseString(xml_data, d_handler)

to

xml.sax.parseString(bytes(xml_data, "utf-8"), d_handler)

makes it work on Python 3.2, with the same result:

$ python3.2 pulldom_test.py
with pulldom
<xml.dom.minidom.Document object at 0x9bf9f2c> []
with minidom
<xml.dom.minidom.Document object at 0x9c2fccc> [<DOM Text node "'\n'">, <DOM Element: a at 0x9c3850c>, <DOM Text node "'\n'">, <DOM Element: b at 0x9c3856c>, <DOM Text node "'\n'">, <DOM Element: c at 0x9c385cc>, <DOM Text node "'\n'">]

Both tests were run on Debian testing/wheezy.

$ python3.2 -V
Python 3.2.2rc1
$ python -V
Python 2.7.2+
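
The bytes conversion mentioned above can also be written with str.encode. This is only a sketch; the helper name parse_with_pulldom and the sample text are made up for illustration.

```python
import xml.sax
from xml.dom import pulldom

# Assumed sample data, not the content of the attached test program.
xml_text = "<root>\n<a/>\n</root>"


def parse_with_pulldom(text):
    """Hypothetical helper: encode the text and feed it to a PullDOM handler."""
    handler = pulldom.PullDOM()
    # Passing bytes avoids depending on str support in xml.sax.parseString.
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.document


doc = parse_with_pulldom(xml_text)
print(doc.documentElement.tagName)
```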
msg149167 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2011-12-10 16:30
So, is there a code or documentation bug in any version?
msg149488 - Author: Achim Gaedke (AchimGaedke) Date: 2011-12-15 02:00
Potentially both: The xml.dom.pulldom documentation is not really there.

Maybe PullDOM builds only a partial tree and does not attach nested nodes, in contrast to SAX2DOM, which seems to fill the DOM tree completely.

I tried to figure out what the programmer intended: evidently SAX2DOM extends the behaviour of PullDOM.
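
As a rough illustration of that difference, SAX2DOM can be used as the SAX handler in place of PullDOM. The XML sample is an assumption, since the attached test data is not reproduced here, and SAX2DOM itself is undocumented, so this is a sketch of observed behaviour rather than a supported recipe.

```python
import xml.sax
from xml.dom import Node, pulldom

# Assumed sample data.
XML_DATA = b"<root>\n<a/>\n<b/>\n<c/>\n</root>"

# SAX2DOM subclasses PullDOM and appends each new node to its parent.
handler = pulldom.SAX2DOM()
xml.sax.parseString(XML_DATA, handler)

children = handler.document.documentElement.childNodes
element_names = [n.tagName for n in children
                 if n.nodeType == Node.ELEMENT_NODE]
print(element_names)
```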

The documentation (online V3.2, retrieved today) states: "PullDOM allows building only selected portions of a Document Object Model representation".

Does that mean one would derive a customized class from PullDOM and implement methods the way SAX2DOM does, but control the construction with conditions?

For example:

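from xml.dom.pulldom import PullDOM

class MyDOM(PullDOM):
    def startElement(self, name, attrs):
        # Let PullDOM create the node and push it onto elementStack.
        PullDOM.startElement(self, name, attrs)
        # Attach only elements whose name starts with "my_" to their parent.
        if name.startswith("my_"):
            curNode = self.elementStack[-1]
            parentNode = self.elementStack[-2]
            parentNode.appendChild(curNode)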
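
The subclassing idea above can be tried out roughly as follows. Whether this is the intended use of PullDOM is not confirmed anywhere in the thread; the class name SelectiveDOM and the sample XML are hypothetical.

```python
import xml.sax
from xml.dom.pulldom import PullDOM


class SelectiveDOM(PullDOM):  # hypothetical name, for illustration only
    def startElement(self, name, attrs):
        # PullDOM creates the node and pushes it onto elementStack.
        PullDOM.startElement(self, name, attrs)
        # Attach only the selected elements to their parent node.
        if name.startswith("my_"):
            cur_node = self.elementStack[-1]
            parent_node = self.elementStack[-2]
            parent_node.appendChild(cur_node)


# Assumed sample data.
XML_DATA = b"<root><my_a/><other/><my_b/></root>"

handler = SelectiveDOM()
xml.sax.parseString(XML_DATA, handler)
kept = [n.tagName for n in handler.document.documentElement.childNodes]
print(kept)
```

In this sketch only the my_a and my_b elements end up under the root element; the other element is created but never attached.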

If someone says "YEAH, MATE, ABSOLUTELY, YOU GOT IT!", I might be able to fill some of the documentation gaps.
msg258345 - Author: Martin Panter (martin.panter) (Python committer) Date: 2016-01-16 01:06
Achim: where did you see your quote about the purpose of PullDOM? Neither <https://docs.python.org/3.2/library/xml.dom.pulldom.html> nor <https://hg.python.org/cpython/annotate/3.2/Lib/xml/dom/pulldom.py> even mentions the word “selected”.

In Issue 9453 it was suggested that SAX2DOM be deprecated because it is undocumented and quirky. Do you think we could do the same for PullDOM, or do you think it is useful?
History
Date                 User           Action  Args
2022-04-11 14:57:24  admin          set     github: 57760
2016-01-16 01:06:25  martin.panter  set     nosy: + martin.panter
                                            messages: + msg258345
                                            title: pulldom doesn't populate DOM tree -> pulldom.PullDOM doesn't populate DOM tree
2011-12-15 02:00:33  AchimGaedke    set     messages: + msg149488
2011-12-10 16:30:59  eric.araujo    set     nosy: + eric.araujo
                                            messages: + msg149167
2011-12-08 01:06:06  AchimGaedke    set     files: + pulldom_test.py
                                            messages: + msg149011
                                            versions: + Python 2.7
2011-12-08 00:41:38  AchimGaedke    create