:mod:`xml.dom.pulldom` --- Support for building partial DOM trees
=================================================================

.. module:: xml.dom.pulldom
   :synopsis: Support for building partial DOM trees from SAX events.

.. moduleauthor:: Paul Prescod

**Source code:** :source:`Lib/xml/dom/pulldom.py`

--------------

The :mod:`xml.dom.pulldom` module provides a "pull parser" which can also be
asked to produce DOM-accessible fragments of the document where necessary. The
basic concept involves pulling "events" from a stream of incoming XML and
processing them. SAX also employs an event-driven processing model, but it
delivers events through callbacks. With a pull parser, by contrast, the user is
responsible for explicitly pulling events from the stream, looping over those
events until either processing is finished or an error condition occurs.
Example::

   from xml.dom import pulldom

   doc = pulldom.parse("foo.xml")
   for event, node in doc:
       process(node)  # process() stands in for application code

``event`` is a string and can be one of:
* START_ELEMENT
* END_ELEMENT
* COMMENT
* START_DOCUMENT
* END_DOCUMENT
* CHARACTERS
* PROCESSING_INSTRUCTION
* IGNORABLE_WHITESPACE
``node`` is an object of type :class:`xml.dom.minidom.Document`,
:class:`xml.dom.minidom.Element` or :class:`xml.dom.minidom.Text`.
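
As a minimal sketch of how the event names above can drive processing, the loop below branches on the module-level constants (such as ``pulldom.START_ELEMENT``) and reads the attribute that matches each node type. The sample document and the list names are made up for this illustration.

```python
from xml.dom import pulldom

# Hypothetical sample document, used only for this illustration.
xml = "<doc><item id='1'>first</item><item id='2'>second</item></doc>"

opened, closed, text = [], [], []
for event, node in pulldom.parseString(xml):
    if event == pulldom.START_ELEMENT:
        opened.append(node.tagName)   # node is a minidom Element
    elif event == pulldom.END_ELEMENT:
        closed.append(node.tagName)   # the same Element, now being closed
    elif event == pulldom.CHARACTERS:
        text.append(node.data)        # node is a minidom Text

print(opened)
print(closed)
print("".join(text))
```

Character data may arrive in more than one ``CHARACTERS`` event, so the sketch joins the pieces rather than assuming one event per text run.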
Since the document is treated as a "flat" stream of events, the document "tree"
is implicitly traversed and the desired elements are found regardless of their
depth in the tree. In other words, one need not consider hierarchical issues
such as recursive searching of the document nodes, although if the context of
elements were important, one would either need to maintain some context-related
state (i.e. remembering where one is in the document at any given point) or to
make use of the :meth:`DOMEventStream.expandNode` method and switch to
DOM-related processing.
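
One way to maintain the context-related state mentioned above is to keep a stack of the currently open element names. The following is only a sketch under that assumption; the sample document and the ``path`` and ``book_titles`` names are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical sample: <title> occurs under both <book> and <film>.
xml = ("<library>"
       "<book><title>Dune</title></book>"
       "<film><title>Alien</title></film>"
       "</library>")

path = []          # names of the elements that are currently open
book_titles = []   # text found directly inside book/title

for event, node in pulldom.parseString(xml):
    if event == pulldom.START_ELEMENT:
        path.append(node.tagName)
    elif event == pulldom.END_ELEMENT:
        path.pop()
    elif event == pulldom.CHARACTERS and path[-2:] == ["book", "title"]:
        book_titles.append(node.data)

print("".join(book_titles))
```

The stack makes it possible to tell the two ``title`` elements apart even though the event stream itself is flat.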
.. function:: parse(stream_or_string, parser=None, bufsize=None)
Return a :class:`DOMEventStream` from the given input. *stream_or_string* may be
either a file name or a file-like object. *parser*, if given, must be a
SAX2 parser object. This function will change the document handler of the
parser and activate namespace support; other parser configuration (like
setting an entity resolver) must have been done in advance.
If you have XML in a string, you can use the :func:`parseString` function instead:
.. function:: parseString(string, parser=None)
Return a :class:`DOMEventStream` that represents the *string*.
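
The sketch below illustrates both entry points on the same data: :func:`parse` is given a file-like object together with a SAX parser that was configured beforehand, and :func:`parseString` is given the text directly. The sample data and the ``tag_names`` helper are hypothetical, and disabling external general entities is just one example of configuration done in advance.

```python
import io
import xml.sax
from xml.dom import pulldom
from xml.sax.handler import feature_external_ges

data = "<root><child/><child/></root>"  # hypothetical sample input


def tag_names(stream):
    """Collect the tag name of every START_ELEMENT event (illustrative helper)."""
    return [node.tagName for event, node in stream
            if event == pulldom.START_ELEMENT]


# Configure the parser before handing it to pulldom.
parser = xml.sax.make_parser()
parser.setFeature(feature_external_ges, False)

from_file = tag_names(
    pulldom.parse(io.BytesIO(data.encode("utf-8")), parser=parser))
from_string = tag_names(pulldom.parseString(data))

print(from_file)
print(from_string)
```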
.. data:: default_bufsize
Default value for the *bufsize* parameter to :func:`parse`.
The value of this variable can be changed before calling :func:`parse` and
the new value will take effect.
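
As an illustration of changing the default, the sketch below temporarily sets ``default_bufsize`` to a deliberately tiny value, parses a small in-memory document, and then restores the previous value. The sample data and the chosen size are assumptions made only for this example.

```python
import io
from xml.dom import pulldom

data = b"<root><a/><b/><c/></root>"  # hypothetical sample input

original = pulldom.default_bufsize
pulldom.default_bufsize = 8  # deliberately tiny, for illustration only
try:
    # No bufsize argument is passed, so parse() uses the module default.
    stream = pulldom.parse(io.BytesIO(data))
    tags = [node.tagName for event, node in stream
            if event == pulldom.START_ELEMENT]
finally:
    pulldom.default_bufsize = original  # restore the module-wide default

print(tags)
```

Because the variable is module-wide, restoring it in a ``finally`` clause keeps the change from leaking into unrelated code.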
.. _domeventstream-objects:
DOMEventStream Objects
----------------------
.. method:: DOMEventStream.getEvent()
Return a tuple containing *event* and the current *node* as
:class:`xml.dom.minidom.Document` if event equals START_DOCUMENT,
:class:`xml.dom.minidom.Element` if event equals START_ELEMENT or
END_ELEMENT or :class:`xml.dom.minidom.Text` if event equals CHARACTERS.
The current node does not contain information about its children, unless
:meth:`expandNode` is called.
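
Events can also be pulled one at a time instead of through a ``for`` loop. The sketch below assumes that :meth:`getEvent` returns ``None`` once the input is exhausted; that is how the CPython implementation behaves, but it is not stated in the description above. The sample document is hypothetical.

```python
from xml.dom import pulldom

stream = pulldom.parseString("<a><b>text</b></a>")  # hypothetical sample

events = []
while True:
    item = stream.getEvent()
    if item is None:   # assumed end-of-input signal (CPython behaviour)
        break
    event, node = item
    events.append((event, node.nodeName))

for pair in events:
    print(pair)
```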
.. method:: DOMEventStream.expandNode(node)
Expands all children of *node* into *node*. Example::

   from xml.dom import pulldom

   xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
   doc = pulldom.parseString(xml)
   for event, node in doc:
       if event == pulldom.START_ELEMENT and node.tagName == 'p':
           # Following statement only prints '<p/>'
           print(node.toxml())
           doc.expandNode(node)
           # Following statement prints node with all its children
           # '<p>Some text <div>and more</div></p>'
           print(node.toxml())
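
Once a node has been expanded, ordinary DOM methods can be used on it. The sketch below shows one possible pattern: scan the stream for ``order`` elements, expand each one, and inspect its children through the :mod:`xml.dom.minidom` API. The document, element names and the threshold are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical sample data for this illustration.
xml = ("<orders>"
       "<order id='1'><total>5</total></order>"
       "<order id='2'><total>70</total></order>"
       "</orders>")

large = []
stream = pulldom.parseString(xml)
for event, node in stream:
    if event == pulldom.START_ELEMENT and node.tagName == "order":
        stream.expandNode(node)  # node now holds its complete subtree
        total = node.getElementsByTagName("total")[0]
        # Join all text children in case the text was split into pieces.
        amount = int("".join(child.data for child in total.childNodes))
        if amount > 50:
            large.append(node.getAttribute("id"))

print(large)
```

Expanding a node consumes the events of its subtree, so the loop continues with whatever follows the expanded element.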