diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst
--- a/Doc/library/xml.etree.elementtree.rst
+++ b/Doc/library/xml.etree.elementtree.rst
@@ -284,6 +284,68 @@
    >>> ET.dump(a)
    <a><b /><c><d /></c></a>
 
+Parsing XML with Namespaces
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the XML input has namespaces, tags with prefixes in the form
+``prefix:sometag`` are expanded to ``{uri}sometag``, where the prefix is
+replaced by the full URI. If there is a default namespace, that URI is
+also prepended to all non-prefixed tags.
+
+Here is an XML example that uses two namespaces, one with the prefix
+"fictional" and the other serving as the default namespace:
+
+.. code-block:: xml
+
+    <?xml version="1.0"?>
+    <actors xmlns:fictional="http://characters.example.com"
+            xmlns="http://people.example.com">
+        <actor>
+            <name>John Cleese</name>
+            <fictional:character>Lancelot</fictional:character>
+            <fictional:character>Archie Leach</fictional:character>
+        </actor>
+        <actor>
+            <name>Eric Idle</name>
+            <fictional:character>Sir Robin</fictional:character>
+            <fictional:character>Gunther</fictional:character>
+            <fictional:character>Commander Clement</fictional:character>
+        </actor>
+    </actors>
+
+One way to search and explore this XML is to manually add the URI to
+every tag in the XPath of a :meth:`~Element.find` or :meth:`~Element.findall`::
+
+    root = ET.parse('demo.xml').getroot()
+    for actor in root.findall('{http://people.example.com}actor'):
+        name = actor.find('{http://people.example.com}name')
+        print(name.text)
+        for char in actor.findall('{http://characters.example.com}character'):
+            print(' |-->', char.text)
+
+Another way to search the namespaced XML is to create a dictionary
+with your own prefixes and use those in the search::
+
+    ns = {'real_person': 'http://people.example.com',
+          'role': 'http://characters.example.com'}
+
+    for actor in root.findall('real_person:actor', ns):
+        name = actor.find('real_person:name', ns)
+        print(name.text)
+        for char in actor.findall('role:character', ns):
+            print(' |-->', char.text)
+
+These two approaches both output::
+
+    John Cleese
+     |--> Lancelot
+     |--> Archie Leach
+    Eric Idle
+     |--> Sir Robin
+     |--> Gunther
+     |--> Commander Clement
+
+
 Additional resources
 ^^^^^^^^^^^^^^^^^^^^
 
@@ -366,6 +428,9 @@
 | ``[tag]``             | Selects all elements that have a child named         |
 |                       | ``tag``. Only immediate children are supported.      |
 +-----------------------+------------------------------------------------------+
+| ``[tag='text']``      | Selects all elements that have a child named         |
+|                       | ``tag`` whose full text equals the given ``text``.   |
++-----------------------+------------------------------------------------------+
 | ``[position]``        | Selects all elements that are located at the given   |
 |                       | position. The position can be either an integer      |
 |                       | (1 is the first position), the expression ``last()`` |