Message191894
ElementTree offers a wonderful and easy API for parsing XML... but if there is a namespace involved it suddenly gets ugly. This is a proposal to fix that. First an example:
------------------
#!/usr/bin/python
# Demonstrate awkward behavior of namespaces in ElementTree
import xml.etree.cElementTree as ET

# Sample one: no namespace, so the bare tag name matches.
xml_sample_one = """\
<?xml version="1.0"?>
<presets>
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""
root = ET.fromstring(xml_sample_one)
for child in root.iter('thing'):
    print child.tag

# Sample two: a default namespace, so every tag must be fully qualified.
xml_sample_two = """\
<?xml version="1.0"?>
<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""
root = ET.fromstring(xml_sample_two)
for child in root.iter('{http://josm.openstreetmap.de/tagging-preset-1.0}thing'):
    print child.tag
------------------
Because of the namespace in the 2nd example, a {namespace} name keeps {namespace} getting {namespace} in {namespace} {namespace} the way.
Online there are dozens of questions on how to deal with this, for example: http://stackoverflow.com/questions/11226247/python-ignore-xmlns-in-elementtree-elementtree
With wonderfully syntactic solutions like 'item.tag.split("}")[1][0:]'
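For reference, here is a rough sketch of the two workarounds that usually come up, written for Python 3 and the pure `xml.etree.ElementTree` module. The prefix `p` and the helper name `local_name` are my own placeholders, not anything ElementTree defines. Note that the prefix map is accepted by `find`/`findall`/`iterfind`, but not by `iter`, which is part of why this feels clumsy.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://josm.openstreetmap.de/tagging-preset-1.0"

xml_sample_two = """\
<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""

root = ET.fromstring(xml_sample_two)

# Workaround 1: map a prefix of our own choosing ("p" is arbitrary)
# to the namespace URI and use it in the search path.
ns = {"p": NS_URI}
things = root.findall("p:thing", ns)


# Workaround 2: cut the "{uri}" part off a tag to get the local name.
# (Hypothetical helper; tags without a namespace pass through unchanged.)
def local_name(tag):
    return tag.rsplit("}", 1)[-1]


names = [local_name(child.tag) for child in things]
print(names)  # ['thing', 'thing']
```

Both work, but the caller still has to know the namespace URI or post-process every tag by hand.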
-----
How about if I could set any root to have an array of namespaces to suppress:
root = ET.fromstring(xml_sample_two)
root.xmlns_at_root.append('{namespace}')
Or even just a boolean that says I'll take all my namespaces without qualification?
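To make the "boolean" idea concrete, here is a minimal sketch of what it could behave like, done today as a post-processing step in Python 3. `strip_namespaces` is a hypothetical helper name, not an existing ElementTree function, and this is only one possible reading of the proposal. It assumes that dropping the namespace does not make two different elements collide on the same local name, and it leaves namespaced attribute names untouched.

```python
import xml.etree.ElementTree as ET

xml_sample_two = """\
<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""


def strip_namespaces(root):
    """Hypothetical helper: rewrite every element tag in place, dropping "{uri}"."""
    for elem in root.iter():
        # Comments and processing instructions carry non-string tags; skip them.
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(xml_sample_two))

# After stripping, the unqualified name works just like in the first sample.
tags = [child.tag for child in root.iter("thing")]
print(tags)  # ['thing', 'thing']
```

The obvious cost is that the namespace information is gone once the tags are rewritten, so serializing the tree again would not reproduce the original document.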
Date | User | Action | Args
2013-06-26 03:48:17 | brycenesbitt | set | recipients: + brycenesbitt
2013-06-26 03:48:17 | brycenesbitt | set | messageid: <1372218497.28.0.26294900817.issue18304@psf.upfronthosting.co.za>
2013-06-26 03:48:17 | brycenesbitt | link | issue18304 messages
2013-06-26 03:48:16 | brycenesbitt | create |