Author brycenesbitt
Recipients brycenesbitt
Date 2013-06-26.03:48:16
Content
ElementTree offers a wonderful and easy API for parsing XML... but if there is a namespace involved it suddenly gets ugly.  This is a proposal to fix that.  First an example:

------------------
#!/usr/bin/python
# Demonstrate awkward behavior of namespaces in ElementTree
import xml.etree.ElementTree as ET

xml_sample_one = """\
<?xml version="1.0"?>
<presets>
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""
root = ET.fromstring(xml_sample_one)
for child in root.iter('thing'):
    print(child.tag)

xml_sample_two = """\
<?xml version="1.0"?>
<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""
root = ET.fromstring(xml_sample_two)
for child in root.iter('{http://josm.openstreetmap.de/tagging-preset-1.0}thing'):
    print(child.tag)
------------------

Because of the namespace in the 2nd example, a {namespace} name keeps {namespace} getting {namespace} in {namespace} {namespace} the way.
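
For comparison, here is a minimal sketch of the closest thing the existing API offers: find() and findall() accept a prefix-to-URI mapping, so the full URI does not have to be repeated in every path. The prefix "josm" is an arbitrary name picked for this sketch, and iter() does not take such a mapping, so this only partly addresses the complaint.

```python
import xml.etree.ElementTree as ET

xml_sample_two = """\
<?xml version="1.0"?>
<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">
<thing stuff="some stuff"/>
<thing stuff="more stuff"/>
</presets>
"""

# "josm" is a prefix made up for this sketch; any name works as long as
# it maps to the namespace URI used in the document.
ns = {"josm": "http://josm.openstreetmap.de/tagging-preset-1.0"}

root = ET.fromstring(xml_sample_two)

# findall() expands "josm:thing" to "{uri}thing" using the mapping
things = root.findall("josm:thing", ns)
for child in things:
    print(child.get("stuff"))
```

Note that child.tag still comes back in the full {namespace}thing form, so code that inspects tags has the same problem as before.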

Online there are dozens of questions on how to deal with this, for example: http://stackoverflow.com/questions/11226247/python-ignore-xmlns-in-elementtree-elementtree

With wonderfully syntactic solutions like 'item.tag.split("}")[1][0:]'
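
A slightly more robust version of that workaround might look like the sketch below. The helper name local_name is made up for illustration and is not part of ElementTree. Unlike split("}")[1], it does not raise IndexError when the tag has no namespace at all.

```python
def local_name(tag):
    """Return the tag without a leading '{namespace}' part, if any.

    Hypothetical helper for illustration; not an ElementTree function.
    """
    # rsplit yields a one-element list when there is no "}" in the tag,
    # so [-1] is safe for both qualified and unqualified names.
    return tag.rsplit("}", 1)[-1]


print(local_name("{http://josm.openstreetmap.de/tagging-preset-1.0}thing"))
print(local_name("thing"))
```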

-----
How about if I could set any root to have an array of namespaces to suppress:

root = ET.fromstring(xml_sample_two)
root.xmlns_at_root.append('{namespace}')

Or even just a boolean that says I'll take all my namespaces without qualification?
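
Until something like that exists, the "boolean" variant can be roughly emulated after parsing. This is only a sketch under stated assumptions: strip_namespaces is a hypothetical helper, not an ElementTree API, it rewrites element tags only (namespaced attribute names are left alone), and it is written for Python 3. Dropping namespaces can also make two different elements with the same local name indistinguishable.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{namespace}' part from every element tag, in place.

    Hypothetical helper for illustration; not an ElementTree function.
    """
    for elem in root.iter():
        # Comment and processing-instruction nodes have non-string tags
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(
    '<presets xmlns="http://josm.openstreetmap.de/tagging-preset-1.0">'
    '<thing stuff="some stuff"/><thing stuff="more stuff"/></presets>'
))

# The unqualified name now works with iter(), as in the first example
stuff = [child.get("stuff") for child in root.iter("thing")]
print(stuff)
```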