
Author dmtr
Recipients dmtr
Date 2010-04-30.22:57:26
Message-id <1272668249.2.0.865128956478.issue8583@psf.upfronthosting.co.za>
In-reply-to
Content
The namespace_separator parameter is hard-coded in the cElementTree.XMLParser class, which makes it impossible to ignore XML namespaces when using the cElementTree library.

Here's a code example:
 from xml.etree.cElementTree import iterparse
 from StringIO import StringIO
 xml = """<root xmlns="http://www.very_long_url.com"><child/></root>"""
 for event, elem in iterparse(StringIO(xml)):
     print event, elem

It produces:
 end <Element '{http://www.very_long_url.com}child' at 0xb7ddfa58>
 end <Element '{http://www.very_long_url.com}root' at 0xb7ddfa40> 

In the current implementation, local tag names are always concatenated with their namespace URIs. This often results in ugly code on the user's side and in degraded performance (at the very least because of the extra concatenations and the longer string comparisons in element-matching code).
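
To illustrate the kind of user-side workaround this leads to, here is a minimal sketch that strips the "{uri}" prefix from every tag after parsing. It is not part of the attached patch. It assumes Python 3 and the standard xml.etree.ElementTree module rather than cElementTree, and the helper names local_name and iter_local_tags are made up for this example.

```python
import io
import xml.etree.ElementTree as ET

XML = b'<root xmlns="http://www.very_long_url.com"><child/></root>'


def local_name(tag):
    """Drop a leading '{uri}' prefix from an ElementTree tag, if present."""
    if tag.startswith("{"):
        return tag.split("}", 1)[1]
    return tag


def iter_local_tags(source):
    """Yield the namespace-free tag of each element as its 'end' event fires."""
    for _event, elem in ET.iterparse(source):
        yield local_name(elem.tag)


print(list(iter_local_tags(io.BytesIO(XML))))
```

Every tag still has the URI glued on by the parser and then cut off again by the caller, which is the extra work described above.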

Internally, cElementTree uses the Expat parser, which performs namespace processing only optionally; it is enabled by passing a value for the namespace_separator argument. In cElementTree this argument is hard-coded:
 self->parser = EXPAT(ParserCreate_MM)(encoding, &memory_handler, "}");
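
Expat's optional namespace processing can also be observed from Python through the standard xml.parsers.expat module, whose ParserCreate() accepts a namespace_separator argument. The following is only a sketch of that behavior, assuming Python 3; the helper start_tag_names is made up for this example, and it does not show how cElementTree itself calls Expat.

```python
from xml.parsers import expat

XML = '<root xmlns="http://www.very_long_url.com"><child/></root>'


def start_tag_names(namespace_separator):
    """Collect element names as Expat reports them to the start-tag handler."""
    names = []
    parser = expat.ParserCreate(namespace_separator=namespace_separator)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(XML, True)  # True marks this as the final chunk of input
    return names


# Namespace processing off: plain local names are reported.
print(start_tag_names(None))
# Namespace processing on: each name is reported as URI + separator + local name.
print(start_tag_names("}"))
```

With the separator set to "}", Expat reports names such as "http://www.very_long_url.com}root", so a consumer only has to prepend "{" to obtain the "{uri}tag" form seen in the output above.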

The attached patch exposes this parameter as an argument to cElementTree.XMLParser(). The parameter is optional, and the default behavior should be unchanged. Here's the test code:

import cElementTree

x = """<root xmlns="http://www.very_long_url.com"><child>text</child></root>"""

parser = cElementTree.XMLParser()
parser.feed(x)
elem = parser.close()
print elem

parser = cElementTree.XMLParser(namespace_separator="}")
parser.feed(x)
elem = parser.close()
print elem

parser = cElementTree.XMLParser(namespace_separator=None)
parser.feed(x)
elem = parser.close()
print elem

The resulting output:
<Element '{http://www.very_long_url.com}root' at 0xb7e885f0>
<Element '{http://www.very_long_url.com}root' at 0xb7e88608>
<Element 'root' at 0xb7e88458>
History
Date                 User  Action  Args
2010-04-30 22:57:29  dmtr  set     recipients: + dmtr
2010-04-30 22:57:29  dmtr  set     messageid: <1272668249.2.0.865128956478.issue8583@psf.upfronthosting.co.za>
2010-04-30 22:57:27  dmtr  link    issue8583 messages
2010-04-30 22:57:26  dmtr  create