Author scoder
Recipients eli.bendersky, gene_wood, scoder, silverbacknet, wiml
Date 2014-04-14.06:00:41
Message-id <1397455243.28.0.771018697869.issue17088@psf.upfronthosting.co.za>
Content
@gene_wood: that's unrelated. This ticket is about attributes being rejected incorrectly.

Fixing the example of the OP:

>>> from xml.etree.ElementTree import *
>>> svg = ElementTree(XML("""
... <svg width="12cm" height="4cm" viewBox="0 0 1200 400" xmlns="http://www.w3.org/2000/svg" version="1.1">
... <rect x="1" y="1" width="1198" height="398" fill="none" stroke="blue" stroke-width="2" />
... </svg>
... """))
>>> tostring(svg.getroot())   # formatting is mine
b'<svg:svg xmlns:svg="http://www.w3.org/2000/svg" height="4cm" version="1.1" viewBox="0 0 1200 400" width="12cm">\n
      <svg:rect fill="none" height="398" stroke="blue" stroke-width="2" width="1198" x="1" y="1" />\n
  </svg:svg>'
>>> svg.write('simple_new.svg',encoding='UTF-8',default_namespace='http://www.w3.org/2000/svg')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 826, in write
    qnames, namespaces = _namespaces(self._root, default_namespace)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 942, in _namespaces
    add_qname(key)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 920, in add_qname
    "cannot use non-qualified names with "
ValueError: cannot use non-qualified names with default_namespace option
>>> svg.write('simple_new.svg',encoding='UTF-8')
>>> 

So, it works without namespace defaulting and fails with a misleading error when a default namespace is provided: the only non-qualified names in this document are attribute names, which are perfectly legal here. Clearly a bug.
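For anyone who wants to check this on their own interpreter without touching the file system, here is a minimal sketch of the same reproduction writing to an in-memory buffer. The helper name try_write is mine, and I'm deliberately not stating what the second call prints, since that depends on whether your Python version still has this bug.

```python
import io
from xml.etree.ElementTree import ElementTree, XML

SVG_NS = "http://www.w3.org/2000/svg"

# Reduced version of the OP's document: namespaced elements, plain attributes.
svg = ElementTree(XML(
    '<svg xmlns="%s" width="12cm"><rect x="1" y="1" /></svg>' % SVG_NS
))


def try_write(tree, **kwargs):
    """Serialise into memory; return the output or the ValueError text."""
    buf = io.BytesIO()
    try:
        tree.write(buf, encoding="UTF-8", **kwargs)
    except ValueError as exc:
        return "ValueError: %s" % exc
    return buf.getvalue().decode("UTF-8")


print(try_write(svg))                            # prefixed serialisation
print(try_write(svg, default_namespace=SVG_NS))  # the case discussed here
```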

Regarding the proposed patch: it looks like the right thing to do in general, but it has a relatively high code impact. I would prefer a patch with lower churn. One thing that could be tried is to use only one tag cache dict and extend the key from the plain tag to (tag, is_attribute). That might have a performance impact on the already slow serialiser, though.

In any case, both approaches are quite wasteful, because they duplicate the entire namespace-prefix mapping just because there might be a single namespace that behaves differently for attributes. An alternative could be to split the *value* of the mapping in two: (element_prefix, attribute_prefix). This would keep the overhead at serialisation low, with only slightly more work when building the mapping. At first sight, I like that idea better.
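To make the split-value idea a bit more concrete, here is a rough sketch. It is not ElementTree code: build_prefix_map, qualify and the generated "nsN" prefixes are all hypothetical names chosen for illustration. The point is only that one dict suffices when each namespace URI maps to a pair of prefixes.

```python
def build_prefix_map(namespace_uris, default_namespace=None):
    """Map each namespace URI to (element_prefix, attribute_prefix)."""
    prefixes = {}
    for uri in namespace_uris:
        generated = "ns%d" % len(prefixes)
        if uri == default_namespace:
            # Elements can rely on the default namespace (empty prefix).
            # Attributes never inherit it, so they need a real prefix.
            prefixes[uri] = ("", generated)
        else:
            prefixes[uri] = (generated, generated)
    return prefixes


def qualify(name, prefixes, *, is_attribute):
    """Turn '{uri}local' into 'prefix:local', or 'local' for an empty prefix."""
    if not name.startswith("{"):
        has_default = any(ep == "" for ep, _ in prefixes.values())
        if not is_attribute and has_default:
            # Only unqualified *elements* clash with a default namespace.
            raise ValueError("unqualified element name with a default namespace")
        return name
    uri, local = name[1:].split("}", 1)
    element_prefix, attribute_prefix = prefixes[uri]
    prefix = attribute_prefix if is_attribute else element_prefix
    return "%s:%s" % (prefix, local) if prefix else local


SVG_NS = "http://www.w3.org/2000/svg"
prefixes = build_prefix_map([SVG_NS], default_namespace=SVG_NS)
print(qualify("{%s}rect" % SVG_NS, prefixes, is_attribute=False))
print(qualify("width", prefixes, is_attribute=True))
```

The lookup at serialisation time stays a single dict access plus an index, which is why I'd expect the overhead to be small.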

This code returns a list in one case and a set-like view in another (Py3):

+    if default_namespace:
+        prefixes_list = [ (default_namespace, "") ]
+        prefixes_list.extend(namespaces.items())
+    else:
+        prefixes_list = namespaces.items()

I can't see the need for this change. Why can't the default namespace be stored in the namespaces dict right from the start, as it was before?
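A small sketch of what I mean, with hypothetical function names (patch_style mirrors the shape of the quoted hunk, dict_style is the alternative). In Py3, the first returns two different types depending on the branch, while the second always goes through the dict.

```python
def patch_style(namespaces, default_namespace=None):
    # Same shape as the quoted hunk: list in one branch, view in the other.
    if default_namespace:
        prefixes_list = [(default_namespace, "")]
        prefixes_list.extend(namespaces.items())
    else:
        prefixes_list = namespaces.items()
    return prefixes_list


def dict_style(namespaces, default_namespace=None):
    # Alternative: put the default namespace into the mapping up front.
    namespaces = dict(namespaces)  # copy, to leave the caller's dict alone
    if default_namespace:
        namespaces[default_namespace] = ""
    return namespaces.items()


ns = {"http://www.w3.org/2000/svg": "svg"}
default = "http://example.com/default"  # made-up URI for illustration

print(type(patch_style(ns, default)).__name__)  # list
print(type(patch_style(ns)).__name__)           # dict_items
print(type(dict_style(ns, default)).__name__)   # dict_items
```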

As a minor nitpick, this lambda based sort key:

    key=lambda x: x[1]):  # sort on prefix

is better expressed using operator.itemgetter(1).
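For illustration, both spellings give the same ordering. The namespaces dict below is just sample data, not taken from the patch.

```python
from operator import itemgetter

namespaces = {
    "http://www.w3.org/2000/svg": "svg",
    "http://www.w3.org/1999/xlink": "xlink",
    "http://purl.org/dc/elements/1.1/": "dc",
}

by_lambda = sorted(namespaces.items(), key=lambda x: x[1])  # sort on prefix
by_itemgetter = sorted(namespaces.items(), key=itemgetter(1))

print([prefix for _, prefix in by_itemgetter])  # ['dc', 'svg', 'xlink']
```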

I'd also rename the "defaultable" flag to "is_attribute" and pass it as a keyword argument (bare boolean parameters are unreadable in function calls).
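A sketch of the calling convention I have in mind. The function below is a hypothetical stand-in, not the actual helper from the patch; in Py3 the bare `*` makes the flag keyword-only, so a call site cannot pass an anonymous True/False.

```python
def describe_qname(qname, *, is_attribute=False):
    """Hypothetical stand-in for the patched helper."""
    kind = "attribute" if is_attribute else "element"
    return "%s name %r" % (kind, qname)


# Readable at the call site:
print(describe_qname("width", is_attribute=True))

# describe_qname("width", True) would raise a TypeError, because the
# flag can only be passed by keyword.
```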

Given the impact of this change, I'd also suggest not applying it to Py2.x anymore.