classification
Title: xml.dom.createElement() does not take implicit namespaces into account
Type: behavior Stage: resolved
Components: XML Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Alexander.Tobias.Heinrich, jkloth, karlcow
Priority: normal Keywords:

Created on 2013-06-10 13:59 by Alexander.Tobias.Heinrich, last changed 2021-06-18 21:05 by iritkatriel. This issue is now closed.

Files
File name Uploaded Description
pydomprob.py Alexander.Tobias.Heinrich, 2013-06-10 13:59 Sample script
Messages (3)
msg190906 - Author: Alexander Tobias Heinrich (Alexander.Tobias.Heinrich) Date: 2013-06-10 13:59
First of all, I am not sure whether this is a bug in Python itself. It could just as well be a bug in the py-dom-xpath module (http://code.google.com/p/py-dom-xpath) or not a bug at all (though I find the latter highly unlikely).

Consider an XML document such as:
<?xml version="1.0" encoding="utf-8"?>
<Zoo xmlns='http://foo.bar/zoo'>
  <Compound><Chimp/></Compound>
</Zoo>

If one creates a new Chimp-element using the xml.dom.createElement() function and then appends it to the Compound element, then the xpath module will not find the element unless the whole document is saved and then re-parsed.

Creating the element with xml.dom.createElementNS() and thus explicitly specifying its namespace works around the problem.

I consider this to be a bug because in both cases xml.dom will create the same valid XML, so I believe xpath should produce the same results in both cases, too. My belief that the bug is in xml.dom is just a feeling, and I could be wrong. I imagine that xml.dom.createElement() forgets to add the new element to its inherited namespace and adds it to the null namespace instead. As a consequence, xpath doesn't find the new element.
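The symptom described above can be reproduced without the third-party xpath module. The following is a minimal sketch using only the standard library's xml.dom.minidom, with getElementsByTagNameNS() standing in for the namespace-aware XPath lookup. That substitution is an assumption, count_chimps is a hypothetical helper, and the attached pydomprob.py may look different. The XML declaration is omitted for brevity.

```python
from xml.dom import minidom

ZOO_NS = "http://foo.bar/zoo"
SOURCE = (
    "<Zoo xmlns='http://foo.bar/zoo'>"
    "<Compound><Chimp/></Compound>"
    "</Zoo>"
)


def count_chimps(doc):
    """Hypothetical helper: count Chimp elements in the zoo namespace."""
    return len(doc.getElementsByTagNameNS(ZOO_NS, "Chimp"))


doc = minidom.parseString(SOURCE)
compound = doc.getElementsByTagNameNS(ZOO_NS, "Compound")[0]

before = count_chimps(doc)

# Namespace-ignorant creation: the new node is not placed in ZOO_NS.
compound.appendChild(doc.createElement("Chimp"))
after_plain = count_chimps(doc)

# Namespace-aware creation: the new node is placed in ZOO_NS.
compound.appendChild(doc.createElementNS(ZOO_NS, "Chimp"))
after_ns = count_chimps(doc)

print(before, after_plain, after_ns)  # 1 1 2
```

Only the element created with createElementNS() is picked up by the namespace-aware lookup.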

I originally posted a more verbose explanation of this issue to StackOverflow (see http://stackoverflow.com/questions/16980521/python-xpath-find-wont-find-new-elements-if-they-were-added-without-namespac), because I was unsure whether this is a bug and, if it is, which module it belongs to. Because I did not receive any feedback on that post, I have now decided to file it here as a bug report.

I attached a sample script that demonstrates the problem (if the xpath dependency is installed). I tested it under Windows with Python 2.7.5 and 2.7.4.
msg190927 - Author: Jeremy Kloth (jkloth) * Date: 2013-06-10 18:07
This really is not a bug, but more of a (common) misunderstanding of how mixing namespace-aware (createElementNS) and namespace-ignorant (createElement) methods works.

From DOM3 Core [http://www.w3.org/TR/DOM-Level-3-Core/core.html#Namespaces-Considerations]:

  Elements and attributes created by the DOM Level 1 methods do not
  have a namespace prefix, namespace URI or local name.

DOM4 updates this to say that the namespace URI and prefix are null if not explicitly given at creation.
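As a rough illustration of the point above, the namespace of each node can be inspected directly. This is a sketch against xml.dom.minidom only; other DOM implementations may report these attributes differently, and the variable names are made up for the example.

```python
from xml.dom import minidom

ZOO_NS = "http://foo.bar/zoo"

doc = minidom.parseString(
    "<Zoo xmlns='http://foo.bar/zoo'><Compound/></Zoo>"
)
compound = doc.documentElement.firstChild

plain = doc.createElement("Chimp")            # DOM Level 1 method
aware = doc.createElementNS(ZOO_NS, "Chimp")  # DOM Level 2 method

compound.appendChild(plain)
compound.appendChild(aware)

# Appending does not make a node inherit its parent's namespace.
print(compound.namespaceURI)  # http://foo.bar/zoo
print(plain.namespaceURI)     # None
print(aware.namespaceURI)     # http://foo.bar/zoo
```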

To use your example, adding a Chimp-element to the existing document is really:

<?xml version="1.0" encoding="utf-8"?>
<Zoo xmlns='http://foo.bar/zoo'>
  <Compound><Chimp/><Chimp xmlns=""/></Compound>
</Zoo>

The fact that serializing and re-parsing it works is really a bug in the serializing code not honoring the null-namespace.
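The round-trip effect mentioned above can be observed with a short sketch. It assumes xml.dom.minidom's toxml() as the serializer, which may not be the one used in the original script. The in-memory null-namespace node is written out without an xmlns="" declaration, so the parser places it in the default namespace on re-reading.

```python
from xml.dom import minidom

ZOO_NS = "http://foo.bar/zoo"

doc = minidom.parseString(
    "<Zoo xmlns='http://foo.bar/zoo'><Compound><Chimp/></Compound></Zoo>"
)
compound = doc.getElementsByTagNameNS(ZOO_NS, "Compound")[0]
compound.appendChild(doc.createElement("Chimp"))  # null namespace in memory

in_memory = len(doc.getElementsByTagNameNS(ZOO_NS, "Chimp"))

# Serialize, then parse the text again.
xml_text = doc.documentElement.toxml()
reparsed = minidom.parseString(xml_text)
after_roundtrip = len(reparsed.getElementsByTagNameNS(ZOO_NS, "Chimp"))

print('xmlns=""' in xml_text)      # False: the null namespace is not written
print(in_memory, after_roundtrip)  # 1 2
```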

So in short, always use create*NS() methods when dealing with namespace-aware XML (like XPath).
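One way to follow this advice is to wrap element creation in a small helper that always reuses the parent's namespace. The helper name append_child_in_parent_ns is hypothetical and not part of xml.dom. This is only a sketch of the idea for xml.dom.minidom.

```python
from xml.dom import minidom

ZOO_NS = "http://foo.bar/zoo"


def append_child_in_parent_ns(parent, local_name):
    """Create a child in the same namespace (and prefix) as its parent."""
    qualified = f"{parent.prefix}:{local_name}" if parent.prefix else local_name
    child = parent.ownerDocument.createElementNS(parent.namespaceURI, qualified)
    parent.appendChild(child)
    return child


doc = minidom.parseString(
    "<Zoo xmlns='http://foo.bar/zoo'><Compound/></Zoo>"
)
compound = doc.getElementsByTagNameNS(ZOO_NS, "Compound")[0]
chimp = append_child_in_parent_ns(compound, "Chimp")

print(chimp.namespaceURI)  # http://foo.bar/zoo
```

A namespace-aware lookup such as getElementsByTagNameNS(ZOO_NS, "Chimp") then finds the new element without a save and re-parse.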
msg353299 - Author: karl (karlcow) * Date: 2019-09-26 12:45
The current specification, as of today, documents this at
https://dom.spec.whatwg.org/#dom-document-createelementns


If you run this in the browser console, 

var nsdoc = 'http://foo.bar/zoo';
var xmldoc = document.implementation.createDocument(nsdoc, 'Zoo', null);
var cpd = document.createElementNS(nsdoc, 'Compound');
var chimp = document.createElementNS(nsdoc, 'Chimp');
cpd.appendChild(chimp)
xmldoc.documentElement.appendChild(cpd);

/* serializing */
var docserializer = new XMLSerializer();
var flatxml = docserializer.serializeToString(xmldoc);
flatxml


you get:

<Zoo xmlns="http://foo.bar/zoo">
  <Compound>
    <Chimp/>
  </Compound>
</Zoo>


but if you run this in the browser console,

var nsdoc = 'http://foo.bar/zoo';
var xmldoc = document.implementation.createDocument(nsdoc, 'Zoo', null);
var cpd = document.createElement('Compound');
var chimp = document.createElement('Chimp');
cpd.appendChild(chimp)
xmldoc.documentElement.appendChild(cpd);

/* serializing */
var docserializer = new XMLSerializer();
var flatxml = docserializer.serializeToString(xmldoc);
flatxml


you get:


<Zoo xmlns="http://foo.bar/zoo">
  <compound xmlns="http://www.w3.org/1999/xhtml">
    <chimp></chimp>
  </compound>
</Zoo>


which is a completely different beast.


I don't think there is an issue here, and we can safely close this bug.
History
Date User Action Args
2021-06-18 21:05:40 iritkatriel set status: open -> closed
resolution: not a bug
stage: resolved
2019-09-26 12:45:20 karlcow set nosy: + karlcow
messages: + msg353299
2013-06-10 18:07:27 jkloth set nosy: + jkloth
messages: + msg190927
2013-06-10 13:59:13 Alexander.Tobias.Heinrich create