Message338681
When writing XML that contains a CDATA section with non-empty indentation and newline parameters, the section's parent node ends up containing extra indentation, which is then parsed back as text.
Example:
>>> from xml.dom import minidom
>>> doc = minidom.Document()
>>> root = doc.createElement('root')
>>> doc.appendChild(root)
>>> node = doc.createElement('node')
>>> root.appendChild(node)
>>> data = doc.createCDATASection('</data>')
>>> node.appendChild(data)
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node>
<![CDATA[</data>]]>    </node>
</root>
If we parse this output (held in xml_text here), the CDATA value is not recovered correctly.
The following code returns a string that contains only indentation whitespace:
>>> doc = minidom.parseString(xml_text)
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue
And this returns a string with the CDATA value plus the indentation characters:
>>> doc.getElementsByTagName('node')[0].firstChild.wholeText
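To make the round trip easier to inspect, here is a minimal, self-contained sketch that builds the same document, pretty-prints it, parses it back, and lists the children of <node>. The helper names build_document and cdata_values are hypothetical and only serve this illustration. Selecting the child by nodeType instead of relying on firstChild is one reader-side way to get the CDATA value whether or not the extra whitespace nodes are present.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def build_document():
    """Build <root><node><![CDATA[</data>]]></node></root> (hypothetical helper)."""
    doc = minidom.Document()
    root = doc.createElement('root')
    doc.appendChild(root)
    node = doc.createElement('node')
    root.appendChild(node)
    node.appendChild(doc.createCDATASection('</data>'))
    return doc


def cdata_values(element):
    """Return the data of every CDATA section that is a direct child of element."""
    return [child.data for child in element.childNodes
            if child.nodeType == Node.CDATA_SECTION_NODE]


xml_text = build_document().toprettyxml(indent=' ' * 4)
parsed = minidom.parseString(xml_text)
node = parsed.getElementsByTagName('node')[0]

# Show how the parser split the content of <node> into child nodes.
for child in node.childNodes:
    print(child.nodeType, repr(child.nodeValue))

# Pick the CDATA section by type rather than by position.
print(cdata_values(node))
```

On an interpreter that shows the behavior described above, the loop should list whitespace-only text nodes next to the CDATA section, while cdata_values still returns just the CDATA payload.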
There is a workaround, though:
>>> data.nodeType = data.TEXT_NODE
…
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node><![CDATA[</data>]]></node>
</root>
This output is parsed correctly:
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue
'</data>'
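Because the workaround changes nodeType on the node itself, code that later checks the node type may be misled. A possible way to limit that, sketched below under the assumption that shadowing the class attribute on the instance is acceptable, is to apply the change only for the duration of the call and then undo it. The function toprettyxml_inline_cdata is a hypothetical helper, not part of minidom.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def toprettyxml_inline_cdata(doc, **kwargs):
    """Pretty-print doc, keeping a lone CDATA child on its parent's line.

    Hypothetical helper: applies the nodeType workaround temporarily and
    removes it afterwards, so the tree is left as it was.
    """
    patched = []
    for element in doc.getElementsByTagName('*'):
        children = element.childNodes
        if (len(children) == 1
                and children[0].nodeType == Node.CDATA_SECTION_NODE):
            # Instance attribute that shadows the class-level nodeType.
            children[0].nodeType = Node.TEXT_NODE
            patched.append(children[0])
    try:
        return doc.toprettyxml(**kwargs)
    finally:
        for section in patched:
            # Drop the instance attribute; the class-level value applies again.
            del section.nodeType


doc = minidom.Document()
root = doc.createElement('root')
doc.appendChild(root)
node = doc.createElement('node')
root.appendChild(node)
data = doc.createCDATASection('</data>')
node.appendChild(data)

xml_text = toprettyxml_inline_cdata(doc, indent=' ' * 4)
print(xml_text)
```

After the call, data.nodeType is CDATA_SECTION_NODE again, and the returned text can be parsed back with firstChild.nodeValue giving the CDATA value.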
However, I think it would be better to fix the writing function so that this becomes the default behavior.
Date | User | Action | Args
2019-03-23 15:38:49 | vsurjaninov | set | recipients: + vsurjaninov
2019-03-23 15:38:49 | vsurjaninov | set | messageid: <1553355529.11.0.873645774038.issue36407@roundup.psfhosted.org>
2019-03-23 15:38:49 | vsurjaninov | link | issue36407 messages
2019-03-23 15:38:48 | vsurjaninov | create |