Message273937
On both Python 2.7 and Python 3.4:
>>> from xml.etree import cElementTree as ET
>>> text = '<end>its &gt; &lt; &amp; &apos;</end>'
>>> root = ET.fromstring(text.encode('utf-8'))
>>> ET.tostring(root, method="xml")
<end>its &gt; &lt; &amp; '</end>
I would have expected it to return the same text as the input, in order to be compliant XML 1.0.
I would understand if it returned something different for HTML, see:
http://stackoverflow.com/questions/2083754/why-shouldnt-apos-be-used-to-escape-single-quotes
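To make the difference concrete, here is a minimal round-trip sketch. It is not part of the original report, and it assumes Python 3, since `encoding="unicode"` is not accepted by `tostring` on Python 2.7. It compares the serialized string with the input and then compares the parsed text nodes, so it shows where exactly the two differ.

```python
from xml.etree import ElementTree as ET

escaped = "<end>its &gt; &lt; &amp; &apos;</end>"

# Serialize the parsed tree back to a str (Python 3 only).
serialized = ET.tostring(ET.fromstring(escaped), encoding="unicode")

# Text content as seen by the parser, before and after the round trip.
original_text = ET.fromstring(escaped).text
roundtrip_text = ET.fromstring(serialized).text

print(serialized)                       # the apostrophe comes back unescaped
print(serialized == escaped)            # the strings differ
print(original_text == roundtrip_text)  # the parsed text is the same
```

The serialized string differs from the input only in the spelling of the apostrophe. The parsed character data is identical in both cases.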
As a workaround I had to patch ElementTree:
from xml.etree.ElementTree import _raise_serialization_error
try:
    from unittest.mock import patch  # standard library on Python 3.3+
except ImportError:
    from mock import patch  # third-party "mock" package, needed on Python 2.7

def _escape_cdata(text):
    # escape character data
    try:
        # it's worth avoiding do-nothing calls for strings that are
        # shorter than 500 character, or so.  assume that's, by far,
        # the most common case in most applications.
        if "&" in text:
            text = text.replace("&", "&amp;")
        if "<" in text:
            text = text.replace("<", "&lt;")
        if ">" in text:
            text = text.replace(">", "&gt;")
        if "'" in text:
            # the only change from the stock function: escape apostrophes too
            text = text.replace("'", "&apos;")
        return text
    except (TypeError, AttributeError):
        _raise_serialization_error(text)

from xml.etree import cElementTree as ET

text = '<end>its &gt; &lt; &amp; &apos;</end>'
root = ET.fromstring(text.encode('utf-8'))

with patch('xml.etree.ElementTree._escape_cdata', new=_escape_cdata):
    s = ET.tostring(root, encoding='unicode', method="xml")
print(s)
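
The replacement `_escape_cdata` above relies on a private ElementTree function. For cases where the text is escaped by hand instead of through `ET.tostring`, the same escaping rule can be sketched with the public `xml.sax.saxutils.escape` helper, which accepts a dictionary of extra replacements. This is only an illustration of the rule and an assumption about how it might be used. It does not change what `ET.tostring` emits.

```python
from xml.sax.saxutils import escape

text = "its > < & '"

# escape() always handles &, < and >; the dict adds the apostrophe.
escaped = escape(text, {"'": "&apos;"})

print("<end>%s</end>" % escaped)
```

Because `escape` is a documented API, this variant does not depend on ElementTree internals, at the cost of building the markup manually.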

Date | User | Action | Args
2016-08-30 17:18:25 | fruch | set | recipients: + fruch
2016-08-30 17:18:25 | fruch | set | messageid: <1472577505.85.0.700751415342.issue27899@psf.upfronthosting.co.za>
2016-08-30 17:18:25 | fruch | link | issue27899 messages
2016-08-30 17:18:25 | fruch | create |