Message358154
TLDR:
If I place "\r" in an Element attribute, it is handled and escaped as "&#10;" in the XML file. But wait - \r is not really code 10, right?
Real description:
If I create an ElementTree and read the attribute right after creation, I get back what I put there: "\r". But if I save the tree and reload it, the value turns into "\n". The character is incorrectly converted before being escaped, so the saved XML file stores the wrong value.
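The parser side can be checked in isolation. The sketch below assumes only the standard xml.etree.ElementTree module and two hand-written documents (not files produced by the repro). A character reference inside an attribute comes back as the character it names, so a file that stores code 10 can only ever be read back as "\n".

```python
import xml.etree.ElementTree as ET

# Hand-written documents; the attribute name "Attt" matches the repro.
with_cr_ref = ET.fromstring('<TEST Attt="a&#13;b" />')
with_lf_ref = ET.fromstring('<TEST Attt="a&#10;b" />')

# Character references are resolved to the characters they name.
cr_value = with_cr_ref.get("Attt")
lf_value = with_lf_ref.get("Attt")

print(repr(cr_value))  # 'a\rb'
print(repr(lf_value))  # 'a\nb'
```

So the loss most likely happens when the file is written, not when it is read.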
Quick repro:
# python3 -i
Python 3.8.0 (default, Oct 25 2019, 06:23:40) [GCC 9.2.0 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.etree.ElementTree as ET
>>> elem = ET.Element('TEST')
>>> elem.set("Attt", "a\x0db")
>>> tree = ET.ElementTree(elem)
>>> with open("_test1.xml", "wb") as xml_fh:
...     tree.write(xml_fh, encoding='utf-8', xml_declaration=True)
...
>>> tree.getroot().get("Attt")
'a\rb'
>>> tree = ET.parse("_test1.xml")
>>> tree.getroot().get("Attt")
'a\nb'
>>>
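To see what actually gets written without opening _test1.xml, the same element can be serialized in memory. This is a minimal sketch; the variable names are mine. Which reference shows up may depend on the Python version, so no fixed output is claimed here.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("TEST")
elem.set("Attt", "a\x0db")

# Serialize in memory instead of writing _test1.xml.
data = ET.tostring(elem, encoding="utf-8")
print(data)

# Which reference was written: code 10 (LF) or code 13 (CR)?
wrote_lf = b"&#10;" in data
wrote_cr = b"&#13;" in data
print("LF reference:", wrote_lf, "CR reference:", wrote_cr)
```

On the versions tested in this report I would expect the LF reference to be the one reported, which matches the reloaded value above.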
Related issue: https://bugs.python.org/issue5752
(keeping this one separate, as it seems to be a simple bug that is easy to fix outside of the discussion there)
If there's a good workaround - please let me know.
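One possible workaround, offered as a sketch and not as an official recipe: swap each "\r" in attribute values for a placeholder before writing, then replace the placeholder with a CR character reference in the serialized bytes. The helper name write_preserving_cr and the placeholder U+E000 are hypothetical, and the sketch assumes that character never occurs in the real data. It only covers attribute values, not element text.

```python
import copy
import io
import xml.etree.ElementTree as ET

# Hypothetical placeholder: a private-use character assumed absent from the data.
PLACEHOLDER = "\ue000"


def write_preserving_cr(tree, fh):
    """Write tree to the binary file object fh, emitting CR in attributes as &#13;."""
    # Work on a copy so the caller's tree is left untouched.
    root = copy.deepcopy(tree.getroot())
    for node in root.iter():
        for name, value in list(node.attrib.items()):
            if "\r" in value:
                node.set(name, value.replace("\r", PLACEHOLDER))

    # Serialize to memory, then patch the placeholder bytes.
    buf = io.BytesIO()
    ET.ElementTree(root).write(buf, encoding="utf-8", xml_declaration=True)
    fh.write(buf.getvalue().replace(PLACEHOLDER.encode("utf-8"), b"&#13;"))


elem = ET.Element("TEST")
elem.set("Attt", "a\x0db")

out = io.BytesIO()
write_preserving_cr(ET.ElementTree(elem), out)

# Reload from the patched bytes and check the attribute.
reloaded = ET.fromstring(out.getvalue())
print(repr(reloaded.get("Attt")))  # 'a\rb'
```

The byte-level replace relies on UTF-8 output, where the placeholder's three-byte sequence cannot appear as part of another character.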
Tested on Windows, Python 3.8 and 3.6.
Date | User | Action | Args
2019-12-09 23:40:50 | mefistotelis | set | recipients: + mefistotelis
2019-12-09 23:40:50 | mefistotelis | set | messageid: <1575934850.24.0.848331688216.issue39011@roundup.psfhosted.org>
2019-12-09 23:40:50 | mefistotelis | link | issue39011 messages
2019-12-09 23:40:49 | mefistotelis | create |