This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: ElementTree writes invalid files when UTF-16 encoding is specified
Type: behavior
Stage: resolved
Components: Unicode, XML
Versions: Python 2.7

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: Adam.Urban, eli.bendersky, ezio.melotti, scoder, serhiy.storchaka
Priority: normal
Keywords:

Created on 2013-05-31 06:54 by Adam.Urban, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg190392 - Author: Adam Urban (Adam.Urban) Date: 2013-05-31 06:54
import xml.etree.ElementTree as ET
tree = ET.parse("myinput.xml")
tree.write("myoutput.xml", encoding="utf-16")

The output is a garbled mess, often a mix of UTF-8 and UTF-16 bytes. UTF-8 output works fine, but when UTF-16, UTF-16LE, or UTF-16BE is specified, the output is mangled.
msg190395 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2013-05-31 09:05
For 3.3+ it was fixed in issue1767933.
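
To illustrate the fixed behaviour, here is a minimal sketch assuming Python 3.3 or later. It builds a small tree in memory (the element names and text are made up for illustration, standing in for "myinput.xml"), writes it as UTF-16 to a byte buffer, and checks that the result both decodes cleanly and parses back.

```python
import io
import xml.etree.ElementTree as ET

# Build a small tree in memory; the names and text here are illustrative only.
root = ET.Element("doc")
child = ET.SubElement(root, "item")
child.text = "caf\u00e9"  # non-ASCII text to exercise the encoder

# Write to an in-memory byte buffer instead of "myoutput.xml".
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="utf-16")
data = buf.getvalue()

# If the output is consistent UTF-16, the whole document decodes in one go
# and the parser accepts the raw bytes again.
text = data.decode("utf-16")
reparsed = ET.fromstring(data)

print(text)
print(reparsed.find("item").text)
```

If the output were a mix of UTF-8 and UTF-16 bytes, as described in the report, the decode or the re-parse would be expected to fail or return mangled text.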
msg196736 - Author: Stefan Behnel (scoder) (Python committer) Date: 2013-09-01 20:47
I can well imagine that the serialiser is broken for this in Py2.x, given that the API accepts byte strings and stores them as such. The fix might be as simple as decoding byte strings in the serialiser before writing them out. That would involve a fairly large performance regression, though (and ET's serialiser is known to be rather slow anyway).

Not sure if the current behaviour should be changed in 2.x.

In any case, it's a duplicate of the other ticket, which was *not* fixed for 2.7.
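
A possible user-side workaround in the same spirit can be sketched as follows. This is not the serialiser fix discussed above: it decodes the whole serialised document rather than individual byte strings, and the helper name `to_utf16_bytes` is hypothetical. It is written for Python 3 and is an assumption-laden sketch; on 2.7, trees that hold non-ASCII byte strings may still fail during the UTF-8 step.

```python
import xml.etree.ElementTree as ET


def to_utf16_bytes(root):
    """Hypothetical helper: serialise via UTF-8, then re-encode as UTF-16."""
    # Serialise to UTF-8 bytes first; for UTF-8, ElementTree omits the
    # XML declaration by default, so one is added by hand below.
    utf8_bytes = ET.tostring(root, encoding="utf-8")
    # Decode so that the whole document is a single text string.
    body = utf8_bytes.decode("utf-8")
    declaration = '<?xml version="1.0" encoding="utf-16"?>\n'
    # The "utf-16" codec prepends a byte order mark when encoding.
    return (declaration + body).encode("utf-16")


# Illustrative input; the element name and text are made up.
root = ET.Element("doc")
root.text = "caf\u00e9"
data = to_utf16_bytes(root)
print(data.decode("utf-16"))
```

The extra decode and encode pass over the full document costs time and memory, which is the same kind of performance trade-off raised above.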
msg196745 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2013-09-01 22:33
Given that this bug was not fixed even in 3.2, where fixing it would have been easier, I doubt that it is worth fixing in 2.7.
msg196754 - Author: Eli Bendersky (eli.bendersky) (Python committer) Date: 2013-09-01 23:45
What Serhiy said.
History
Date                 User              Action  Args
2022-04-11 14:57:46  admin             set     github: 62305
2013-09-01 23:45:20  eli.bendersky     set     status: open -> closed; resolution: wont fix; messages: + msg196754; stage: resolved
2013-09-01 22:33:45  serhiy.storchaka  set     messages: + msg196745
2013-09-01 20:47:03  scoder            set     messages: + msg196736
2013-09-01 19:28:37  serhiy.storchaka  set     nosy: + scoder
2013-05-31 09:05:55  serhiy.storchaka  set     nosy: + eli.bendersky, serhiy.storchaka; messages: + msg190395; versions: - Python 2.6, 3rd party, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
2013-05-31 06:54:16  Adam.Urban        create