classification
Title: xml.etree.ElementTree.ElementTree.write attribute sorting
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3, Python 3.4
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder:
Assigned To:
Nosy List: bagratte, eli.bendersky, martin.panter, scoder, xtreak
Priority: normal
Keywords:

Created on 2014-01-09 01:35 by bagratte, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg207721 - (view) Author: bagrat lazaryan (bagratte) Date: 2014-01-09 01:35
the xml.etree.ElementTree.ElementTree.write method (and, by extension, the xml.etree.ElementTree.tostring function) sorts attributes in lexical order. while this is a reasonable behavior compared with the arbitrary ordering inherited from an ordinary dict, it prevents a picky user from getting her own custom ordering by passing an OrderedDict to Element, SubElement and the like (i guess there are none). that is to say:

if
-----------------------------------
e = Element("tag", OrderedDict([("a", "a"), ("c", "c"), ("b", "b")]))
-----------------------------------
then both
-----------------------------------
tostring(e)
ElementTree(e).write("xml.xml")
-----------------------------------
will result in
-----------------------------------
<tag a="a" b="b" c="c" />
-----------------------------------
while the intention of the user was
-----------------------------------
<tag a="a" c="c" b="b" />
msg207729 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2014-01-09 07:50
IMHO, it makes sense to support this. My intuition tells me that lxml also handles this as expected, by accident through iteration.

Not sure how to do this correctly in ET, though. Special case dict? Or special case OrderedDict? Both would leave some reasonable use cases uncovered, e.g. dict subclasses that do not impact iteration, and self-implemented OrderedDict-like types. And being too broad in the special casing will certainly kill someone's doctests somewhere...

Given that OrderedDict is the one way to do this in recent Python versions, I guess it would be reasonable to special case on that type.
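
A minimal sketch of what special-casing OrderedDict could look like, purely for illustration: `ordered_attribute_items` is a hypothetical helper, not part of ElementTree, and the real serializer may be structured quite differently. It keeps the caller's order for OrderedDict and falls back to lexical sorting for every other mapping.

```python
from collections import OrderedDict


def ordered_attribute_items(attrib):
    """Hypothetical helper: choose the order in which attributes are written.

    OrderedDict instances (including subclasses) keep their own order;
    any other mapping is sorted lexically, which matches the sorting
    behavior described in this report.
    """
    if isinstance(attrib, OrderedDict):
        return list(attrib.items())
    return sorted(attrib.items())


# An OrderedDict keeps the order the caller chose.
print(ordered_attribute_items(OrderedDict([("a", "a"), ("c", "c"), ("b", "b")])))
# A plain dict is still sorted by attribute name.
print(ordered_attribute_items({"c": "c", "a": "a", "b": "b"}))
```

As noted above, a check like this leaves self-implemented OrderedDict-like types uncovered: they would still be sorted.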
msg207730 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2014-01-09 07:53
> My intuition tells me that lxml also handles this as expected, by accident through iteration.

And, obviously, it doesn't. It sorts, too. :)

I'm ok with switching for both libraries.
msg215401 - (view) Author: bagrat lazaryan (bagratte) Date: 2014-04-02 20:52
well... ElementTree.py imports some c accelerators, as can be seen at the end of the file. i have no idea how to get to those accelerators, and even if i had, i don't think i could make anything of them.
as far as the pure python code in the rest of ElementTree.py is concerned, it suffices not to sort the items in _serialize_xml:

line 929 of ElementTree.py:
-for k, v in sorted(items):  # lexical order
+for k, v in items:

i gather something similar must be done in the c accelerators.

(by the way, does anyone know why i am not receiving email notifications when someone posts to an issue i have started or i have commented to?)
msg329716 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-11-12 05:49
Dictionary insertion order is preserved in Python 3.6 and above (and guaranteed by the language since 3.7). Hence sorting attributes in lexical order was removed in issue34160, and there is a discussion in that same issue about adding an option to produce sorted output.

As part of triaging I propose closing this issue since the changes were made in issue34160.

$ ./python.exe
Python 3.8.0a0 (heads/master:cd449806fa, Nov 12 2018, 09:51:24)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.etree.ElementTree import Element, tostring
>>> from collections import OrderedDict
>>> e = Element("tag", OrderedDict([("a", "a"), ("c", "c"), ("b", "b")]))
>>> tostring(e)
b'<tag a="a" c="c" b="b" />'
>>> e = Element("tag", dict([("a", "a"), ("c", "c"), ("b", "b")]))
>>> tostring(e)
b'<tag a="a" c="c" b="b" />'
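
For code that still relies on the old sorted output, one possible workaround is to reorder each element's attributes before serializing. The sketch below assumes a Python version where serialization follows the order of the attrib dict, as in the session above; `sort_attributes` is an illustrative helper name, not an ElementTree API.

```python
from xml.etree.ElementTree import Element, tostring


def sort_attributes(root):
    """Illustrative helper: put every element's attributes in lexical order."""
    for el in root.iter():
        attrib = el.attrib
        if len(attrib) > 1:
            # Rebuild the dict in sorted order so that insertion order
            # (which the serializer follows) matches lexical order.
            items = sorted(attrib.items())
            attrib.clear()
            attrib.update(items)


e = Element("tag", {"a": "a", "c": "c", "b": "b"})
sort_attributes(e)
print(tostring(e))
```

The helper mutates the tree in place, so it should be called on a copy if the original attribute order still matters elsewhere.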
History
Date | User | Action | Args
2022-04-11 14:57:56 | admin | set | github: 64397
2020-09-08 05:02:49 | scoder | set | status: open -> closed; resolution: duplicate; stage: resolved
2018-11-12 05:49:51 | xtreak | set | nosy: + xtreak; messages: + msg329716
2014-04-17 09:12:28 | martin.panter | set | nosy: + martin.panter
2014-04-02 20:52:31 | bagratte | set | versions: + Python 3.4
2014-04-02 20:52:20 | bagratte | set | messages: + msg215401
2014-01-09 07:53:55 | scoder | set | messages: + msg207730
2014-01-09 07:50:21 | scoder | set | nosy: + scoder, eli.bendersky; messages: + msg207729
2014-01-09 01:35:08 | bagratte | create |