Author scoder
Recipients Mariatta, dfrojas, eli.bendersky, lukasz.langa, matrixise, nedbat, rhettinger, scoder, serhiy.storchaka, sivert, taleinat, vstinner
Date 2019-03-18.17:44:30
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1552931070.91.0.413439137288.issue34160@roundup.psfhosted.org>
In-reply-to
Content
Victor, as much as I appreciate backwards compatibility, I really don't think it's a big deal in this case. In fact, it might not even apply.

My (somewhat educated) gut feeling is that most users simply won't care or won't even notice the change. Out of those who do (or have to) care, many are better off by fixing their code to not rely on an entirely arbitrary (sorted by name) attribute order than by getting the old behaviour back. And for those few who really need attributes to be sorted by name, there's the recipe I posted which works on all ElementTree implementations out there with all alive CPython versions.

> This recipe does modify the document and so changes the behaviour of the application when it iterates on attributes later

This is actually a very rare thing. If I were to make up numbers, I'd guess that some 99% of the applications do XML serialisation as the last thing and then throw away the tree afterwards, without touching it (or its attributes) again. And the remaining cases are most probably covered by the "don't need to care" type of users. I don't think we should optimise for 0.05% of our user base by providing a new API option for them. Especially in ElementTree, which decidedly aims to be simple.

The example that Ned gave refers to a very specific and narrow case: comparing serialised XML, at the byte level, in tests. He was very lucky that ElementTree was so stable over the last 10 Python releases that the output did not change at all. That is not something that an XML library needs to guarantee. There is some ambiguity in XML for everything that's outside of the XML Information set, and there is a good reason why the W3C has tackled this ambiguity with an explicit and *separate* specification: C14N. So, when you write:

> Many XML parsers rely on the order of attributes

It's certainly not many parsers, and could even be close to none. The order of attributes is explicitly excluded from the XML Information set:

https://www.w3.org/TR/xml-infoset/#omitted

Despite this, cases where the order of the attributes matters to the *application* are not unheard of. But for them, attributes sorted by their name are most likely the problem and not a solution. Raymond mentioned one such example. Sorting attributes by their name really only fulfils a single purpose: to get reproducible output in cases where the order does *not* matter. For all the (few) cases where the order *does* matter, it gets in the way.

But by removing the sorting, as this change does, we still get predictable output due to dict ordering. So this use case is still covered. It's just not necessarily the same output as before, because now the ordering is entirely in the hands of the users. Meaning, those users who *do* care can now actually influence the ordering, which was very difficult and hackish to achieve before. We are allowing users to remove these hacks, not forcing them to add new ones.
History
Date User Action Args
2019-03-18 17:44:30scodersetrecipients: + scoder, rhettinger, vstinner, taleinat, nedbat, eli.bendersky, lukasz.langa, serhiy.storchaka, matrixise, sivert, Mariatta, dfrojas
2019-03-18 17:44:30scodersetmessageid: <1552931070.91.0.413439137288.issue34160@roundup.psfhosted.org>
2019-03-18 17:44:30scoderlinkissue34160 messages
2019-03-18 17:44:30scodercreate