Title: Pure Python xml.etree.ElementTree is missing default attribute values
Type: behavior Stage: patch review
Components: XML Versions: Python 3.10, Python 3.9
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: corona10, eli.bendersky, obfusk, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-10-26 03:51 by obfusk, last changed 2020-10-28 08:57 by scoder.

Pull Requests
URL Status Linked Edit
PR 22987 open obfusk, 2020-10-26 16:01
Messages (5)
msg379637 - (view) Author: Felix C. Stegerman (obfusk) * Date: 2020-10-26 03:51
I originally reported this as a bug in PyPy, but it turns out that CPython's C implementation (_elementtree) behaves differently from the pure Python version (because the pure Python version sets specified_attributes = 1).
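
For illustration (this is not the example code from the PyPy issue), here is a minimal sketch of what the specified_attributes flag does at the pyexpat level. The document, the helper name collect_attributes and the attribute names are assumptions made up for this sketch; only expat.ParserCreate, StartElementHandler, Parse and specified_attributes come from the standard library.

```python
from xml.parsers import expat

# Hypothetical document: the internal DTD subset declares a default
# value for "lang", while "id" is written out on the element itself.
DOC = b"""<!DOCTYPE root [
  <!ATTLIST root lang CDATA "en">
]>
<root id="1"/>"""


def collect_attributes(specified_only):
    """Parse DOC and return {element name: attributes} as expat reports them."""
    seen = {}
    parser = expat.ParserCreate()
    # When true, expat reports only attributes present in the document
    # instance and omits those derived from attribute declarations.
    parser.specified_attributes = specified_only

    def start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(DOC, True)
    return seen


print(collect_attributes(False))  # "id" plus the defaulted "lang"
print(collect_attributes(True))   # "id" only
```

With the flag off, the defaulted lang="en" is reported alongside id; with the flag on, it is dropped, which is the difference described above.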

PyPy issue with example code:
msg379672 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-10-26 16:32
The patch looks right. I'm not sure if this can still be changed in Py3.8, though, since that has been around for quite a while now.

Admittedly, few people will disable the C accelerator module and thus witness this issue, but for them, this is a breaking change, and some code might rely on the current behaviour. I have no way to tell how much, and whether it intentionally relies on it.

I'd definitely change this for 3.9 and later. Maybe for 3.8, but it's at least a bit of a risk, given that there will only be very few more minor releases for it, and given that this is how things have been working for years. So, rather not, unless there is a convincing argument for backporting the change.
msg379739 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-27 07:06
specified_attributes = True is also set in xml.dom.expatbuilder. Should not it be set to true in the C implementation of ElementTree?
msg379796 - (view) Author: Felix C. Stegerman (obfusk) * Date: 2020-10-27 19:27
> specified_attributes = True is also set in xml.dom.expatbuilder.

That's good to know and should perhaps be addressed as well.

> Should not it be set to true in the C implementation of ElementTree?

That would break existing code.  Including mine.

I also think the current behaviour of the C implementation makes a lot more sense, especially as there is currently no way to request the alternative.

I think using specified_attributes=False as the default behaviour for both implementations is the best solution.  But I certainly would not oppose adding e.g. a keyword argument to override the default behaviour for those who would prefer the alternative.
msg379824 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-10-28 08:57
Since the C accelerator is enabled by default, and few people would consider disabling it explicitly, I generally consider the behaviour of the C implementation to be "right" when the two implementations differ.

As a single data point, the reason why the difference was found in this case was differing behaviour in PyPy (which uses only the Python implementation). It was only later found to be a problem on the CPython side.
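
To observe the pure Python behaviour on CPython (PyPy gets it by default), one possible approach is to block the _elementtree accelerator while importing the module. This is only a sketch: the helper name import_pure_python_elementtree and the sample document are hypothetical, and it relies on the fact that a None entry in sys.modules makes the corresponding import fail.

```python
import importlib
import sys


def import_pure_python_elementtree():
    """Return a copy of xml.etree.ElementTree imported without _elementtree."""
    name = "xml.etree.ElementTree"
    saved = {key: sys.modules[key]
             for key in (name, "_elementtree") if key in sys.modules}
    sys.modules.pop(name, None)
    # A None entry makes "import _elementtree" raise ImportError, so the
    # module keeps its pure Python classes instead of the C ones.
    sys.modules["_elementtree"] = None
    try:
        return importlib.import_module(name)
    finally:
        # Put the import state back so other code still gets the usual module.
        sys.modules.pop(name, None)
        sys.modules.pop("_elementtree", None)
        sys.modules.update(saved)
        if name in saved:
            sys.modules["xml.etree"].ElementTree = saved[name]


pure_et = import_pure_python_elementtree()
root = pure_et.fromstring(
    '<!DOCTYPE root [<!ATTLIST root lang CDATA "en">]><root id="1"/>'
)
# Whether the defaulted "lang" shows up here depends on the Python version
# and on whether the fix discussed in this issue is included.
print(sorted(root.attrib))
```

Comparing this output with a plain import of xml.etree.ElementTree on the same interpreter shows whether the two implementations agree.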

Changing the behaviour of the C implementation would certainly break a lot more code than changing the Python implementation.
Date User Action Args
2020-10-28 08:57:34 scoder set messages: + msg379824
2020-10-27 19:27:44 obfusk set messages: + msg379796
2020-10-27 07:06:49 serhiy.storchaka set messages: + msg379739
2020-10-26 16:32:45 scoder set messages: + msg379672; components: + XML, - Library (Lib); versions: - Python 3.6, Python 3.7, Python 3.8
2020-10-26 16:01:16 obfusk set keywords: + patch; stage: patch review; pull_requests: + pull_request21901
2020-10-26 05:11:48 corona10 set nosy: + corona10
2020-10-26 03:58:18 xtreak set nosy: + scoder, eli.bendersky, serhiy.storchaka
2020-10-26 03:51:30 obfusk create