classification
Title: Pure Python xml.etree.ElementTree is missing default attribute values
Type: behavior Stage: resolved
Components: XML Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: corona10, eli.bendersky, mattip, obfusk, rhettinger, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-10-26 03:51 by obfusk, last changed 2021-02-24 02:28 by corona10. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 22987 merged obfusk, 2020-10-26 16:01
Messages (10)
msg379637 - (view) Author: Felix C. Stegerman (obfusk) * Date: 2020-10-26 03:51
I originally reported this as a bug in PyPy, but it turns out that CPython's C implementation (_elementtree) behaves differently from the pure Python version (because the pure Python version sets specified_attributes = 1).

PyPy issue with example code: https://foss.heptapod.net/pypy/pypy/-/issues/3333
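
A minimal sketch of the kind of input that shows the difference. It is not the example from the PyPy issue: the document and the attribute name "kind" are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares a default
# value for the "kind" attribute, and <root/> does not specify it.
DOC = """<!DOCTYPE root [
  <!ATTLIST root kind CDATA "default">
]>
<root/>"""

root = ET.fromstring(DOC)

# With the C accelerator the declared default is part of root.attrib.
# The pure Python parser described in this report left it out.
print(root.attrib)
```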
msg379672 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-10-26 16:32
The patch looks right. I'm not sure if this can still be changed in Py3.8, though, since that has been around for quite a while now.

Admittedly, few people will disable the C accelerator module and thus witness this issue, but for them, this is a breaking change, and some code might rely on the current behaviour. I have no way to tell how much code that is, or whether it relies on the behaviour intentionally.

I'd definitely change this for 3.9 and later. Maybe for 3.8, but it's at least a bit of a risk, given that there will only be very few more minor releases for it, and given that this is how things have been working for years. So, rather not, unless there is a convincing argument for backporting the change.
msg379739 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-27 07:06
specified_attributes = True is also set in xml.dom.expatbuilder. Shouldn't it also be set to true in the C implementation of ElementTree?
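
For reference, specified_attributes is an attribute of pyexpat parser objects. The sketch below shows its effect at the expat level, assuming a made-up document whose internal DTD subset declares a default for a hypothetical "kind" attribute.

```python
from xml.parsers import expat

# Hypothetical document with a DTD-declared default attribute value.
DOC = """<!DOCTYPE root [
  <!ATTLIST root kind CDATA "default">
]>
<root/>"""


def collect_attributes(specified_only):
    """Parse DOC and return the (tag, attrs) pairs reported by expat."""
    seen = []
    parser = expat.ParserCreate()
    # When true, expat reports only attributes written in the document
    # instance, not those derived from attribute declarations.
    parser.specified_attributes = specified_only
    parser.StartElementHandler = lambda tag, attrs: seen.append((tag, attrs))
    parser.Parse(DOC, True)
    return seen


print(collect_attributes(False))
print(collect_attributes(True))
```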
msg379796 - (view) Author: Felix C. Stegerman (obfusk) * Date: 2020-10-27 19:27
> specified_attributes = True is also set in xml.dom.expatbuilder.

That's good to know and should perhaps be addressed as well.

> Should not it be set to true in the C implementation of ElementTree?

That would break existing code.  Including mine.

I also think the current behaviour of the C implementation makes a lot more sense, especially as there is currently no way to request the alternative.

I think using specified_attributes=False as the default behaviour for both implementations is the best solution.  But I certainly would not oppose adding e.g. a keyword argument to override the default behaviour for those who would prefer the alternative.
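
Until such an option exists, one possible workaround is to drive expat directly and feed a TreeBuilder. This is only a sketch: parse_specified_only is a hypothetical helper, not an ElementTree API, and the document is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Hypothetical document with a DTD-declared default attribute value.
DOC = """<!DOCTYPE root [
  <!ATTLIST root kind CDATA "default">
]>
<root/>"""


def parse_specified_only(xml_text):
    """Build an Element tree that omits DTD-supplied default attributes."""
    builder = ET.TreeBuilder()
    parser = expat.ParserCreate()
    parser.specified_attributes = True  # skip attributes derived from the DTD
    parser.StartElementHandler = builder.start
    parser.EndElementHandler = builder.end
    parser.CharacterDataHandler = builder.data
    parser.Parse(xml_text, True)
    return builder.close()


root = parse_specified_only(DOC)
print(root.tag, root.attrib)
```

This bypasses XMLParser, so anything XMLParser adds on top of expat (for example namespace-qualified tag names) is not reproduced here.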
msg379824 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-10-28 08:57
Since the C accelerator is enabled by default, and few people would consider disabling it explicitly, I generally consider the behaviour of the C implementation to be "right" when the two implementations differ.

As a single data point, the reason why the difference was found in this case was differing behaviour in PyPy (which uses only the Python implementation). It was only later found to be a problem on the CPython side.

Changing the behaviour of the C implementation would certainly break a lot more code than changing the Python implementation.
msg383493 - (view) Author: mattip (mattip) * Date: 2020-12-21 08:24
Is there an owner for the XML module who can make a decision? The PR has a test that shows this fix brings the Python implementation into sync with the C implementation, which is, unintuitively, the "reference implementation".
msg383558 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-21 21:46
> Changing the behaviour of the C implementation would 
> certainly break a lot more code than changing the Python
> implementation.

+1 for changing only the Python implementation.
msg387569 - (view) Author: mattip (mattip) * Date: 2021-02-23 13:57
PyPy issue https://foss.heptapod.net/pypy/pypy/-/issues/3181 shows another problem with the pure Python ElementTree implementation that, again, is not present in the C implementation. Is there a code owner for this stdlib module?
msg387601 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-02-24 02:25
New changeset 1f433406bd46fbd00b88223ad64daea6bc9eaadc by Felix C. Stegerman in branch 'master':
bpo-42151: don't set specified_attributes=1 in pure Python ElementTree (GH-22987)
https://github.com/python/cpython/commit/1f433406bd46fbd00b88223ad64daea6bc9eaadc
msg387602 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-02-24 02:28
@obfusk
Thank you Felix for reporting and contributing!
History
Date User Action Args
2021-02-24 02:28:26 corona10 set status: open -> closed; versions: - Python 3.9; messages: + msg387602; resolution: fixed; stage: patch review -> resolved
2021-02-24 02:25:34 corona10 set messages: + msg387601
2021-02-23 13:57:43 mattip set messages: + msg387569
2020-12-21 21:46:05 rhettinger set nosy: + rhettinger; messages: + msg383558
2020-12-21 08:24:49 mattip set nosy: + mattip; messages: + msg383493
2020-10-28 08:57:34 scoder set messages: + msg379824
2020-10-27 19:27:44 obfusk set messages: + msg379796
2020-10-27 07:06:49 serhiy.storchaka set messages: + msg379739
2020-10-26 16:32:45 scoder set messages: + msg379672; components: + XML, - Library (Lib); versions: - Python 3.6, Python 3.7, Python 3.8
2020-10-26 16:01:16 obfusk set keywords: + patch; stage: patch review; pull_requests: + pull_request21901
2020-10-26 05:11:48 corona10 set nosy: + corona10
2020-10-26 03:58:18 xtreak set nosy: + scoder, eli.bendersky, serhiy.storchaka
2020-10-26 03:51:30 obfusk create