Title: C implementation of xml.etree.ElementTree does not make a copy of attrib argument when creating new Element
Type: behavior Stage: patch review
Components: Library (Lib), XML Versions: Python 3.9, Python 3.8, Python 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, eli.bendersky, gphemsley, mdk, p-ganssle, r.david.murray, scoder, serhiy.storchaka, thatiparthy
Priority: normal Keywords: patch

Created on 2019-04-21 00:07 by gphemsley, last changed 2019-04-21 21:41 by gphemsley.

Pull Requests
URL Status Linked Edit
PR 12891 gphemsley, 2019-04-21 21:41
Messages (4)
msg340590 - (view) Author: Gordon P. Hemsley (gphemsley) * Date: 2019-04-21 00:07
In the process of investigating and writing tests for issue32424, I discovered that the C implementation of xml.etree.ElementTree does not make a copy of the attrib argument when creating a new element, allowing the attributes of the element to be modified outside of creation.

The Python implementation does not have this problem.
msg340591 - (view) Author: Gordon P. Hemsley (gphemsley) * Date: 2019-04-21 00:09
My proposed solution to this was to make a copy of the attrib dictionary in the create_new_element() method in the C implementation, which solves the problem. However, this was apparently objected to on the grounds of performance.

Not knowing C very well, any guidance on a better way to fix this would be appreciated. :)
msg340598 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-21 06:23
Instead of always copying the dict in create_new_element(), we should make sure that all code that calls that function (directly or indirectly) does so with a safely owned dict. If that means that we need to add dict copying in some other place, then that's what's needed to fix this bug.

The best fix is to make the copy close to the user input, where we know what we received and how we will continue to use it.
msg340599 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-04-21 06:28
Given that this has probably been like this forever, and thus provably doesn't hurt very much, I would argue that it's not worth fixing this in Py3.7. Not sure about Py2.7, but I think the answer there should be a no, too.
Date User Action Args
2019-04-21 21:41:30gphemsleysetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request12823
2019-04-21 06:28:43scodersetstage: needs patch
messages: + msg340599
versions: - Python 3.5, Python 3.6, Python 3.7
2019-04-21 06:23:34scodersetmessages: + msg340598
2019-04-21 00:09:23gphemsleysetmessages: + msg340591
2019-04-21 00:07:40gphemsleycreate