
classification
Title: various refleaks in _elementtree, and crashes when using an uninitialized XMLParser object
Type: crash
Stage: resolved
Components: XML
Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Oren Milman, eli.bendersky, miss-islington, scoder, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2017-10-11 14:53 by Oren Milman, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 3956 closed Oren Milman, 2017-10-11 17:05
PR 3997 merged Oren Milman, 2017-10-14 19:42
PR 19485 merged miss-islington, 2020-04-12 14:37
PR 19486 closed miss-islington, 2020-04-12 15:20
PR 19487 merged miss-islington, 2020-04-12 16:10
Messages (9)
msg304145 - Author: Oren Milman (Oren Milman) * Date: 2017-10-11 14:53
The following code results in refleaks (sys.gettotalrefcount() is only available in debug builds of CPython):
import sys
import _elementtree
builder = _elementtree.TreeBuilder()
parser = _elementtree.XMLParser(target=builder)

refcount_before = sys.gettotalrefcount()
parser.__init__(target=builder)
print(sys.gettotalrefcount() - refcount_before)  # should be close to 0

This is because _elementtree_XMLParser___init___impl()
(in Modules/_elementtree.c) doesn't decref before assigning to fields of
`self`.
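
On a release build, where sys.gettotalrefcount() is not available, a rough alternative is to watch the reference count of one specific object across the call. The sketch below is only an illustration of that idea: `refcount_delta`, `LeakyHolder` and `CleanHolder` are hypothetical names, and the two holder classes are pure-Python stand-ins that mimic an `__init__` which does or does not release what it held before. They are not the _elementtree code.

```python
import sys


def refcount_delta(obj, action):
    """Return the change in sys.getrefcount(obj) caused by calling action()."""
    before = sys.getrefcount(obj)
    action()
    return sys.getrefcount(obj) - before


class LeakyHolder:
    """Hypothetical stand-in: re-running __init__ never releases the old target."""

    _never_released = []  # plays the role of references that were not decref'ed

    def __init__(self, target):
        old = getattr(self, "target", None)
        if old is not None:
            LeakyHolder._never_released.append(old)  # old reference stays alive
        self.target = target


class CleanHolder:
    """Hypothetical stand-in: rebinding the attribute drops the old reference."""

    def __init__(self, target):
        self.target = target


builder = object()  # any ordinary (non-immortal) object can be tracked
leaky = LeakyHolder(builder)
clean = CleanHolder(builder)

# Re-run __init__ on already-initialized objects, as in the report above.
leak = refcount_delta(builder, lambda: leaky.__init__(builder))
no_leak = refcount_delta(builder, lambda: clean.__init__(builder))
print(leak, no_leak)
```

The same helper could presumably be pointed at the reported case, e.g. `refcount_delta(builder, lambda: parser.__init__(target=builder))`, where a positive result would suggest that the old `target` reference was not released.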


The following code also results in refleaks:
import sys
import _elementtree
elem = _elementtree.Element(42)
elem.__setstate__({'tag': 42, '_children': list(range(1000))})

refcount_before = sys.gettotalrefcount()
elem.__setstate__({'tag': 42, '_children': []})
print(sys.gettotalrefcount() - refcount_before)  # should be close to -1000

This is because element_setstate_from_attributes() doesn't decref the old
children before storing the new children.
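
Another way to check that replaced children are really freed, without a debug build, is to hold a weak reference to one of the old children and see whether it dies. This is a minimal sketch under stated assumptions: it uses the public xml.etree.ElementTree API and slice assignment instead of `__setstate__`, so it shows the checking technique and not the bug itself, and `old_child_released`, `drop_all` and `keep_all` are hypothetical helper names.

```python
import gc
import weakref
from xml.etree import ElementTree as ET


def old_child_released(replace_children):
    """Return True if a child is freed once replace_children(parent) has run."""
    parent = ET.Element("root")
    child = ET.SubElement(parent, "child")
    probe = weakref.ref(child)  # a weak reference does not keep the child alive
    del child                   # the parent now holds the only strong reference
    replace_children(parent)
    gc.collect()                # in case a reference cycle delays deallocation
    return probe() is None


def drop_all(parent):
    parent[:] = []  # slice assignment replaces the whole child list


def keep_all(parent):
    pass  # control case: the children are left untouched


released = old_child_released(drop_all)
kept = not old_child_released(keep_all)
print(released, kept)
```

If the container forgot to release its old children, `probe()` would keep returning the child even after the replacement.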


I will open a PR to fix this soon.
msg304322 - Author: Oren Milman (Oren Milman) * Date: 2017-10-13 07:59
Shame on me. I only now found out that Serhiy already mentioned most of the refleaks
in https://bugs.python.org/issue31455#msg302103.
msg304396 - Author: Oren Milman (Oren Milman) * Date: 2017-10-14 16:39
Following Serhiy's advice (https://bugs.python.org/issue31455#msg304338),
this issue now also covers some crashes in _elementtree:


The following code crashes:
import _elementtree
parser = _elementtree.XMLParser.__new__(_elementtree.XMLParser)
parser.close()

This is because _elementtree_XMLParser_close_impl() assumes that the XMLParser
object is initialized, and so it passes `self` to expat_parse(), which assumes
that `self->parser` is valid, and crashes.
Similarly, calling feed(), _parse_whole() or _setevents(), or reading the
`entity` or `target` attribute of an uninitialized XMLParser object would
result in a crash.
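
For interpreters that already include a fix for this, the behaviour to look for is a Python-level exception instead of a crash. The sketch below probes that through the public xml.etree.ElementTree module. `close_without_init` is a hypothetical helper name, and the exact exception type is an assumption (I believe it is ValueError, but this thread does not say so), which is why the code catches broadly.

```python
from xml.etree import ElementTree as ET


def close_without_init():
    """Call close() on a parser whose __init__ never ran; return what was raised."""
    # WARNING: on an interpreter without the fix, this call may crash the process.
    parser = ET.XMLParser.__new__(ET.XMLParser)  # allocate only, skip __init__
    try:
        parser.close()
    except Exception as exc:  # exact type is an assumption, so catch broadly
        return exc
    return None


outcome = close_without_init()
print(type(outcome).__name__)
```

Run this only on a build you expect to be patched, since the unpatched behaviour described above is a hard crash that no `try` block can catch.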


It seems to me that PR 3956 is more complex and already not so small, so I will
soon open a separate PR to fix these crashes.
msg366248 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 13:52
This has been pending for far too long, but the fixes look good to me. They should still go into at least Py3.8 and later.
msg366252 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 14:36
New changeset 402e1cdb132f384e4dcde7a3d7ec7ea1fc7ab527 by Oren Milman in branch 'master':
bpo-31758: Prevent crashes when using an uninitialized _elementtree.XMLParser object (GH-3997)
https://github.com/python/cpython/commit/402e1cdb132f384e4dcde7a3d7ec7ea1fc7ab527
msg366253 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 15:19
New changeset 61511488cf4e7a1cb57a38efba7e0a84a387fe58 by Miss Islington (bot) in branch '3.8':
bpo-31758: Prevent crashes when using an uninitialized _elementtree.XMLParser object (GH-3997) (GH-19485)
https://github.com/python/cpython/commit/61511488cf4e7a1cb57a38efba7e0a84a387fe58
msg366254 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 15:22
Let's add it to the last bug fix release of 3.7 as well. It fixes a crash bug, after all.
msg366255 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 17:15
New changeset 096e41aa4e558b28b7260fe01eb21414b1458b20 by Miss Islington (bot) in branch '3.7':
[3.7] bpo-31758: Prevent crashes when using an uninitialized _elementtree.XMLParser object (GH-3997) (GH-19487)
https://github.com/python/cpython/commit/096e41aa4e558b28b7260fe01eb21414b1458b20
msg366256 - Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-04-12 17:16
Thanks for the fix, Oren!
History
Date  User  Action  Args
2022-04-11 14:58:53  admin  set  github: 75939
2020-04-12 17:16:54  scoder  set  status: open -> closed
    resolution: fixed
    messages: + msg366256
    stage: patch review -> resolved
2020-04-12 17:15:41  scoder  set  messages: + msg366255
2020-04-12 16:10:53  miss-islington  set  stage: backport needed -> patch review
    pull_requests: + pull_request18840
2020-04-12 15:22:10  scoder  set  stage: patch review -> backport needed
    messages: + msg366254
    versions: + Python 3.7
2020-04-12 15:20:24  miss-islington  set  pull_requests: + pull_request18839
2020-04-12 15:19:01  scoder  set  messages: + msg366253
2020-04-12 14:37:01  miss-islington  set  nosy: + miss-islington
    pull_requests: + pull_request18838
2020-04-12 14:36:48  scoder  set  messages: + msg366252
2020-04-12 13:52:37  scoder  set  messages: + msg366248
    versions: + Python 3.8, Python 3.9, - Python 2.7, Python 3.6, Python 3.7
2017-10-14 19:42:49  Oren Milman  set  pull_requests: + pull_request3972
2017-10-14 16:39:36  Oren Milman  set  type: resource usage -> crash
    messages: + msg304396
    title: various refleaks in _elementtree -> various refleaks in _elementtree, and crashes when using an uninitialized XMLParser object
2017-10-13 07:59:13  Oren Milman  set  messages: + msg304322
2017-10-11 17:05:26  Oren Milman  set  keywords: + patch
    stage: needs patch -> patch review
    pull_requests: + pull_request3931
2017-10-11 15:55:40  serhiy.storchaka  set  nosy: + scoder, eli.bendersky, serhiy.storchaka
    stage: needs patch
    versions: + Python 2.7, Python 3.6
2017-10-11 14:53:39  Oren Milman  create