classification
Title: various refleaks in _elementtree, and crashes when using an uninitialized XMLParser object
Type: crash
Stage: patch review
Components: XML
Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Oren Milman, eli.bendersky, scoder, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2017-10-11 14:53 by Oren Milman, last changed 2017-10-14 19:42 by Oren Milman.

Pull Requests
URL Status Linked
PR 3956 open Oren Milman, 2017-10-11 17:05
PR 3997 open Oren Milman, 2017-10-14 19:42
Messages (3)
msg304145 - Author: Oren Milman, Date: 2017-10-11 14:53
The following code results in refleaks (sys.gettotalrefcount() is only
available in debug builds of CPython):
import sys
import _elementtree
builder = _elementtree.TreeBuilder()
parser = _elementtree.XMLParser(target=builder)

refcount_before = sys.gettotalrefcount()
parser.__init__(target=builder)
print(sys.gettotalrefcount() - refcount_before)  # should be close to 0

This is because _elementtree_XMLParser___init___impl()
(in Modules/_elementtree.c) doesn't decref the old values before assigning
new ones to the fields of `self`, so each repeated call to __init__() leaks
the previous values.
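
On a release build, where sys.gettotalrefcount() is unavailable, a rough
per-object check can be sketched with sys.getrefcount(). The helper names
below (refcount_delta, xmlparser_reinit_delta) are hypothetical and not part
of CPython; the sketch only measures how far one object's refcount drifts
when an action is repeated, and it assumes the _elementtree C module can be
imported.

```python
import sys


def refcount_delta(obj, action, repeats=100):
    """Return the change in obj's refcount after calling action() repeatedly.

    Hypothetical helper for illustration. A well-behaved action leaves the
    count unchanged; a result that grows with `repeats` hints at a leak.
    """
    before = sys.getrefcount(obj)
    for _ in range(repeats):
        action()
    return sys.getrefcount(obj) - before


def xmlparser_reinit_delta(repeats=100):
    """Apply the helper to the reproducer above (needs the C accelerator)."""
    import _elementtree
    builder = _elementtree.TreeBuilder()
    parser = _elementtree.XMLParser(target=builder)
    # Re-running __init__ should not change how many references to
    # `builder` exist once the old field values are released properly.
    return refcount_delta(
        builder, lambda: parser.__init__(target=builder), repeats)
```

A result of 0 from xmlparser_reinit_delta() would suggest the target is
released correctly on re-initialization; a positive result that scales with
`repeats` would be consistent with the leak described here.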


The following code also results in refleaks:
import sys
import _elementtree
elem = _elementtree.Element(42)
elem.__setstate__({'tag': 42, '_children': list(range(1000))})

refcount_before = sys.gettotalrefcount()
elem.__setstate__({'tag': 42, '_children': []})
print(sys.gettotalrefcount() - refcount_before)  # should be close to -1000

This is because element_setstate_from_attributes() doesn't decref the old
children before storing the new children.
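
One way to observe this without a debug build is to hold weak references to
the old children and count how many survive the second __setstate__() call.
This is only a sketch under some assumptions: the function names are
hypothetical, Element children are used instead of ints because ints cannot
be weakly referenced, and the C accelerated Element (which exposes
__setstate__()) must be available.

```python
import gc
import weakref


def count_alive(refs):
    """Count weak references whose target object still exists."""
    return sum(1 for ref in refs if ref() is not None)


def surviving_children_after_setstate(n=1000):
    """Return how many old children outlive a second __setstate__() call."""
    # Assumption: this resolves to the C implementation of Element.
    from xml.etree.ElementTree import Element
    elem = Element('root')
    children = [Element('child') for _ in range(n)]
    elem.__setstate__({'tag': 'root', '_children': children})
    refs = [weakref.ref(child) for child in children]
    del children  # elem should now hold the only strong references
    elem.__setstate__({'tag': 'root', '_children': []})
    gc.collect()
    return count_alive(refs)
```

If the old children are released, surviving_children_after_setstate() should
return 0; a value near `n` would match the leak described above.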


I will open a PR to fix this soon.
msg304322 - Author: Oren Milman, Date: 2017-10-13 07:59
Shame on me. I only now found out that Serhiy already mentioned most of the refleaks
in https://bugs.python.org/issue31455#msg302103.
msg304396 - Author: Oren Milman, Date: 2017-10-14 16:39
Following Serhiy's advice (https://bugs.python.org/issue31455#msg304338),
this issue now also covers some crashes in _elementtree:


The following code crashes:
import _elementtree
parser = _elementtree.XMLParser.__new__(_elementtree.XMLParser)
parser.close()

This is because _elementtree_XMLParser_close_impl() assumes that the XMLParser
object is initialized, and so it passes `self` to expat_parse(), which assumes
that `self->parser` is valid, and crashes.
Similarly, calling feed(), _parse_whole() or _setevents(), or reading the
`entity` or `target` attribute of an uninitialized XMLParser object would
result in a crash.
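
When trying such reproducers, it can help to run them in a child interpreter
so that a crash does not take down the session. The following is a minimal
sketch using the standard subprocess module; run_isolated and
CLOSE_UNINITIALIZED are hypothetical names introduced here for illustration.

```python
import subprocess
import sys


def run_isolated(source, timeout=60):
    """Run `source` in a fresh interpreter and return its exit code.

    On POSIX, a child killed by a signal (for example a segfault) is
    reported as a negative return code; 0 means the code ran cleanly.
    """
    proc = subprocess.run([sys.executable, '-c', source], timeout=timeout)
    return proc.returncode


# The reproducer from this message, as a string for the child interpreter.
CLOSE_UNINITIALIZED = (
    "import _elementtree\n"
    "parser = _elementtree.XMLParser.__new__(_elementtree.XMLParser)\n"
    "parser.close()\n"
)
```

Calling run_isolated(CLOSE_UNINITIALIZED) gives a non-zero code both for a
crash and for an ordinary uncaught exception (which exits with code 1), so
the code alone distinguishes "crashed" from "raised" only by its value.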


ISTM that PR 3956 is already fairly complex and not so small, so I will soon
open a separate PR to fix these crashes.
History
Date User Action Args
2017-10-14 19:42:49 Oren Milman set pull_requests: + pull_request3972
2017-10-14 16:39:36 Oren Milman set type: resource usage -> crash
    messages: + msg304396
    title: various refleaks in _elementtree -> various refleaks in _elementtree, and crashes when using an uninitialized XMLParser object
2017-10-13 07:59:13 Oren Milman set messages: + msg304322
2017-10-11 17:05:26 Oren Milman set keywords: + patch
    stage: needs patch -> patch review
    pull_requests: + pull_request3931
2017-10-11 15:55:40 serhiy.storchaka set nosy: + scoder, eli.bendersky, serhiy.storchaka
    stage: needs patch
    versions: + Python 2.7, Python 3.6
2017-10-11 14:53:39 Oren Milman create