classification
Title: ElementTree.XMLParser() mishandles exceptions
Type: behavior Stage: patch review
Components: Extension Modules Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eli.bendersky, haypo, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-09-13 17:28 by scoder, last changed 2017-09-14 23:13 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 3545 merged scoder, 2017-09-13 17:29
PR 3585 merged python-dev, 2017-09-14 20:00
Messages (6)
msg302101 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-13 17:28
The "XMLParser.__init__()" method in "_elementtree.c" contains this code:

    self->handle_start = PyObject_GetAttrString(target, "start");
    self->handle_data = PyObject_GetAttrString(target, "data");
    self->handle_end = PyObject_GetAttrString(target, "end");
    self->handle_comment = PyObject_GetAttrString(target, "comment");
    self->handle_pi = PyObject_GetAttrString(target, "pi");
    self->handle_close = PyObject_GetAttrString(target, "close");
    self->handle_doctype = PyObject_GetAttrString(target, "doctype");
    PyErr_Clear();

This ignores all exceptions, not only AttributeError.
It also passes live exceptions into the later lookup calls, which may execute arbitrary user code.
msg302103 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-13 18:20
Oh, I see many other issues with C implementation of ElementTree.

1. If XMLParser.__init__ is called twice, it leaks references and the Expat parser.

Possible solution: use the Py_XSETREF() macro instead of simple assignment. The Expat parser needs special handling.

Other possible solution: drop __init__() and make all initialization in __new__(). But this is a solution only for 3.7 because can break compatibility.

2. If XMLParser.__init__ is not called or if it fails to initialize the Expat parser, self->entity and self->target are NULL. Later in xmlparser_getattro() they are increfed unconditionally.
msg302107 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-13 19:13
Feel free to provide a separate pull request. These issues seem independent of the exception handling problem that I wrote a fix for.
msg302169 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-14 12:11
I agree. While these issues are not totally independent (they changes the same code), my issue needs uncommon use of __new__ and __init__, while your issue can be exposed in normal cases.
msg302205 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-14 20:00
New changeset c8d8e15bfc24abeeaaf3d8be9073276b0c011cdf by Serhiy Storchaka (scoder) in branch 'master':
bpo-31455: Fix an assertion failure in ElementTree.XMLParser(). (#3545)
https://github.com/python/cpython/commit/c8d8e15bfc24abeeaaf3d8be9073276b0c011cdf
msg302224 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-14 23:13
New changeset 49caab46f687eb201898fb6c2c40d47bdcb0e58b by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-31455: Fix an assertion failure in ElementTree.XMLParser(). (GH-3545) (#3585)
https://github.com/python/cpython/commit/49caab46f687eb201898fb6c2c40d47bdcb0e58b
History
Date User Action Args
2017-09-14 23:13:23serhiy.storchakasetmessages: + msg302224
2017-09-14 20:00:15python-devsetpull_requests: + pull_request3576
2017-09-14 20:00:05serhiy.storchakasetmessages: + msg302205
2017-09-14 12:11:56serhiy.storchakasetmessages: + msg302169
2017-09-13 19:13:53scodersetmessages: + msg302107
2017-09-13 18:20:42serhiy.storchakasetmessages: + msg302103
2017-09-13 17:33:04serhiy.storchakasetnosy: + haypo, eli.bendersky, serhiy.storchaka

versions: + Python 2.7, Python 3.6
2017-09-13 17:29:13scodersetkeywords: + patch
stage: patch review
pull_requests: + pull_request3542
2017-09-13 17:28:39scodercreate