classification
Title: Avoid entity expansion attacks in Element Tree
Type: security Stage: patch review
Components: XML Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, martin.panter, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-05-19 11:17 by martin.panter, last changed 2016-06-04 06:48 by martin.panter.

Files
File name Uploaded Description Edit
etree_20130519.patch martin.panter, 2015-05-19 11:17 review
etree-entities.v2.patch martin.panter, 2016-06-04 06:48 review
Messages (2)
msg243577 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-19 11:17
This patch could be the basis of an alternative to Christian Heimes’s patch in Issue 17239. It adds a parser flag to the Element Tree modules so that they will immediately raise an exception when an entity declaration is encountered. I believe this should be sufficient to avoid DOS vulnerabilities like the Billion Laughs attack, where a small XML entity reference expands into a large string, and/or involves a large number of entity expansions.

I think the advantage of this patch over the patch in Issue 17239 is this one should work on the current Expat library (which I understand Python can load externally). The other patch modifies the Expat library itself, so would only be useful when Python’s internal Expat library is being used (or the external Expat library was also patched in a similar manner).

The disadvantage of this patch is that it disables handling XML data as soon as an entity is declared, even if the entities are not actually used, or they are only used in a non-malicious way. The other patch allows a limited amount of entity expansion.

I would like some feedback on:

* What others think of the basic approach, compared with Christian’s approach in Issue 17239
* If reject_entities=True should be switched on by default, which could break compatibility, but could be sensible for most cases of basic XML parsing
* If my changes to the examples in the documentation are excessive
* If other Element Tree APIs should be modified similarly to XMLParser

So far I have only changed the XMLParser class. The following APIs accept a parser object, so can also avoid the vulnerability by passing a custom parser object:

* fromstringlist()
* iterparse(), though “parser” is listed as deprecated (by Issue 17741)
* parse() (module-level function)
* XML()
* XMLID()
* ElementTree.parse() (method of ElementTree class)

These APIs don’t have a custom parser object, so they are still always vulnerable:

* fromstring()
* XMLPullParser
msg267238 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-04 06:48
Today I discovered that Christian’s defusedxml project already does the same sort of thing. The difference is he calls the parameter forbid_entities. So I have updated my patch and changed the name from reject_entities to forbid_entities for compatibility.
History
Date User Action Args
2016-06-12 12:11:24martin.panterlinkissue17239 dependencies
2016-06-04 06:48:58martin.pantersetfiles: + etree-entities.v2.patch

messages: + msg267238
versions: + Python 3.6, - Python 3.5
2015-12-08 11:48:03serhiy.storchakasetnosy: + serhiy.storchaka

stage: patch review
2015-05-19 11:17:59martin.pantercreate