Issue 24238: Avoid entity expansion attacks in Element Tree

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/68426

classification

Title:	Avoid entity expansion attacks in Element Tree
Type:	security	Stage:	patch review
Components:	XML	Versions:	Python 3.6

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	christian.heimes, martin.panter, serhiy.storchaka
Priority:	normal	Keywords:	patch

Created on 2015-05-19 11:17 by martin.panter, last changed 2022-04-11 14:58 by admin.

Files
File name	Uploaded	Description	Edit
etree_20130519.patch	martin.panter, 2015-05-19 11:17		review
etree-entities.v2.patch	martin.panter, 2016-06-04 06:48		review

Messages (2)
msg243577 - (view)	Author: Martin Panter (martin.panter) *	Date: 2015-05-19 11:17
This patch could be the basis of an alternative to Christian Heimes’s patch in Issue 17239. It adds a parser flag to the Element Tree modules so that they will immediately raise an exception when an entity declaration is encountered. I believe this should be sufficient to avoid DOS vulnerabilities like the Billion Laughs attack, where a small XML entity reference expands into a large string, and/or involves a large number of entity expansions. I think the advantage of this patch over the patch in Issue 17239 is this one should work on the current Expat library (which I understand Python can load externally). The other patch modifies the Expat library itself, so would only be useful when Python’s internal Expat library is being used (or the external Expat library was also patched in a similar manner). The disadvantage of this patch is that it disables handling XML data as soon as an entity is declared, even if the entities are not actually used, or they are only used in a non-malicious way. The other patch allows a limited amount of entity expansion. I would like some feedback on: * What others think of the basic approach, compared with Christian’s approach in Issue 17239 * If reject_entities=True should be switched on by default, which could break compatibility, but could be sensible for most cases of basic XML parsing * If my changes to the examples in the documentation are excessive * If other Element Tree APIs should be modified similarly to XMLParser So far I have only changed the XMLParser class. The following APIs accept a parser object, so can also avoid the vulnerability by passing a custom parser object: * fromstringlist() * iterparse(), though “parser” is listed as deprecated (by Issue 17741) * parse() (module-level function) * XML() * XMLID() * ElementTree.parse() (method of ElementTree class) These APIs don’t have a custom parser object, so they are still always vulnerable: * fromstring() * XMLPullParser
msg267238 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-06-04 06:48
Today I discovered that Christian’s defusedxml project already does the same sort of thing. The difference is he calls the parameter forbid_entities. So I have updated my patch and changed the name from reject_entities to forbid_entities for compatibility.

History
Date	User	Action	Args
2022-04-11 14:58:17	admin	set	github: 68426
2016-06-12 12:11:24	martin.panter	link	issue17239 dependencies
2016-06-04 06:48:58	martin.panter	set	files: + etree-entities.v2.patch messages: + msg267238 versions: + Python 3.6, - Python 3.5
2015-12-08 11:48:03	serhiy.storchaka	set	nosy: + serhiy.storchaka stage: patch review
2015-05-19 11:17:59	martin.panter	create