
Author martin.panter
Recipients christian.heimes, martin.panter
Date 2015-05-19.11:17:56
Message-id <1432034279.35.0.728988379999.issue24238@psf.upfronthosting.co.za>
Content
This patch could be the basis of an alternative to Christian Heimes’s patch in Issue 17239. It adds a parser flag to the Element Tree modules so that they immediately raise an exception when an entity declaration is encountered. I believe this should be sufficient to avoid DoS vulnerabilities like the Billion Laughs attack, where a small XML document expands into a very large string through nested entity references, and/or triggers a very large number of entity expansions.
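
The idea can be sketched without the patch by using Expat's existing EntityDeclHandler hook, which is called for every entity declaration. This is only an illustration of the approach, not the patch's implementation: the names EntityDeclarationError and check_no_entities are hypothetical, and the patch may raise a different exception type.

```python
import xml.parsers.expat


class EntityDeclarationError(ValueError):
    """Hypothetical exception type; the patch may use a different one."""


def _reject_entity(name, is_parameter_entity, value, base,
                   system_id, public_id, notation_name):
    # Expat calls this handler once per <!ENTITY ...> declaration,
    # before any reference to the entity could be expanded.
    raise EntityDeclarationError("entity declaration not allowed: %r" % name)


def check_no_entities(data):
    """Parse data with Expat, failing on the first entity declaration."""
    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = _reject_entity
    parser.Parse(data, True)  # True marks this as the final chunk


# A shortened Billion Laughs style document: each level multiplies
# the size of the expansion by ten.
BOMB = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE lolz [\n'
    '  <!ENTITY a "lol">\n'
    '  <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;">\n'
    '  <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;">\n'
    ']>\n'
    '<lolz>&c;</lolz>'
)
```

Calling check_no_entities(BOMB) raises on the first declaration, so the nested entities are never expanded, while a document without a DTD parses normally.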

I think the advantage of this patch over the patch in Issue 17239 is that this one should work with the current Expat library (which I understand Python can load externally). The other patch modifies the Expat library itself, so it would only be useful when Python’s internal Expat library is being used (or when the external Expat library has been patched in a similar manner).

The disadvantage of this patch is that it rejects the XML data as soon as an entity is declared, even if the entities are never actually used, or are only used in a non-malicious way. The other patch allows a limited amount of entity expansion.

I would like some feedback on:

* What others think of the basic approach, compared with Christian’s approach in Issue 17239
* Whether reject_entities=True should be switched on by default, which could break compatibility, but could be sensible for most cases of basic XML parsing
* Whether my changes to the examples in the documentation are excessive
* Whether other Element Tree APIs should be modified similarly to XMLParser

So far I have only changed the XMLParser class. The following APIs accept a parser object, so callers can also avoid the vulnerability with them by passing a custom parser object:

* fromstringlist()
* iterparse(), though “parser” is listed as deprecated (by Issue 17741)
* parse() (module-level function)
* XML()
* XMLID()
* ElementTree.parse() (method of ElementTree class)
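
As a rough sketch of what passing a custom parser object to these APIs looks like, the class below stands in for a parser created with the proposed flag. EntityRejectingParser is a hypothetical name, not part of the patch or the standard library; it assumes only that the functions call feed() and close() on the parser they are given, and it drives Expat directly without namespace processing, so it is not a drop-in replacement for XMLParser.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class EntityRejectingParser:
    """Hypothetical stand-in for a parser with entity rejection enabled."""

    def __init__(self):
        self._builder = ET.TreeBuilder()
        self._expat = xml.parsers.expat.ParserCreate()
        # Forward the basic parse events to the tree builder.
        self._expat.StartElementHandler = self._builder.start
        self._expat.EndElementHandler = self._builder.end
        self._expat.CharacterDataHandler = self._builder.data
        # Fail as soon as any entity declaration is seen.
        self._expat.EntityDeclHandler = self._entity_decl

    def _entity_decl(self, name, *details):
        raise ValueError("entity declaration not allowed: %r" % name)

    def feed(self, data):
        self._expat.Parse(data, False)

    def close(self):
        self._expat.Parse(b"", True)  # signal end of input
        return self._builder.close()


# Each call needs a fresh parser object.
root = ET.XML("<doc><item>ok</item></doc>", parser=EntityRejectingParser())
chunks = ["<doc>", "<item>ok</item>", "</doc>"]
root_from_list = ET.fromstringlist(chunks, parser=EntityRejectingParser())

# An entity that is declared but never referenced is still rejected.
HARMLESS = '<!DOCTYPE doc [<!ENTITY unused "x">]><doc>text</doc>'
```

Passing HARMLESS to ET.XML() with this parser raises ValueError even though the entity is never referenced, which is the compatibility cost of rejecting declarations outright.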

These APIs don’t accept a custom parser object, so they are still always vulnerable:

* fromstring()
* XMLPullParser
History
Date                 User           Action  Args
2015-05-19 11:18:00  martin.panter  set     recipients: + martin.panter, christian.heimes
2015-05-19 11:17:59  martin.panter  set     messageid: <1432034279.35.0.728988379999.issue24238@psf.upfronthosting.co.za>
2015-05-19 11:17:59  martin.panter  link    issue24238 messages
2015-05-19 11:17:58  martin.panter  create