classification
Title: XML vulnerabilities in Python
Type: security Stage: patch review
Components: Extension Modules, Library (Lib), XML Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: 17318 24238 Superseder:
Assigned To: Nosy List: Arfrever, barry, benjamin.peterson, christian.heimes, djc, eli.bendersky, ezio.melotti, franck, georg.brandl, jwilk, larry, martin.panter, mcepl, ned.deily, pitrou, rhettinger, rsandwick3, scoder, serhiy.storchaka, steve.dower, vstinner
Priority: critical Keywords: patch

Created on 2013-02-19 15:35 by christian.heimes, last changed 2018-09-18 13:39 by vstinner.

Files
File name Uploaded Description Edit
xmlbomb_20130219.patch christian.heimes, 2013-02-19 15:35 review
xmlbomb_20150518.patch martin.panter, 2015-05-18 02:59 Merged to 3.5 review
Pull Requests
URL Status Linked Edit
PR 9217 open christian.heimes, 2018-09-12 16:26
PR 9265 open christian.heimes, 2018-09-13 17:19
Messages (17)
msg182393 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-02-19 15:35
Experimental fix for XML vulnerabilities against default. It's NOT ready and needs lots of polishing.

https://pypi.python.org/pypi/defusedxml contains explanations of all issues
https://pypi.python.org/pypi/defusedexpat is a standalone version of part of the patches for Python 2.6 to 3.3
msg184285 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-03-16 02:17
Since this has dragged on for quite a while, I'm probably just going to release 2.7.4 with a pointer to defusedxml in the release notes. (docs, though, perhaps)
msg184289 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-03-16 04:19
> Since this has dragged on for quite a while, I'm probably 
> just going to release 2.7.4 with a pointer to defusedxml
> in the release notes. (docs, though, perhaps)

+1
msg184387 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-17 16:46
> Since this has dragged on for quite a while, I'm probably just going to 
> release 2.7.4 with a pointer to defusedxml in the release notes. (docs, 
> though, perhaps)

+1 too.
msg185053 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-03-23 14:45
Not blocking 2.7.4 as discussed on mailing list.
msg243450 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-18 02:59
I did a rough merge with current “default” (3.5 pre-release) branch so that I can have a closer look at this issue; see xmlbomb_20150518.patch for the result. There are some bits with Argument Clinit that need perfecting:

* Unsure how to convert the ElementTree.XMLParser.__init__() signature (varied depending on XML_BOMB_PROTECTION compile-time flag) to Argument Clinic. So I just hard-coded it as if XML_BOMB_PROTECTION is always enabled. Why do we have to have a variable signature in the first place?

* New pyexpat functions need porting to Argument Clinic.
msg243469 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-18 10:44
I started looking at the lower Expat-level changes. Here are some thoughts, in the order that I thought them. :) But the end result is to investigate a different approach to disable entities in existing versions of Expat.

Currently, it looks like max_entity_indirections = 0 is a special value meaning no limit. I think it would be better to use some other value such as None for this, and then 0 could disable all entity expansion (other than pre-defined entities like & &#xNNNN; etc).

What is the benefit of having the indirection limit? I would have thought the entity expansion (character) limit on its own would already be effective at preventing nested expansion attacks like “billion laughs”. Even if the entity expanded to an empty string, all of the intermediate entity references are still included in the character count.

I wonder if it would make more sense to have a total character limit instead, which would include the characters from custom entity expansions as already counted by the patch, but also count characters directly from the XML body. Why would you want to avoid 8 million characters from entity expansion, but allow 8 million characters of plain XML (or gzipped XML)? (I am not an XML expert, so I could be missing something obvious here.)

Now I have discovered that it seems you can build Python to use an external Expat library, which won’t be affected by Christian’s fix (correct me if I am wrong). I think we should find a different solution that will also work with existing external Expat versions. Maybe setting EntityDeclHandler to raise an error would be good enough:

>>> from xml.parsers import expat
>>> bomb = '<!DOCTYPE bomb [\n<!ENTITY a "" >\n<!ENTITY b "' + '&a;' * 1000 + '" >\n<!ENTITY c "' + '&b;' * 1000 + '" >\n]>\n<bomb a="' + '&c;' * 10 + '" />\n'
>>> p = expat.ParserCreate()
>>> p.Parse(bomb, True)  # Noticeable delay (DOS) while parsing
1
>>> p = expat.ParserCreate()
>>> def handler(*so_much_argh):
...     raise ValueError("Entity handling disabled")
... 
>>> p.EntityDeclHandler = handler
>>> p.Parse(bomb, True)  # Instant failure (no DOS)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/build/python/src/Python-3.4.3/Modules/pyexpat.c", line 494, in EntityDecl
  File "<stdin>", line 2, in handler
ValueError: Entity handling disabled

This solution has been suggested and implemented elsewhere:
* https://bugzilla.redhat.com/show_bug.cgi?id=1000109#c1
* http://mail-archives.apache.org/mod_mbox/apr-dev/200906.mbox/%3C20090602162934.GA28360@redhat.com%3E (though I suspect the SetDefaultHandler option there is not sufficient)
msg243581 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-19 11:26
I have opened Issue 24238 with a patch for Element Tree that uses my EntityDeclHandler technique, instead of patching Expat. I would be interested in other people’s thoughts on the approach.
msg324416 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-08-31 12:24
This issue didn't get much attention in 5 years. The XML documentation starts with a big red warning:
https://docs.python.org/dev/library/xml.html

The warning is present in 2.7 and 3.4 as well:
https://docs.python.org/2.7/library/xml.html
https://docs.python.org/3.4/library/xml.html

It seems like XML is getting less popular because of JSON becoming more popular (JSON obviously comes with its own set of security issues). It seems like less core developers care about XML.

I suggest to:

* close bpo-17318 as a duplicate of this issue (bpo-17239)
* close bpo-24238
* close this issue

We just have to accept that core developers have limited availability and that documenting security issues is an acceptable tradeoff. I don't see any value of keeping these 3 issues open.
msg324685 - (view) Author: Matej Cepl (mcepl) * Date: 2018-09-06 11:46
> I suggest to:
> 
> * close bpo-17318 as a duplicate of this issue (bpo-17239)
> * close bpo-24238
> * close this issue

+1 from me.
msg325562 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-09-17 18:31
Ned - I don't think this is necessarily a release blocker, as we've been shipping it for a long time, but it would be nice if we can hold 3.7.1rc1 just long enough to get it in (provided Christian jumps in and says he'll get the last minor concerns on the PRs wrapped up very soon)
msg325573 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-09-17 20:33
We discussed this last week at the sprint.  Christian, it would be great if you could get this merged for 3.7 and possibly 3.6 in the next 24 hours.
msg325586 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2018-09-17 21:55
The external entity patch is ready, but the billion laughs fix need more time. I'm working with an upstream developer on a proper fix.
msg325590 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-09-17 22:27
Any reason to not take the current patch for our vendored copy and give it some exposure at least on platforms that rely on it (maybe just Windows)? I don't see any reason to wait on another group to "release" it when we need to manually apply the update to our own repo anyway.

Platforms using system libexpat that hasn't been patched have obviously decided not to patch it themselves :)
msg325595 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-17 23:11
> Any reason to not take the current patch for our vendored copy and give it some exposure at least on platforms that rely on it (maybe just Windows)? I don't see any reason to wait on another group to "release" it when we need to manually apply the update to our own repo anyway.

My policy is upstream fix: first, get a change merged upstream.

If we start with a downstream patch:

* only Windows and macOS will get the fix
* upstream may require changes making the change incompatible, for example change the default limits
* I would prefer to keep Modules/expat/ as close as possible to the upstream

Python is vulnerable for years, it's not like there is an urgency to fix it.
msg325610 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-09-18 04:08
There's also the view that it'll be easier to justify upstreaming a patch if it's been released and tested in a separate app. We require that all the time for Python patches, so why should we expect other projects to be different?

We're totally entitled to only release it for those platforms, because we are responsible for libexpat on those (we could vendor it for all of them? Or switch to platform-supported libraries for macOS and Windows?)

Who normally updates the vendored libexpat? I'd rather let them make the call on how far to diverge from upstream, since it'll be up to them to roll the changes forward or revert them in favour of upstream. I doubt different defaults will be an issue, especially since they aren't configurable anyway.
msg325642 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-18 13:39
> Who normally updates the vendored libexpat?

I made the 3 latest libexpat updates, and each of them was painful :-)

My notes on vendored libraries:
https://pythondev.readthedocs.io/cpython.html#vendored-external-libraries

I wrote a tool to get the version of all vendored libraries, and a script to updated libexpat.
History
Date User Action Args
2018-09-18 13:39:19vstinnersetmessages: + msg325642
2018-09-18 04:08:25steve.dowersetmessages: + msg325610
2018-09-17 23:11:11vstinnersetmessages: + msg325595
2018-09-17 22:27:15steve.dowersetmessages: + msg325590
2018-09-17 21:55:12christian.heimessetnosy: + christian.heimes
messages: + msg325586
2018-09-17 20:33:45ned.deilysetmessages: + msg325573
2018-09-17 18:31:11steve.dowersetnosy: + steve.dower, ned.deily

messages: + msg325562
versions: + Python 3.6, Python 3.7, Python 3.8, - Python 2.6, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
2018-09-13 17:19:07christian.heimessetpull_requests: + pull_request8697
2018-09-12 16:26:27christian.heimessetstage: needs patch -> patch review
pull_requests: + pull_request8649
2018-09-06 11:46:56mceplsetmessages: + msg324685
2018-08-31 12:24:35vstinnersetnosy: + vstinner
messages: + msg324416
2018-03-05 00:46:11mceplsetnosy: + mcepl
2016-06-12 12:14:14christian.heimessetnosy: - christian.heimes
2016-06-12 12:11:24martin.pantersetdependencies: + xml.sax and xml.dom fetch DTDs by default, Avoid entity expansion attacks in Element Tree
2015-05-24 08:36:04scodersetnosy: + scoder
2015-05-19 11:26:52martin.pantersetmessages: + msg243581
2015-05-18 10:44:13martin.pantersetmessages: + msg243469
2015-05-18 03:00:00martin.pantersetfiles: + xmlbomb_20150518.patch

messages: + msg243450
versions: + Python 3.5
2015-01-14 23:49:33jwilksetnosy: + jwilk
2015-01-12 00:11:24martin.pantersetnosy: + martin.panter
2013-03-25 16:59:30rsandwick3setnosy: + rsandwick3
2013-03-23 14:45:10benjamin.petersonsetpriority: release blocker -> critical

messages: + msg185053
2013-03-17 16:46:08pitrousetnosy: + pitrou
messages: + msg184387
2013-03-16 04:19:02rhettingersetnosy: + rhettinger
messages: + msg184289
2013-03-16 02:17:49benjamin.petersonsetmessages: + msg184285
2013-02-22 23:40:35Arfreversetnosy: + Arfrever
2013-02-20 10:21:07djcsetnosy: + djc
2013-02-19 19:49:24serhiy.storchakasetnosy: + serhiy.storchaka
2013-02-19 17:26:29francksetnosy: + franck
2013-02-19 15:37:13ezio.melottisetnosy: + ezio.melotti, eli.bendersky
2013-02-19 15:35:41christian.heimescreate