classification
Title: Integer overflow in zipimport.c
Type: compile error Stage: resolved
Components: Extension Modules Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: benjamin.peterson, brett.cannon, christian.heimes, eric.snow, gregory.p.smith, martin.panter, python-dev, serhiy.storchaka, superluser, vstinner
Priority: low Keywords: needs review, patch

Created on 2013-12-04 09:34 by christian.heimes, last changed 2016-01-29 17:14 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
zipimport_header_offset.patch christian.heimes, 2013-12-04 09:34 review
zipimport_int_overflow.patch vstinner, 2013-12-08 11:10 review
zipimport_int_overflow_2.patch serhiy.storchaka, 2015-11-08 17:45 review
zipimport_int_overflow_3.patch serhiy.storchaka, 2016-01-22 11:49 review
zipimport_int_overflow_4.patch serhiy.storchaka, 2016-01-23 10:12 review
Messages (17)
msg205209 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-12-04 09:34
MSVC complains about "conversion from 'Py_ssize_t' to 'long', possible loss of data" in zipimport.c. header_offset is a Py_ssize_t but fseek() only takes a long. On 64bit Windows Py_ssize_t is a 64bit data type but long is still a 32bit data type.

It's safe to assume that the header will be smaller than 2 GB for the next couple of years. :)
msg205210 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-04 09:57
read_directory() uses fseek() and ftell() which don't support offset larger than LONG_MAX (2 GB on 32-bit system).  I don't know if it's an issue. What happens if the file is longer?

"header_offset += arc_offset;" can overflow or not? This instuction looks weird.

    header_position = ftell(fp);
    ...
    header_offset = get_long((unsigned char *)endof_central_dir + 16);
    arc_offset = header_position - header_offset - header_size;
    header_offset += arc_offset;

If I computed correctly, the final line can be replaced with:

    arc_offset = header_position - header_offset - header_size;
    header_offset = header_position - header_size;

(It is weird to reuse header_position for two different values, a new variable may be added.)

Instead of checking that "header_offset > LONG_MAX", it may be safer to check that:

 - header_size >= 0
 - header_offset >= 0
 - header_offset + header_size <= LONG_MAX ---> header_offset <= LONG_MAX - header_size
 - arc_offset >= 0 ---> header_position >= header_offset + header_size
 - header_offset > 0 ---> header_position >= header_size

If all these values must be positive according to ZIP format, get_long() may be replaced with get_ulong() to simplify these checks.
msg205211 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-04 10:00
See also zipfile.py which is probably more correct than zipimport.c: zipfile uses for example "L" format for struct.unpack (*unsigned* long) to decode header fields.
msg205380 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-06 15:28
Yes, these fields are unsingned.
msg205504 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-12-08 02:08
zipimport.c makes no attempt to support zip files larger than 2GiB or zip64 files.
msg205544 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-08 11:10
Here is a work-in-progress patch.

PyMarshal_ReadShortFromFile() and PyMarshal_ReadLongFromFile() are still wrong: new Unsigned version should be added to marshal.c. I don't know if a C cast to unsigned is enough because long can be larger than 32-bit (ex: on Linux 64-bit):
#if SIZEOF_LONG > 4
        /* Sign extension for 64-bit machines */
        x |= -(x & 0x80000000L);
#endif

I didn't test my patch. Anyone interested to finish the patch?
msg205597 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-12-08 19:29
comments added to the patch.
msg231272 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-17 08:27
Ping.
msg254348 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-08 17:27
Here is revised patch. It addresses Gregory's comments, uses properly integer types and converters for all values, and adds additional checks for integer overflows and ZIP file validity. As a side effect the performance can be increased due to less memory allocations and IO operations.
msg254351 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-08 17:45
Sorry, the patch contained parts of the advanced patch that will be submitted in separate issue.
msg258797 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-22 11:49
Synchronized with current sources.
msg258858 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-23 10:12
Updated patch addresses Victor's comments and adds (mandatory now) parenthesis. Thank you Victor.
msg258862 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-23 11:54
Serhiy Storchaka added the comment:
> Updated patch addresses Victor's comments and adds (mandatory now) parenthesis. Thank you Victor.

Do  you mean braces {...}?

The new patch looks good to me, thanks for taking all my comments in account ;-)
msg259153 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-01-28 19:33
New changeset 687be1cbe587 by Serhiy Storchaka in branch '3.5':
Issue #19883: Fixed possible integer overflows in zipimport.
https://hg.python.org/cpython/rev/687be1cbe587

New changeset f4631dc56ecf by Serhiy Storchaka in branch 'default':
Issue #19883: Fixed possible integer overflows in zipimport.
https://hg.python.org/cpython/rev/f4631dc56ecf

New changeset cebcd2fd3e1f by Serhiy Storchaka in branch '2.7':
Issue #19883: Fixed possible integer overflows in zipimport.
https://hg.python.org/cpython/rev/cebcd2fd3e1f
msg259165 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-28 22:24
This seems to be causing an infinite loop in 2.7, in test_cmd_line_script.CmdLineTest.test_module_in_package_in_zipfile():

test_module_in_package_in_zipfile (test.test_cmd_line_script.CmdLineTest) ... ^C
Test suite interrupted by signal SIGINT.
1 test omitted:
    test_cmd_line_script
[Exit 1]
[vadmium@localhost cpython]$ sudo strace -p 11412 2>&1 | head
Process 11412 attached - interrupt to quit
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
lseek(3, 1024, SEEK_SET)                = 1024
[vadmium@localhost cpython]$ ls -l /proc/11412/fdtotal 0
lr-x------ 1 vadmium users 64 Jan 29 22:23 0 -> pipe:[1249749]
l-wx------ 1 vadmium users 64 Jan 29 22:23 1 -> pipe:[1249750]
l-wx------ 1 vadmium users 64 Jan 29 22:23 2 -> pipe:[1249750]
lr-x------ 1 vadmium users 64 Jan 29 22:23 3 -> /tmp/tmpHMiuOm/test_zip.zip (deleted)
[vadmium@localhost cpython]$ kill 11412
msg259168 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-01-28 22:38
New changeset 82ee3c24bb86 by Serhiy Storchaka in branch '2.7':
Fixed an infinite loop in zipimport caused by cebcd2fd3e1f (issue #19883).
https://hg.python.org/cpython/rev/82ee3c24bb86
msg259169 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-28 22:39
Thank you Martin.
History
Date User Action Args
2016-01-29 17:14:48serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-01-28 22:39:08serhiy.storchakasetmessages: + msg259169
2016-01-28 22:38:13python-devsetmessages: + msg259168
2016-01-28 22:24:48martin.pantersetnosy: + martin.panter
messages: + msg259165
2016-01-28 19:33:33python-devsetnosy: + python-dev
messages: + msg259153
2016-01-23 11:54:04vstinnersetmessages: + msg258862
2016-01-23 10:12:50serhiy.storchakasetfiles: + zipimport_int_overflow_4.patch
assignee: serhiy.storchaka
messages: + msg258858
2016-01-22 11:50:00serhiy.storchakasetfiles: + zipimport_int_overflow_3.patch

components: + Extension Modules
versions: - Python 3.4
keywords: + needs review
nosy: + benjamin.peterson

messages: + msg258797
2015-11-08 17:45:44serhiy.storchakasetfiles: + zipimport_int_overflow_2.patch

messages: + msg254351
2015-11-08 17:44:20serhiy.storchakasetfiles: - zipimport_int_overflow_2.patch
2015-11-08 17:42:55serhiy.storchakasetfiles: + zipimport_int_overflow_2.patch
2015-11-08 17:42:22serhiy.storchakasetfiles: - zipimport_int_overflow_2.patch
2015-11-08 17:27:52serhiy.storchakasetfiles: + zipimport_int_overflow_2.patch

messages: + msg254348
2015-08-05 15:53:39eric.snowsetversions: + Python 3.6
2015-08-05 15:53:08eric.snowsetnosy: + eric.snow, superluser
2014-11-17 08:27:58serhiy.storchakasetmessages: + msg231272
2014-08-18 10:00:47serhiy.storchakasetversions: + Python 2.7, Python 3.5, - Python 3.3
2013-12-08 19:29:52gregory.p.smithsetmessages: + msg205597
2013-12-08 11:10:28vstinnersetfiles: + zipimport_int_overflow.patch

messages: + msg205544
2013-12-08 02:08:04gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg205504
2013-12-06 15:28:30serhiy.storchakasetmessages: + msg205380
2013-12-06 15:18:18brett.cannonsetassignee: brett.cannon -> (no value)
2013-12-04 10:21:05serhiy.storchakasetnosy: + serhiy.storchaka
2013-12-04 10:00:20vstinnersetmessages: + msg205211
2013-12-04 09:57:49vstinnersetnosy: + vstinner

messages: + msg205210
title: overflow in zipexport.c -> Integer overflow in zipimport.c
2013-12-04 09:34:05christian.heimescreate