Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer overflow in zipimport.c #64082

Closed
tiran opened this issue Dec 4, 2013 · 17 comments
Closed

Integer overflow in zipimport.c #64082

tiran opened this issue Dec 4, 2013 · 17 comments
Assignees
Labels
build The build process and cross-build extension-modules C modules in the Modules dir

Comments

@tiran
Copy link
Member

tiran commented Dec 4, 2013

BPO 19883
Nosy @brettcannon, @gpshead, @vstinner, @tiran, @benjaminp, @ericsnowcurrently, @vadmium, @serhiy-storchaka
Files
  • zipimport_header_offset.patch
  • zipimport_int_overflow.patch
  • zipimport_int_overflow_2.patch
  • zipimport_int_overflow_3.patch
  • zipimport_int_overflow_4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-01-29.17:14:48.329>
    created_at = <Date 2013-12-04.09:34:05.578>
    labels = ['extension-modules', 'build']
    title = 'Integer overflow in zipimport.c'
    updated_at = <Date 2016-01-29.17:14:48.328>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2016-01-29.17:14:48.328>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-01-29.17:14:48.329>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules']
    creation = <Date 2013-12-04.09:34:05.578>
    creator = 'christian.heimes'
    dependencies = []
    files = ['32970', '33044', '40982', '41687', '41697']
    hgrepos = []
    issue_num = 19883
    keywords = ['patch', 'needs review']
    message_count = 17.0
    messages = ['205209', '205210', '205211', '205380', '205504', '205544', '205597', '231272', '254348', '254351', '258797', '258858', '258862', '259153', '259165', '259168', '259169']
    nosy_count = 10.0
    nosy_names = ['brett.cannon', 'gregory.p.smith', 'vstinner', 'christian.heimes', 'benjamin.peterson', 'python-dev', 'eric.snow', 'martin.panter', 'serhiy.storchaka', 'superluser']
    pr_nums = []
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'compile error'
    url = 'https://bugs.python.org/issue19883'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @tiran
    Copy link
    Member Author

    tiran commented Dec 4, 2013

    MSVC complains about "conversion from 'Py_ssize_t' to 'long', possible loss of data" in zipimport.c. header_offset is a Py_ssize_t but fseek() only takes a long. On 64bit Windows Py_ssize_t is a 64bit data type but long is still a 32bit data type.

    It's safe to assume that the header will be smaller than 2 GB for the next couple of years. :)

    @tiran tiran added the build The build process and cross-build label Dec 4, 2013
    @vstinner
    Copy link
    Member

    vstinner commented Dec 4, 2013

    read_directory() uses fseek() and ftell() which don't support offset larger than LONG_MAX (2 GB on 32-bit system). I don't know if it's an issue. What happens if the file is longer?

    "header_offset += arc_offset;" can overflow or not? This instuction looks weird.

        header_position = ftell(fp);
        ...
        header_offset = get_long((unsigned char *)endof_central_dir + 16);
        arc_offset = header_position - header_offset - header_size;
        header_offset += arc_offset;

    If I computed correctly, the final line can be replaced with:

        arc_offset = header_position - header_offset - header_size;
        header_offset = header_position - header_size;

    (It is weird to reuse header_position for two different values, a new variable may be added.)

    Instead of checking that "header_offset > LONG_MAX", it may be safer to check that:

    • header_size >= 0
    • header_offset >= 0
    • header_offset + header_size <= LONG_MAX ---> header_offset <= LONG_MAX - header_size
    • arc_offset >= 0 ---> header_position >= header_offset + header_size
    • header_offset > 0 ---> header_position >= header_size

    If all these values must be positive according to ZIP format, get_long() may be replaced with get_ulong() to simplify these checks.

    @vstinner vstinner changed the title overflow in zipexport.c Integer overflow in zipimport.c Dec 4, 2013
    @vstinner
    Copy link
    Member

    vstinner commented Dec 4, 2013

    See also zipfile.py which is probably more correct than zipimport.c: zipfile uses for example "L" format for struct.unpack (unsigned long) to decode header fields.

    @brettcannon brettcannon removed their assignment Dec 6, 2013
    @serhiy-storchaka
    Copy link
    Member

    Yes, these fields are unsingned.

    @gpshead
    Copy link
    Member

    gpshead commented Dec 8, 2013

    zipimport.c makes no attempt to support zip files larger than 2GiB or zip64 files.

    @vstinner
    Copy link
    Member

    vstinner commented Dec 8, 2013

    Here is a work-in-progress patch.

    PyMarshal_ReadShortFromFile() and PyMarshal_ReadLongFromFile() are still wrong: new Unsigned version should be added to marshal.c. I don't know if a C cast to unsigned is enough because long can be larger than 32-bit (ex: on Linux 64-bit):
    #if SIZEOF_LONG > 4
            /* Sign extension for 64-bit machines */
            x |= -(x & 0x80000000L);
    #endif

    I didn't test my patch. Anyone interested to finish the patch?

    @gpshead
    Copy link
    Member

    gpshead commented Dec 8, 2013

    comments added to the patch.

    @serhiy-storchaka
    Copy link
    Member

    Ping.

    @serhiy-storchaka
    Copy link
    Member

    Here is revised patch. It addresses Gregory's comments, uses properly integer types and converters for all values, and adds additional checks for integer overflows and ZIP file validity. As a side effect the performance can be increased due to less memory allocations and IO operations.

    @serhiy-storchaka
    Copy link
    Member

    Sorry, the patch contained parts of the advanced patch that will be submitted in separate issue.

    @serhiy-storchaka
    Copy link
    Member

    Synchronized with current sources.

    @serhiy-storchaka serhiy-storchaka added the extension-modules C modules in the Modules dir label Jan 22, 2016
    @serhiy-storchaka
    Copy link
    Member

    Updated patch addresses Victor's comments and adds (mandatory now) parenthesis. Thank you Victor.

    @serhiy-storchaka serhiy-storchaka self-assigned this Jan 23, 2016
    @vstinner
    Copy link
    Member

    Serhiy Storchaka added the comment:

    Updated patch addresses Victor's comments and adds (mandatory now) parenthesis. Thank you Victor.

    Do you mean braces {...}?

    The new patch looks good to me, thanks for taking all my comments in account ;-)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 28, 2016

    New changeset 687be1cbe587 by Serhiy Storchaka in branch '3.5':
    Issue bpo-19883: Fixed possible integer overflows in zipimport.
    https://hg.python.org/cpython/rev/687be1cbe587

    New changeset f4631dc56ecf by Serhiy Storchaka in branch 'default':
    Issue bpo-19883: Fixed possible integer overflows in zipimport.
    https://hg.python.org/cpython/rev/f4631dc56ecf

    New changeset cebcd2fd3e1f by Serhiy Storchaka in branch '2.7':
    Issue bpo-19883: Fixed possible integer overflows in zipimport.
    https://hg.python.org/cpython/rev/cebcd2fd3e1f

    @vadmium
    Copy link
    Member

    vadmium commented Jan 28, 2016

    This seems to be causing an infinite loop in 2.7, in test_cmd_line_script.CmdLineTest.test_module_in_package_in_zipfile():

    test_module_in_package_in_zipfile (test.test_cmd_line_script.CmdLineTest) ... ^C
    Test suite interrupted by signal SIGINT.
    1 test omitted:
    test_cmd_line_script
    [Exit 1]
    [vadmium@localhost cpython]$ sudo strace -p 11412 2>&1 | head
    Process 11412 attached - interrupt to quit
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    lseek(3, 1024, SEEK_SET) = 1024
    [vadmium@localhost cpython]$ ls -l /proc/11412/fdtotal 0
    lr-x------ 1 vadmium users 64 Jan 29 22:23 0 -> pipe:[1249749]
    l-wx------ 1 vadmium users 64 Jan 29 22:23 1 -> pipe:[1249750]
    l-wx------ 1 vadmium users 64 Jan 29 22:23 2 -> pipe:[1249750]
    lr-x------ 1 vadmium users 64 Jan 29 22:23 3 -> /tmp/tmpHMiuOm/test_zip.zip (deleted)
    [vadmium@localhost cpython]$ kill 11412

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 28, 2016

    New changeset 82ee3c24bb86 by Serhiy Storchaka in branch '2.7':
    Fixed an infinite loop in zipimport caused by cebcd2fd3e1f (issue bpo-19883).
    https://hg.python.org/cpython/rev/82ee3c24bb86

    @serhiy-storchaka
    Copy link
    Member

    Thank you Martin.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    build The build process and cross-build extension-modules C modules in the Modules dir
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants