Within zipfile, use of zlib.crc32 raises OverflowError at argument-parsing time on large strings #67495

Closed
DannyYoo mannequin opened this issue Jan 23, 2015 · 5 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@DannyYoo

DannyYoo mannequin commented Jan 23, 2015

BPO 23306
Nosy @gpshead, @vadmium, @serhiy-storchaka
Dependencies
  • bpo-27130: zlib: OverflowError while trying to compress 2^32 bytes or more
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2016-08-07.16:32:30.895>
    created_at = <Date 2015-01-23.23:49:26.771>
    labels = ['type-bug', 'library']
    title = 'Within zipfile, use of zlib.crc32 raises OverflowError at argument-parsing time on large strings'
    updated_at = <Date 2016-08-07.16:32:30.893>
    user = 'https://bugs.python.org/DannyYoo'

    bugs.python.org fields:

    activity = <Date 2016-08-07.16:32:30.893>
    actor = 'gregory.p.smith'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-08-07.16:32:30.895>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2015-01-23.23:49:26.771>
    creator = 'Danny.Yoo'
    dependencies = ['27130']
    files = []
    hgrepos = []
    issue_num = 23306
    keywords = []
    message_count = 5.0
    messages = ['234587', '234588', '234589', '266467', '272123']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'nadeem.vawda', 'martin.panter', 'serhiy.storchaka', 'Danny.Yoo']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23306'
    versions = ['Python 2.7']

    @DannyYoo

    DannyYoo mannequin commented Jan 23, 2015

    Reproduction steps:

    ---

    $ python2.7 -c "import zlib;zlib.crc32('a'*(1<<31))"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    OverflowError: size does not fit in an int

    We ran into this bug in zlib.crc32 when using zipfile.writestr() with a very large string; as soon as zipfile tried to write the crc checksum, it raised this error.

    Python 3 does not appear to suffer from this bug.

    @DannyYoo DannyYoo mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 23, 2015
    @gpshead

    gpshead commented Jan 23, 2015

    This bug prevents zipfile's writestr() from writing large data (longer than INT_MAX bytes) to a 64-bit zip file.

    The cause is the zlib.crc32 function, which, as written, cannot accept input whose length does not fit in a C int.

    https://hg.python.org/cpython/file/94ec4d8cf104/Modules/zlibmodule.c#l964

    Python 3 has updated this to call the zlib crc32 function multiple times in this situation:

    https://hg.python.org/cpython/file/93888975606b/Modules/zlibmodule.c#l1210

    So the fix exists; we just need to backport it to 2.7.
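
For illustration, the chunked approach described above might look like this at the Python level. This is a minimal sketch, not the C code in zlibmodule.c: `chunked_crc32` and `CHUNK_SIZE` are hypothetical names, and the chunk size is an arbitrary assumption. It relies only on the documented second argument of `zlib.crc32`, which takes a running checksum.

```python
import zlib

# Hypothetical chunk size for illustration; not taken from the C source.
CHUNK_SIZE = 1 << 20


def chunked_crc32(data, chunk_size=CHUNK_SIZE):
    """Return the CRC-32 of data, feeding zlib.crc32 one slice at a time."""
    crc = 0
    for start in range(0, len(data), chunk_size):
        # Pass the running checksum back in so each call continues
        # where the previous one stopped.
        crc = zlib.crc32(data[start:start + chunk_size], crc)
    # Mask to an unsigned 32-bit value (Python 2 can return a signed result).
    return crc & 0xffffffff
```

Because every call sees at most `chunk_size` bytes, no single call is handed a length that overflows a C int, and the result should match a one-shot `zlib.crc32(data) & 0xffffffff` on interpreters where the one-shot call works.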

    @DannyYoo

    DannyYoo mannequin commented Jan 24, 2015

    Unfortunately, fixing just zlib.crc32 isn't quite enough for our purposes. We will still see OverflowError in zipfile if compression is selected.

    Demonstration code:

    ############################################
    import zipfile
    
    ## Possible workaround: monkey-patch crc32 from binascii?!
    import binascii
    zipfile.crc32 = binascii.crc32
    
    content = 'a'*(1<<31)
    filename = '/tmp/zip_test.zip'
    
    zf = zipfile.ZipFile(filename, "w",
                         compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True)
    zf.writestr('big', content)
    zf.close()
    
    zf = zipfile.ZipFile(filename, "r", allowZip64=True)
    print zf.open('big').read() == content
    #############################################

    This will raise the following error under Python 2.7.6:

    #############################################
    $ python zip_test.py
    Traceback (most recent call last):
      File "zip_test.py", line 13, in <module>
        zf.writestr('big', content)
      File "/usr/lib/python2.7/zipfile.py", line 1228, in writestr
        bytes = co.compress(bytes) + co.flush()
    OverflowError: size does not fit in an int
    #############################################

    If we use compression=zipfile.ZIP_STORED, we don't see this error, but it kind of misses a major point of using zipfile.
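
The traceback points at a single `co.compress(bytes)` call on the whole payload. As a rough sketch of how that one call could be avoided, the compressor object can be fed in slices. This is not the fix that went into CPython, and `deflate_in_chunks` and its `chunk_size` default are hypothetical; the `-15` window-bits value (raw deflate, no zlib header) is an assumption about what zipfile expects.

```python
import zlib


def deflate_in_chunks(data, chunk_size=1 << 20):
    """Raw-deflate data without passing it to compress() in one call."""
    # Negative wbits selects a raw deflate stream (assumed to match zipfile).
    co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    parts = []
    for start in range(0, len(data), chunk_size):
        # Each call sees at most chunk_size bytes.
        parts.append(co.compress(data[start:start + chunk_size]))
    # flush() emits whatever the compressor is still buffering.
    parts.append(co.flush())
    return b"".join(parts)
```

A round trip such as `zlib.decompress(deflate_in_chunks(payload), -15) == payload` is a quick way to check the output on a small payload before trying it on multi-gigabyte data.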

    @DannyYoo DannyYoo mannequin changed the title from "zlib.crc32 raises OverflowError at argument-parsing time on large strings" to "Within zipfile, use of zlib.crc32 raises OverflowError at argument-parsing time on large strings" Jan 24, 2015
    @vadmium

    vadmium commented May 27, 2016

    Apparently crc32() was fixed in Python 3 via bpo-10276.

    See also bpo-27130 about 64-bit support more generally in zlib.

    @gpshead

    gpshead commented Aug 7, 2016

    This appears to have been fixed by the recent commits for bpo-27130, at least by https://hg.python.org/cpython/rev/2192edcfea02.

    greg:cpython/build27$ ./python -c "import zlib;zlib.crc32('a'*(1<<31))"
    greg:cpython/build27$ ./python ../zipfile_2gb_test.py
    True
    greg:cpython/build27$ ls -al /tmp/zip_test.zip
    -rw-rw-r-- 1 greg greg 2087407 Aug 7 09:28 /tmp/zip_test.zip
    greg:~/sandbox/python/cpython/build27$ unzip -t /tmp/zip_test.zip
    Archive: /tmp/zip_test.zip
    testing: big OK
    No errors detected in compressed data of /tmp/zip_test.zip.
    greg:cpython/build27$ unzip -l /tmp/zip_test.zip
    Archive: /tmp/zip_test.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
    2147483648  2016-08-07 09:27   big
    ---------                     -------
    2147483648                     1 file
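
The same two checks (`unzip -t` and `unzip -l`) can be approximated from Python with `ZipFile.testzip()` and `ZipFile.infolist()`. The sketch below is only illustrative: `check_archive` and `demo` are hypothetical helpers, and the demo writes a small payload to a temporary file instead of the 2 GiB string used in this issue.

```python
import os
import tempfile
import zipfile


def check_archive(path):
    """Return (first_bad_member_or_None, {member_name: uncompressed_size})."""
    with zipfile.ZipFile(path, "r", allowZip64=True) as zf:
        bad = zf.testzip()  # re-reads every member and verifies its CRC
        sizes = {info.filename: info.file_size for info in zf.infolist()}
    return bad, sizes


def demo(payload=b"a" * (1 << 16)):
    """Write payload as member 'big' to a temporary archive and check it."""
    fd, path = tempfile.mkstemp(suffix=".zip")
    os.close(fd)
    try:
        with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED,
                             allowZip64=True) as zf:
            zf.writestr("big", payload)
        return check_archive(path)
    finally:
        os.remove(path)
```

On a fixed interpreter, passing `'a' * (1 << 31)` as the payload should report no bad member and a size of 2147483648 for `big`, though that run needs several gigabytes of memory.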

    @gpshead gpshead closed this as completed Aug 7, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022