Title: zlib crc32/adler32 buffer length truncation (64-bit)
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: loewis, nadeem.vawda, pitrou, python-dev, sdaoden
Priority: normal Keywords: patch

Created on 2010-11-01 09:46 by nadeem.vawda, last changed 2016-05-27 02:41 by martin.panter. This issue is now closed.

File name Uploaded Description
zlib-checksum-truncation.diff nadeem.vawda, 2010-11-01 09:46 Calculate checksums incrementally for large buffers.
zlib-v2.diff nadeem.vawda, 2011-01-26 23:46 Updated fix, with test.
Messages (8)
msg120114 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2010-11-01 09:46
zlib.crc32() and zlib.adler32() in Modules/zlibmodule.c don't handle buffers of >=4GB correctly. The length of a Py_buffer is of type Py_ssize_t, while the C zlib functions take length as an unsigned integer. This means that on a 64-bit build, the buffer length gets silently truncated to 32 bits, which results in incorrect output for large inputs.
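
The truncation described above can be illustrated with plain arithmetic. This is only a sketch, assuming the C unsigned int is 32 bits wide; `truncated_length` is a hypothetical helper and not part of zlibmodule.c.

```python
# Assumption: the C "unsigned int" taken by the zlib functions is 32 bits wide.
UINT_MAX = 0xFFFFFFFF


def truncated_length(n):
    """Return the length left after narrowing a 64-bit length to 32 bits."""
    return n & UINT_MAX


# A buffer of exactly 4 GiB appears to have length 0 ...
print(truncated_length(4 * 1024**3))
# ... and 4 GiB + 10 bytes appears to have length 10.
print(truncated_length(4 * 1024**3 + 10))
# Lengths below 4 GiB pass through unchanged.
print(truncated_length(1000))
```

In other words, only the first `n mod 2**32` bytes would be fed to the checksum, which is why the result is wrong rather than an error.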

Attached is a patch that fixes this by computing the checksum incrementally, using small-enough chunks of the buffer.
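
The incremental approach can be sketched at the Python level using the running-value argument that zlib.crc32() and zlib.adler32() already accept. This is not the attached C patch; `chunked_checksum` and its `chunk_size` default are hypothetical names chosen for illustration, and the input is assumed to be a bytes-like object with one-byte items.

```python
import zlib


def chunked_checksum(func, data, start, chunk_size=1 << 30):
    """Feed `data` to `func` in pieces no larger than `chunk_size` bytes.

    `func` is zlib.crc32 or zlib.adler32; `start` is that function's
    initial value (0 for crc32, 1 for adler32).
    """
    view = memoryview(data)  # slicing a memoryview does not copy the data
    value = start
    for offset in range(0, len(view), chunk_size):
        value = func(view[offset:offset + chunk_size], value)
    return value


data = bytes(range(256)) * 100
# A tiny chunk size is used here only to exercise the loop.
print(chunked_checksum(zlib.crc32, data, 0, chunk_size=7) == zlib.crc32(data))
print(chunked_checksum(zlib.adler32, data, 1, chunk_size=7) == zlib.adler32(data))
```

As long as each chunk is shorter than 4 GiB, no single call sees a length that can be truncated.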

A better fix might be to have Modules/zlib/crc32.c use 64-bit lengths. I tried this, but I couldn't get it to work. It seems that if the system already has zlib installed, Python will link against the existing version instead of compiling its own.

Testing this might be a bit tricky. Allocating a 4+GB regular buffer isn't practical. Using a memory-mapped file would work, but I'm not sure having a unit test create a multi-gigabyte file is a great thing to do.
msg120116 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-11-01 09:51
I find your approach fine; there isn't a need (IMO) to have the underlying functions change.
msg127159 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-01-26 23:46
Here is an updated patch, which corrects a typo in the previous patch and adds a test to test_zlib.

The test uses a memory-mapped sparse file, so it gets skipped on systems without mmap. The alternative would be to allocate a 4+GB buffer of ordinary memory, which causes heavy swapping on my machine (4GB of RAM). The test also gets skipped on 32-bit builds, where the address space is too small for this bug to arise.

I'm not sure whether the test can count on the created file actually being sparse, so I had the test require the 'largefile' resource, to be on the safe side.
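
The sparse-file idea can be sketched as follows. This is not the test from zlib-v2.diff; `crc32_of_zero_file` is a hypothetical helper, and the size used below is deliberately small so the sketch runs anywhere. Reproducing the bug would need a size above 4GB on a 64-bit build, and whether the file is actually sparse depends on the filesystem.

```python
import mmap
import tempfile
import zlib


def crc32_of_zero_file(size):
    """Return the CRC-32 of a `size`-byte all-zero file, read through mmap.

    `size` must be at least 1. Seeking past the end and writing one byte
    lets filesystems that support sparse files avoid allocating the hole.
    """
    with tempfile.TemporaryFile() as f:
        f.seek(size - 1)
        f.write(b"\0")
        f.flush()
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            # The mmap object supports the buffer protocol, so it can be
            # passed to zlib.crc32() directly without copying into bytes.
            return zlib.crc32(m)


# Small size for illustration only.
print(crc32_of_zero_file(4096) == zlib.crc32(bytes(4096)))
```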
msg128959 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-21 12:36
Patch looks good to me.
msg128977 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-21 18:28
Thank you for the patch! Committed in r88460 (3.3) and r88461 (3.2).
2.7 would need more surgery for this to be fixed; see #8651 and #8650.
msg135036 - Author: Roundup Robot (python-dev) Date: 2011-05-03 13:19
New changeset f43213129ba8 by Victor Stinner in branch '2.7':
Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
msg135042 - Author: Roundup Robot (python-dev) Date: 2011-05-03 15:25
New changeset dd58f8072216 by Victor Stinner in branch '2.7':
Issue #10276: Fix test_zlib, m may be undefined in the finally block
msg135072 - Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-05-03 21:17
Changeset a0681e7a6ded fixes this bug for 2.7 - too-large buffers cause
an OverflowError during argument parsing, so there is no possibility of
truncation happening.
Date User Action Args
2016-05-27 02:41:08 martin.panter set messages: - msg135065
2016-05-27 02:40:38 martin.panter set messages: - msg135066
2011-05-03 21:17:34 nadeem.vawda set status: open -> closed; messages: + msg135072
2011-05-03 19:52:08 sdaoden set messages: + msg135066
2011-05-03 19:51:17 sdaoden set nosy: + sdaoden; messages: + msg135065
2011-05-03 15:25:46 python-dev set messages: + msg135042
2011-05-03 13:19:35 python-dev set status: pending -> open; messages: + msg135036; nosy: + python-dev
2011-02-21 18:28:06 pitrou set status: open -> pending; versions: - Python 3.2, Python 3.3; messages: + msg128977; resolution: fixed; stage: patch review -> resolved
2011-02-21 12:36:18 pitrou set versions: + Python 3.3, - Python 3.1; nosy: + pitrou; messages: + msg128959; stage: patch review
2011-01-26 23:46:35 nadeem.vawda set files: + zlib-v2.diff; messages: + msg127159
2010-11-01 17:24:34 eric.araujo set components: + Extension Modules, - Library (Lib); versions: - Python 2.6, Python 2.5, Python 3.3
2010-11-01 09:51:44 loewis set nosy: + loewis; messages: + msg120116
2010-11-01 09:46:03 nadeem.vawda create