
Author nadeem.vawda
Recipients nadeem.vawda
Date 2010-11-01.09:46:01
Message-id <1288604765.69.0.684600327936.issue10276@psf.upfronthosting.co.za>
Content
zlib.crc32() and zlib.adler32() in Modules/zlibmodule.c don't handle buffers of 4GB or larger correctly. The length of a Py_buffer is of type Py_ssize_t, while the C zlib functions take the length as an unsigned int (zlib's uInt). This means that on a 64-bit build, where Py_ssize_t is 64 bits wide and unsigned int is typically 32, the buffer length gets silently truncated to 32 bits, which results in incorrect output for large inputs.
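
To make the truncation concrete, here is a small arithmetic sketch in Python. It only models the conversion and does not call zlib; the helper name truncated_length is hypothetical, and it assumes the C unsigned int is 32 bits wide.

```python
# Assumption: the C "unsigned int" length parameter is 32 bits wide.
UINT_RANGE = 1 << 32


def truncated_length(length):
    """Return the length a 32-bit unsigned parameter would actually receive."""
    return length % UINT_RANGE


four_gib = 4 * 1024**3

# A buffer of exactly 4 GiB would be seen as empty.
print(truncated_length(four_gib))      # 0

# A buffer of 4 GiB + 5 bytes would have only its first 5 bytes checksummed.
print(truncated_length(four_gib + 5))  # 5
```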

Attached is a patch that fixes this by computing the checksum incrementally, using small-enough chunks of the buffer.
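
The idea behind the patch can be sketched at the Python level. This is not the attached patch, which changes the C code; it is a minimal illustration of the same approach, relying on the fact that zlib.crc32() accepts a running checksum as its second argument. The function name chunked_crc32 and the 1 GiB default chunk size are assumptions chosen for illustration.

```python
import zlib


def chunked_crc32(data, chunk_size=1 << 30, value=0):
    """Compute a CRC-32 by feeding the buffer to zlib in bounded chunks.

    chunk_size is an illustrative choice; any size that fits in a
    32-bit unsigned int would avoid the truncation.
    """
    view = memoryview(data).cast("B")  # byte-wise view, no copy
    crc = value
    for offset in range(0, len(view), chunk_size):
        # Each call continues from the checksum of the previous chunks.
        crc = zlib.crc32(view[offset:offset + chunk_size], crc)
    return crc
```

For inputs small enough to be handled correctly, the result should match a single call, for example chunked_crc32(data, chunk_size=777) == zlib.crc32(data). The same pattern would apply to zlib.adler32(), whose starting value is 1 rather than 0.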

A better fix might be to have Modules/zlib/crc32.c use 64-bit lengths. I tried this, but I couldn't get it to work. It seems that if the system already has zlib installed, Python will link against the existing version instead of compiling its own.

Testing this might be a bit tricky. Allocating a 4+GB regular buffer isn't practical. Using a memory-mapped file would work, but I'm not sure having a unit test create a multi-gigabyte file is a great thing to do.
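
One possible shape for such a test, as a rough sketch only: map a temporary file whose size is set with truncate(), then compare the whole-buffer checksum against one computed chunk by chunk. The helper name compare_crc32 is hypothetical. Whether the file is stored sparsely, and so whether a 4+GB size is cheap, depends on the filesystem and is not guaranteed here.

```python
import mmap
import tempfile
import zlib


def compare_crc32(size, chunk_size=1 << 20):
    """Return (whole, chunked) CRC-32 values for a zero-filled mapping.

    size must be positive. On an affected build, the two values would be
    expected to differ once size reaches 4 GiB.
    """
    with tempfile.TemporaryFile() as f:
        f.truncate(size)  # extends the file with zero bytes
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as mm:
            whole = zlib.crc32(mm)
            chunked = 0
            for offset in range(0, size, chunk_size):
                # Slicing an mmap copies out a bytes object of bounded size.
                chunked = zlib.crc32(mm[offset:offset + chunk_size], chunked)
    return whole, chunked
```

A real regression test would call this with a size above 4 GiB, for example compare_crc32(4 * 1024**3 + 5), and assert that both values agree; with a small size it only demonstrates the mechanics.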
History
Date User Action Args
2010-11-01 09:46:05 nadeem.vawda set recipients: + nadeem.vawda
2010-11-01 09:46:05 nadeem.vawda set messageid: <1288604765.69.0.684600327936.issue10276@psf.upfronthosting.co.za>
2010-11-01 09:46:03 nadeem.vawda link issue10276 messages
2010-11-01 09:46:02 nadeem.vawda create