classification
Title: zlib causes a SystemError when decompressing a chunk >1GB
Type: crash Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Sreejith.Madhavan, ddmitriev, gregory.p.smith, pitrou
Priority: high Keywords: patch

Created on 2010-04-29 15:35 by ddmitriev, last changed 2010-05-07 17:12 by pitrou. This issue is now closed.

Files
File name Uploaded Description Edit
zlibmodule.patch Sreejith.Madhavan, 2010-05-07 09:04 length and old_length in PyZlib_objcompress and PyZlib_objdecompress should be Py_ssize_t not int
zlibbigbuf.patch pitrou, 2010-05-07 15:55
Messages (10)
msg104522 - (view) Author: Denis Dmitriev (ddmitriev) Date: 2010-04-29 15:35
There's a bug in python's zlibmodule.c that makes it raise SystemError whenever it tries to decompress a chunk larger than 1GB is size. Here's an example of this in action:

dmitriev@...:~/moo/zlib_bug> cat zlib_bug.py 
import zlib

def test_zlib(size_mb):
    print "testing zlib with a %dMB object" % size_mb
    c = zlib.compressobj(1)
    sm = c.compress(' ' * (size_mb*1024*1024)) + c.flush()
    d = zlib.decompressobj()
    dm = d.decompress(sm) + d.flush()

test_zlib(1024)
test_zlib(1025)

dmitriev@...:~/moo/zlib_bug> python2.6 zlib_bug.py 
testing zlib with a 1024MB object
testing zlib with a 1025MB object
Traceback (most recent call last):
  File "zlib_bug.py", line 11, in <module>
    test_zlib(1025)
  File "zlib_bug.py", line 8, in test_zlib
    dm = d.decompress(sm) + d.flush()
SystemError: Objects/stringobject.c:4271: bad argument to internal function

dmitriev@...:~/moo/zlib_bug>

A similar issue was reported in issue1372; however, either this one is different, or the issue still exists in all versions of Python 2.6 that I tested, on both Solaris and Mac OS X. These are all 64-bit builds and have no problem manipulating multi-GB structures, so it's not an out-of-memory condition:

dmitriev@...:~/moo/zlib_bug> python2.6   
Python 2.6.1 (r261:67515, Nov 18 2009, 12:21:47) 
[GCC 4.3.3] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> len(' ' * (6000*1024*1024))
6291456000
>>>
msg104537 - (view) Author: Denis Dmitriev (ddmitriev) Date: 2010-04-29 16:50
Alright, I think this is caused by the following:

static PyObject *
PyZlib_objdecompress(compobject *self, PyObject *args)
{
    int err, inplen, old_length, length = DEFAULTALLOC;
    int max_length = 0;

The problem is that inplen, length, old_length, and max_length are all ints, whereas they should be Py_ssize_t. I think replacing them should make the bug go away. (I can't test it right now though because I'm having trouble compiling zlibmodule on my current machine)
msg105186 - (view) Author: Sreejith Madhavan (Sreejith.Madhavan) Date: 2010-05-07 09:04
Attached a patch for Modules/zlibmodule.c that worked for me.  Tested with python (2.6.1 and 2.6.5) 64bit builds on Solaris (amd64 and sparc) and RHEL5.2 amd64.
msg105193 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-07 11:45
Confirmed. It even segfaults under 3.x.
msg105204 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-07 15:55
Here is a patch fixing the issue, and a similar one in compressobj.compress(). It also adds tests which are only enabled with the bigmem option.
msg105208 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-05-07 16:26
in the unittests there is some use of 'compress' and 'decompress'
mixed with 'zlib.decompress'.  which one is right (i'm only looking at
the diff so i can't see if they were imported from zlib or if the
zlib. is required at the moment).

otherwise, provided the new bigmem tests pass.  looks good to me.
commit and backport through to 2.6.

On Fri, May 7, 2010 at 8:56 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> Here is a patch fixing the issue, and a similar one in compressobj.compress(). It also adds tests which are only enabled with the bigmem option.
>
> ----------
> Added file: http://bugs.python.org/file17245/zlibbigbuf.patch
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue8571>
> _______________________________________
>
msg105210 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-07 16:36
> in the unittests there is some use of 'compress' and 'decompress'
> mixed with 'zlib.decompress'.  which one is right

When testing the decompressor, data needs to be prepared first
(compressed). I didn't bother creating a compressor object in this case.
msg105211 - (view) Author: Denis Dmitriev (ddmitriev) Date: 2010-05-07 16:39
Is there a reason to keep inplen and max_length ints instead of making them Py_ssize_t too? I'm a little worried that keeping them ints will cause a similar problem further down the line.
msg105213 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-07 16:46
> Is there a reason to keep inplen and max_length ints instead of making
> them Py_ssize_t too?

zlibmodule.c isn't 64-bit clean internally. Actually, the zlib itself
uses uInt in its z_stream structure. This should be the target of a
separate issue, and seems it will require more work than simply changing
the type of a variable.
msg105215 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-05-07 17:09
Patch committed in r80926 (trunk), r80927 (2.6), r80928 (py3k) and r80929 (3.1). Thank you!
History
Date User Action Args
2010-05-07 17:12:05pitrousetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2010-05-07 17:09:51pitrousetmessages: + msg105215
2010-05-07 16:46:40pitrousetmessages: + msg105213
2010-05-07 16:39:27ddmitrievsetmessages: + msg105211
2010-05-07 16:36:22pitrousetmessages: + msg105210
2010-05-07 16:26:28gregory.p.smithsetmessages: + msg105208
2010-05-07 15:55:57pitrousetfiles: + zlibbigbuf.patch

messages: + msg105204
2010-05-07 14:38:46pitrousetnosy: + gregory.p.smith
2010-05-07 11:45:24pitrousetversions: + Python 3.2
2010-05-07 11:45:04pitrousetpriority: normal -> high
versions: + Python 3.1, Python 2.7
nosy: + pitrou

messages: + msg105193

stage: patch review
2010-05-07 09:04:38Sreejith.Madhavansetfiles: + zlibmodule.patch

nosy: + Sreejith.Madhavan
messages: + msg105186

keywords: + patch
2010-04-29 16:50:29ddmitrievsetmessages: + msg104537
2010-04-29 15:35:55ddmitrievcreate