Message127637
The issue looks quite clear: cStringIO.write() asserts that the required storage size is less than INT_MAX. Therefore, in all likelihood, the pickle dump is simply larger than 2GB.
Now, the cStringIO structures seem 64-bit safe, so the comparison to INT_MAX seems useless. Also, in non-debug builds the assert is compiled out, yet the Py_ssize_t values are still cast to int before being saved into a Py_ssize_t struct member, which would lead to data corruption and/or crashes.
The INT_MAX asserts were added in r42382 which also made cStringIO internals 64-bit safe. Martin, do you remember why you did this?
The asserts don't even protect against the following crash:
$ ./python -c "from cStringIO import StringIO; b=b'x'*(2**31+1); s=StringIO(); s.write(b)"
Segmentation fault
gdb shows the following stack excerpt:
#0 0x00007ffff724e8c2 in memcpy () from /lib64/libc.so.6
#1 0x00007ffff6b47a46 in O_cwrite (self=<unknown at remote 0xa3f790>, c=0x7fff76b44054 'x' <repeats 200 times>...,
l=-2147483647) at /home/antoine/cpython/27/Modules/cStringIO.c:415
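The negative length in frame #1 is what you would expect if the 2**31+1 byte count were narrowed to a 32-bit signed int somewhere on the way to memcpy. A minimal sketch of that arithmetic, assuming a 32-bit C int (ctypes is used here only to mimic the cast; it is not part of cStringIO):

```python
import ctypes

# Length requested by the reproducer above: 2**31 + 1 bytes.
requested = 2**31 + 1

# ctypes integer types do no overflow checking, so this mimics a C cast
# from a 64-bit Py_ssize_t down to a 32-bit int (assumed width of int).
as_int = ctypes.c_int32(requested).value

print(as_int)  # -2147483647, the value gdb reports for l
```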
Now, onto the problem of reproducing, here's another interesting thing: while some internal structures of cStringIO are 64-bit safe, the Python-facing API isn't:
>>> from cStringIO import StringIO
>>> b = b"x" * (2**32+1)
>>> s = StringIO()
>>> s.write(b)
>>> s.tell()
1
(the module doesn't define PY_SSIZE_T_CLEAN, so string lengths coming from argument parsing are stored in an int)
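The tell() result of 1 is consistent with the length being narrowed to a C int before the module sees it. The sketch below reproduces that narrowing in pure Python for a few sizes; narrow_to_int32 is a hypothetical helper written for illustration, and a 32-bit int is assumed:

```python
def narrow_to_int32(n):
    """Hypothetical helper: reinterpret n as a signed 32-bit value,
    as a C cast to a 32-bit int would."""
    n &= 0xFFFFFFFF                      # keep the low 32 bits
    return n - 2**32 if n >= 2**31 else n  # apply two's complement sign

for size in (2**31 - 1, 2**31 + 1, 2**32 + 1):
    print(size, "->", narrow_to_int32(size))
```

Only the first size survives unchanged: 2**31+1 turns negative, and 2**32+1 collapses to 1, matching the s.tell() output above.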
Date | User | Action | Args
2011-01-31 18:24:57 | pitrou | set | recipients: + pitrou, loewis, belopolsky, vstinner, eric.smith, rybesh
2011-01-31 18:24:57 | pitrou | set | messageid: <1296498297.76.0.227843294079.issue7358@psf.upfronthosting.co.za>
2011-01-31 18:24:57 | pitrou | link | issue7358 messages
2011-01-31 18:24:57 | pitrou | create |