classification
Title: cPickle MemoryError when loading large file (while pickle works)
Type: crash
Stage: resolved
Components: Extension Modules
Versions: Python 2.7

process
Status: closed
Resolution: fixed
Dependencies: 7358
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Arfrever, Ramchandra Apte, benjamin.peterson, flox, jcea, mark.dickinson, neologix, phillies, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: release blocker
Keywords: patch

Created on 2011-12-08 12:52 by phillies, last changed 2013-02-26 14:10 by jcea. This issue is now closed.

Files
pickle_overflow.diff (neologix, 2011-12-17 22:15)
pickle_overflow-1.diff (neologix, 2011-12-17 22:46)
pickle_overflow-2.diff (neologix, 2011-12-22 22:40)
pickle_overflow-4.diff (serhiy.storchaka, 2013-02-11 21:09)
pickle (Arfrever, 2013-02-25 19:57)
Messages (34)
msg149027 - (view) Author: Philipp Lies (phillies) Date: 2011-12-08 12:52
When I try to load a large file (>1GB) cPickle crashes with a MemoryError:
$python test.py
Traceback (most recent call last):
  File "/tmp/test.py", line 8, in <module>
    A2 = cPickle.load(f2)
MemoryError

test.py contains the following code:
import numpy as np
import cPickle
A = np.random.randn(196,240000)
f = open('test.pydat', 'w')
cPickle.dump(A,f)
f.close()
f2 = open('test.pydat', 'rb')
A2 = cPickle.load(f2)

System:
cPickle 1.71
python 2.7.2
Ubuntu 11.10 amd64

Memory is not an issue as a) pickle works nicely and b) my computer has 122GB free RAM
msg149028 - (view) Author: Ramchandra Apte (Ramchandra Apte) * Date: 2011-12-08 12:56
Maybe Ubuntu doesn't think it is safe to allocate the memory.
msg149029 - (view) Author: Philipp Lies (phillies) Date: 2011-12-08 13:02
Well, replace cPickle with pickle and it works. So if there is a memory allocation problem, cPickle should be able to handle it, especially since it should be completely compatible with pickle.
msg149034 - (view) Author: Ramchandra Apte (Ramchandra Apte) * Date: 2011-12-08 13:38
Have you checked the system monitor? After all, cPickle can use more than 1 GB of memory.
msg149035 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-08 13:40
Is there a way to reproduce that doesn't involve numpy?
msg149049 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-08 18:57
> my computer has 122GB free RAM

122, or 1.22?

> Well, replace cPickle by pickle and it works.

cPickle makes some direct calls to malloc()/realloc()/free(), unlike pickle, which uses pymalloc. This could lead to heap fragmentation.

What does the following return?
$ /usr/bin/time -v python test.py
msg149300 - (view) Author: Philipp Lies (phillies) Date: 2011-12-12 13:39
a) it's 122GB free RAM (out of 128GB total RAM)

b) when I convert the numpy array to a list it works. So it seems to be a problem with cPickle and numpy at/from a certain array size

c) $ /usr/bin/time -v python test_np.py 
Traceback (most recent call last):
  File "test_np.py", line 12, in <module>
    A2 = cPickle.load(f2)
MemoryError
Command exited with non-zero status 1
	Command being timed: "python test_np.py"
	User time (seconds): 73.72
	System time (seconds): 4.56
	Percent of CPU this job got: 87%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.52
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 7402448
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 726827
	Voluntary context switches: 41043
	Involuntary context switches: 7793
	Swaps: 0
	File system inputs: 3368
	File system outputs: 2180744
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 1

hth
msg149319 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-12 16:00
@Antoine
Couldn't this be linked to #11564 (pickle not 64-bit ready)? AFAICT this wasn't fixed in 2.7. Basically an integer overflow: malloc() would bail out when asked for a ridiculous size.

@Philipp
I'd be curious to see the last lines of
$ ltrace -e malloc python test_np.py

You'll probably see a call to malloc() return NULL.
msg149326 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-12 16:20
> Couldn't this be linked to #11564 (pickle not 64-bit ready)?

Well, I don't know anything about numpy, but:

>>> 196 * 240000
47040000
>>> 196 * 240000 * 8   # assuming 8 bytes per float
376320000
>>> 2**31
2147483648

So it seems unlikely to be the explanation. After all the report says ">1GB" for the file size, which is still comfortably in the capabilities of a 32-bit module.

Philipp, perhaps you could try to run Python under gdb and try to diagnose where the MemoryError occurs?

Also, posting the disassembly (using pickletools.dis()) of a very small equivalent pickle (e.g. of randn(2,3)) could help us know what is involved.
msg149327 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-12 16:35
Oh, what's more interesting is that it works here (Python 2.7.1 and numpy 1.6.1 under Mageia with 8GB RAM). Looking at a pickle disassembly, the only remarkable thing is the presence of a long binary string (the raw serialization of all IEEE floats), which shouldn't give any problem to cPickle (and indeed doesn't). What is your numpy version, Philipp?
msg149673 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-17 16:21
> So it seems unlikely to be the explanation.

Victor reproduced it on IRC, and it's indeed an overflow.
The problematic code is in readline_file:
"""
        bigger = self->buf_size << 1;
        if (bigger <= 0) {              /* overflow */
            PyErr_NoMemory();
            return -1;
        }
        newbuf = (char *)realloc(self->buf, bigger);
        if (!newbuf)  {
            PyErr_NoMemory();
            return -1;
        }
"""

self->buf_size is an int, which overflows pretty easily.

>>> 196 * 240000
47040000
>>> 196 * 240000 * 8   # assuming 8 bytes per float
376320000
>>> 2**31
2147483648

Hmmm... A byte is 8 bits, which gives:
>>> 196 * 240000 * 8 * 8
3010560000L
>>> 196 * 240000 * 8 * 8 > 2**31
True

Now, if it works on your box, it's probably because the compiler optimized the check away. Since `bigger` is cast to an unsigned 64-bit integer (size_t) when calling realloc(), it happens to work.
Maybe your distro doesn't build Python with -fwrapv.

So, what do you suggest? Should we fix this (Py_ssize_t, overflow check before computation), as in #11564?
msg149674 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-12-17 16:24
> Should we fix this (Py_ssize_t, overflow check before computation), as in #11564?

Yes. Use Py_ssize_t type for the buf_size attribute, and replace "bigger <= 0" (test if an overflow occurred) by "self->buf_size > (PY_SSIZE_T_MAX >> 1)".
msg149676 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-17 16:35
Ah, I see. It's a bit of a pity not to be able to load files > 1GB, especially on a 64-bit build (!). Perhaps cPickle could be made partly 64-bit compatible? Or at least, indeed, do a proper anti-overflow check.
msg149711 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-17 22:15
Here's a patch which should fix this. However, I'm unable to test it.
msg149713 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-17 22:25
I think there's a problem here:

+    self->data = realloc(self->data, self->size * sizeof(PyObject *));
+    if (self->data == NULL)
         goto nomemory;

If realloc() fails, the old data pointer is lost and therefore will never get free()ed.

Same for:

+        self->buf = (char *)realloc(self->buf, self->buf_size);

Here:

-        int *marks;
-        s=self->marks_size+20;
-        if (s <= self->num_marks) s=self->num_marks + 1;
+        size_t alloc;
+        Py_ssize_t *marks;
+
+        /* Use the size_t type to check for overflow. */
+        alloc = ((size_t)self->num_marks << 1) + 20;

It seems you are changing the overallocation algorithm (from additive to multiplicative). I'm not sure it should be in the scope of the patch, although multiplicative overallocation is generally better.
msg149715 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-17 22:46
New version.
msg149757 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-18 14:21
> New version.

Looks good to me.
msg150117 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-12-22 22:40
Here's a new version with a test (untested).
Note that I'm absolutely not sure that the 'memsize' argument to
bigmemtest is correct.
msg150762 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-01-06 19:16
Antoine, could you test the last version (test_pickle and if possible
with the OP testcase)?
I can't test it myself (32-bit machine with 1 GB).
msg150765 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-06 19:33
On Friday, 6 January 2012 at 19:17 +0000, Charles-François Natali wrote:
> Charles-François Natali <neologix@free.fr> added the comment:
> 
> Antoine, could you test the last version (test_pickle and if possible
> with the OP testcase)?
> I can't test it myself (32-bit machine with 1 GB).

Well, first, there are compilation warnings on a 64-bit box:

gcc -pthread -fPIC -fno-strict-aliasing -g -O2 -g -O0 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -I/usr/local/include -I/home/antoine/cpython/27/Include -I/home/antoine/cpython/27 -c /home/antoine/cpython/27/Modules/cPickle.c -o build/temp.linux-x86_64-2.7-pydebug/home/antoine/cpython/27/Modules/cPickle.o
/home/antoine/cpython/27/Modules/cPickle.c: In function ‘put2’:
/home/antoine/cpython/27/Modules/cPickle.c:810:9: warning: format ‘%d’ expects type ‘int’, but argument 4 has type ‘Py_ssize_t’
/home/antoine/cpython/27/Modules/cPickle.c: In function ‘load_mark’:
/home/antoine/cpython/27/Modules/cPickle.c:4610:21: warning: assignment from incompatible pointer type
gcc -pthread -shared build/temp.linux-x86_64-2.7-pydebug/home/antoine/cpython/27/Modules/cPickle.o -L/usr/local/lib -o build/lib.linux-x86_64-2.7-pydebug/cPickle.so

Second, I can't seem to get the test to work with 8GB RAM (approximately
6.5GB free according to "free"). The MemoryError is quite expected for
test_pickle, though, since the code there doesn't try to conserve memory
at all:

test test_pickle failed -- Traceback (most recent call last):
  File "/home/antoine/cpython/27/Lib/test/test_support.py", line 983, in wrapper
    return f(self, maxsize)
  File "/home/antoine/cpython/27/Lib/test/pickletester.py", line 1298, in test_huge_str_32b
    pickled = self.dumps(data, proto)
  File "/home/antoine/cpython/27/Lib/test/test_pickle.py", line 74, in dumps
    return pickle.dumps(arg, proto)
  File "/home/antoine/cpython/27/Lib/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/home/antoine/cpython/27/Lib/pickle.py", line 224, in dump
    self.save(obj)
  File "/home/antoine/cpython/27/Lib/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/antoine/cpython/27/Lib/pickle.py", line 488, in save_string
    self.write(STRING + repr(obj) + '\n')
MemoryError

I would therefore suggest enabling the test only for cPickle.
For test_cpickle the behaviour is different:
- for protocol 0, I get a MemoryError (which may be expected, if the
test truly needs more than 6.5GB, for example because of suboptimal
buffer management)
- for protocol 1 and 2, I get "SystemError: error return without
exception set"
msg178042 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-12-24 09:14
Could someone with a 64-bit box take over this one?
I won't get anywhere with my 32-bit box...
msg178559 - (view) Author: Ramchandra Apte (Ramchandra Apte) * Date: 2012-12-30 08:24
Bump.
@neologix
I have a 64-bit laptop with 2 GB of memory, so I don't think I can. (Though one could use swap.)
msg178564 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-30 10:42
I may tackle this but rare 2.7-only bugs are pretty low on my priorities list.
msg178566 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-12-30 11:29
> I have a 64-bit laptop with 2 GB memory so I don't think I can do so. (though one could use swap)

AFAICT, a binary string a little longer than 1GB should be enough to
reproduce the bug.
Just make sure Python isn't built with '-fwrapv'.
msg180763 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-27 12:32
Charles-François, are you going to finish this before the 2.7.4 RC is released? The patch should be updated to address Antoine's comments.
msg180777 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-01-27 17:41
> Charles-François, are you going to finish this before the 2.7.4 RC is released? The patch should be updated to address Antoine's comments.

No.
I don't have access to a 64-bit box, which makes it difficult to write
(I don't get compiler warnings) and test.

Feel free to pick up the patch and update it!
msg180792 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-27 20:49
I have picked it up.

Actually, the BINSTRING format supports only strings shorter than 2 GiB, so the test should be changed. But there are other bugs which prevent pickling large amounts of data. For one of them I opened issue17054.
msg181750 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-09 18:21
Here is an updated patch. It fixes two bugs found by Antoine (an inappropriate format and a memory error in bigmemtest), fixes resizing of the marks array, and fixes a possible integer overflow in write_other(). A workaround is used to bypass limitations of the cStringIO API.
msg181932 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-11 21:09
The test has been updated too. It no longer tries to write a string larger than 2 GiB (that's impossible); instead it writes many shorter strings with a total size larger than 2 GiB.
msg181973 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-12 19:37
New changeset 680959a3ae2e by Serhiy Storchaka in branch '2.7':
Issue #13555: cPickle now supports files larger than 2 GiB.
http://hg.python.org/cpython/rev/680959a3ae2e
msg182979 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2013-02-25 19:57
Changeset 680959a3ae2e has caused the g-ir-scanner tool from gobject-introspection (which seems to create some pickles and then load them?) to fail with a MemoryError, e.g. during a build of GTK+.
The MemoryError occurs only for a subset of pickles.
I attach a small pickle which allows reproducing this problem.
Download this pickle and save it in /tmp.

Additional preparation:
$ cd /tmp
$ wget -q http://ftp.gnome.org/pub/gnome/sources/gobject-introspection/1.34/gobject-introspection-1.34.2.tar.xz
$ tar -xJf gobject-introspection-1.34.2.tar.xz
$ sed -e '/^with LibtoolImporter/,/^$/d' -i gobject-introspection-1.34.2/giscanner/xmlwriter.py

Behavior before 680959a3ae2e:
$ PYTHONPATH="gobject-introspection-1.34.2" python2.7 -c 'import cPickle; print(cPickle.load(open("pickle")))'
<giscanner.girparser.GIRParser object at 0x7f506299be90>

Behavior after 680959a3ae2e:
$ PYTHONPATH="gobject-introspection-1.34.2" python2.7 -c 'import cPickle; print(cPickle.load(open("pickle")))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
MemoryError
msg183025 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-26 08:08
New changeset f3f23ecdb1c6 by Serhiy Storchaka in branch '2.7':
Issue #13555: Fix an integer overflow check.
http://hg.python.org/cpython/rev/f3f23ecdb1c6
msg183026 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-26 08:18
Thank you for the report. The standard tests do not cover pickling/unpickling to real files.
msg183030 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-26 08:42
I have opened issue17299 for the testing issue.
History
Date | User | Action | Args
2013-02-26 14:10:33 | jcea | set | nosy: + jcea
2013-02-26 08:42:24 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; messages: + msg183030; stage: resolved
2013-02-26 08:18:14 | serhiy.storchaka | set | messages: + msg183026
2013-02-26 08:08:41 | python-dev | set | messages: + msg183025
2013-02-25 19:57:14 | Arfrever | set | status: closed -> open; files: + pickle; nosy: + benjamin.peterson; stage: resolved -> (no value); messages: + msg182979; resolution: fixed -> (no value); priority: normal -> release blocker
2013-02-15 08:26:40 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2013-02-12 19:37:47 | python-dev | set | nosy: + python-dev; messages: + msg181973
2013-02-11 21:09:50 | serhiy.storchaka | set | files: + pickle_overflow-4.diff; messages: + msg181932
2013-02-11 21:07:19 | serhiy.storchaka | set | files: - pickle_overflow-3.diff
2013-02-09 18:21:40 | serhiy.storchaka | set | files: + pickle_overflow-3.diff; assignee: serhiy.storchaka; messages: + msg181750; stage: needs patch -> patch review
2013-01-28 09:04:30 | serhiy.storchaka | set | dependencies: + cStringIO not 64-bit safe, - cStringIO.StringIO aborted when more then INT_MAX bytes written
2013-01-27 20:49:24 | serhiy.storchaka | set | dependencies: + cStringIO.StringIO aborted when more then INT_MAX bytes written; messages: + msg180792
2013-01-27 17:41:53 | neologix | set | messages: + msg180777
2013-01-27 12:32:38 | serhiy.storchaka | set | messages: + msg180763
2012-12-30 11:29:16 | neologix | set | messages: + msg178566
2012-12-30 10:42:36 | pitrou | set | messages: + msg178564
2012-12-30 08:24:03 | Ramchandra Apte | set | messages: + msg178559
2012-12-24 09:14:23 | neologix | set | messages: + msg178042
2012-12-02 10:23:39 | serhiy.storchaka | set | nosy: + serhiy.storchaka; stage: needs patch
2012-07-30 18:21:26 | Arfrever | set | nosy: + Arfrever
2012-01-06 19:33:26 | pitrou | set | messages: + msg150765
2012-01-06 19:16:59 | neologix | set | messages: + msg150762
2011-12-22 22:40:59 | neologix | set | files: + pickle_overflow-2.diff; messages: + msg150117
2011-12-18 14:21:10 | pitrou | set | messages: + msg149757
2011-12-18 11:13:33 | flox | set | nosy: + flox
2011-12-17 22:46:05 | neologix | set | files: + pickle_overflow-1.diff; messages: + msg149715
2011-12-17 22:25:28 | pitrou | set | messages: + msg149713
2011-12-17 22:15:35 | neologix | set | files: + pickle_overflow.diff; keywords: + patch; messages: + msg149711
2011-12-17 16:35:30 | pitrou | set | messages: + msg149676
2011-12-17 16:24:08 | vstinner | set | nosy: + vstinner; messages: + msg149674
2011-12-17 16:21:29 | neologix | set | messages: + msg149673
2011-12-12 16:35:35 | pitrou | set | messages: + msg149327
2011-12-12 16:20:22 | pitrou | set | messages: + msg149326
2011-12-12 16:03:24 | neologix | set | messages: - msg149321
2011-12-12 16:02:21 | neologix | set | messages: + msg149321
2011-12-12 16:00:33 | neologix | set | messages: + msg149319
2011-12-12 13:39:15 | phillies | set | messages: + msg149300
2011-12-08 18:57:44 | neologix | set | nosy: + neologix; messages: + msg149049
2011-12-08 13:40:36 | pitrou | set | nosy: + mark.dickinson
2011-12-08 13:40:28 | pitrou | set | nosy: + pitrou; messages: + msg149035; components: + Extension Modules, - None
2011-12-08 13:38:18 | Ramchandra Apte | set | messages: + msg149034
2011-12-08 13:02:08 | phillies | set | messages: + msg149029
2011-12-08 12:56:55 | Ramchandra Apte | set | nosy: + Ramchandra Apte; messages: + msg149028
2011-12-08 12:52:02 | phillies | create |