Author Philipp.Lies
Recipients Philipp.Lies
Date 2012-07-30.15:43:47
Message-id <1343663028.38.0.42350418133.issue15504@psf.upfronthosting.co.za>
In-reply-to
Content
I just stumbled upon a very serious bug in cPickle: it stores only part of the data passed to it, without raising any warning or error:

# Create a random byte string larger than 8 GB
import os
import cPickle

random_string = os.urandom(int(1.1 * 2**33))
print len(random_string)

# Pickle the string with protocol 2
with open('test.pickle', 'wb') as fout:
    cPickle.dump(random_string, fout, 2)

# Load it back and compare it with the original
with open('test.pickle', 'rb') as fin:
    random_string2 = cPickle.load(fin)
print len(random_string2)
print random_string == random_string2

The loaded string is significantly shorter than the original, meaning that part of the data was lost while storing the string. This is a serious issue. When I use pickle instead, writing fails with
error: 'i' format requires -2147483648 <= number <= 2147483647
so I guess pickle is not able to handle data this large. cPickle should therefore either raise an error as well, or pickle and cPickle should both be patched to handle larger data.
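
The range in the error message quoted above is that of a signed 4-byte integer, which is what the struct format 'i' packs. The sketch below is not taken from the pickle sources; it is written for Python 3 so that it runs on a current interpreter, and the helper names fits_in_int32 and low_32_bits are hypothetical. It shows that the string length from the report cannot be packed with 'i', and what length would be recorded if only the low 32 bits were kept. Whether cPickle truncates in exactly this way is an assumption here.

```python
import struct

def fits_in_int32(length):
    """Return True if `length` can be packed as a signed 4-byte integer."""
    try:
        struct.pack("<i", length)
        return True
    except struct.error:
        # Raised when the value is outside -2**31 .. 2**31 - 1
        return False

def low_32_bits(length):
    """Length that would be recorded if only the low 32 bits were kept.

    This models a silent truncation; it is an assumption, not confirmed
    behaviour of cPickle.
    """
    return length & 0xFFFFFFFF

# Same size expression as in the report above
reported_length = int(1.1 * 2**33)

if __name__ == "__main__":
    print(fits_in_int32(reported_length))
    print(reported_length, low_32_bits(reported_length))
```

If the truncation assumption holds, the shortened length printed by the second line would be consistent with the loaded string being much shorter than the original.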

Code to reproduce error using numpy (that's how I stumbled upon it):
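import numpy as np
import cPickle as pickle

# 1080 x 1920 x 553 float64 values, roughly 9.2 GB
A = np.random.randn(1080, 1920, 553)

# Saving completes without any error
with open('test.pickle', 'wb') as fout:
    pickle.dump(A, fout, 2)

# Loading fails
with open('test.pickle', 'rb') as fin:
    B = pickle.load(fin)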
Here, numpy detects that the amount of data is wrong and throws an error when loading. This is still serious, though, because saving does not raise an error, so the user expects that the data has been stored safely.
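
Since saving gives no indication of failure, one possible safeguard is to reload the file immediately after writing it and compare the result with the original. The following is only a sketch under stated assumptions: it is written for Python 3, dump_verified is a hypothetical helper and not part of pickle or cPickle, and it needs enough memory to hold a second copy of the object.

```python
import operator
import pickle

def dump_verified(obj, path, protocol=2, equal=operator.eq):
    """Pickle `obj` to `path`, reload it, and raise if the round trip differs.

    `equal` is the comparison used for the check; it is a hypothetical
    parameter introduced for this sketch.
    """
    with open(path, "wb") as fout:
        pickle.dump(obj, fout, protocol)
    with open(path, "rb") as fin:
        # Any error raised while loading propagates to the caller
        restored = pickle.load(fin)
    if not equal(obj, restored):
        raise ValueError("pickle round trip did not reproduce the object: %r" % (path,))
    return restored
```

For numpy arrays the default comparison returns an array rather than a single boolean, so a function such as np.array_equal would have to be passed as equal.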

I guess this might be related to http://bugs.python.org/issue13555, which is still open.

Python 2.7.3 on the latest Ubuntu with numpy 1.6.2, 64-bit architecture, 128 GB RAM
History
Date                 User          Action  Args
2012-07-30 15:43:48  Philipp.Lies  set     recipients: + Philipp.Lies
2012-07-30 15:43:48  Philipp.Lies  set     messageid: <1343663028.38.0.42350418133.issue15504@psf.upfronthosting.co.za>
2012-07-30 15:43:47  Philipp.Lies  link    issue15504 messages
2012-07-30 15:43:47  Philipp.Lies  create