Message63915
It would seem that pickling arrays directly exposes the underlying
machine words, making the pickle non-portable to platforms with a
different layout of array elements. The guts of array.__reduce__ look
like this:
    if (array->ob_size > 0) {
        result = Py_BuildValue("O(cs#)O",
                               array->ob_type,
                               array->ob_descr->typecode,
                               array->ob_item,
                               array->ob_size * array->ob_descr->itemsize,
                               dict);
    }
The byte string that is pickled is directly created from the array's
contents. Unpickling calls array_new which in turn calls
array_fromstring, which ends up memcpying the string data to the new array.
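The round trip described above can be imitated from Python. This is only a sketch, not the C implementation: it assumes a Python version where array has tobytes() and frombytes(), and the helper names raw_state and rebuild are made up for illustration.

```python
from array import array

def raw_state(arr):
    # Analogous to the tuple built by the C code: the typecode plus the
    # raw bytes of the item buffer, with no layout information attached.
    return arr.typecode, arr.tobytes()

def rebuild(typecode, raw):
    # Analogous to array_new calling array_fromstring: the bytes are
    # copied into the new array as-is and read in the local layout.
    out = array(typecode)
    out.frombytes(raw)
    return out

original = array("i", [1, 2, 3])
typecode, raw = raw_state(original)
restored = rebuild(typecode, raw)
```

On the machine that produced the bytes the round trip is lossless, which is why the problem is easy to miss.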
As far as I can tell, array pickles created on one platform cannot be
unpickled on a platform with different endianness (in the case of integer
arrays), wchar_t size (in the case of unicode arrays) or floating-point
representation (rare in practice, but possible). If pickles are
supposed to be platform-independent, this should be fixed.
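The endianness case can be simulated on a single machine. The sketch below assumes that a reader with the opposite byte order sees the same bytes with each item reversed, which array.byteswap() reproduces; it does not touch pickle itself.

```python
from array import array

values = array("H", [1, 2, 258])
raw = values.tobytes()          # what would end up in the pickle

# Simulated reader on an opposite-endian platform: the bytes are copied
# unchanged, then every item is read with its bytes in reverse order.
foreign = array("H")
foreign.frombytes(raw)
foreign.byteswap()

# Same bytes, different values.
mismatch = list(foreign) != list(values)
```

Nothing in the pickled data tells the reader that a swap is needed, so the wrong values are loaded silently.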
Maybe the "typecode" argument passed to the constructor could be
augmented to include information about the element layout, such as
endianness and floating-point format. Or we could simply punt and pickle
the array as a list of the Python objects it contains...?
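Both ideas can be sketched in Python. This is a hypothetical illustration, not a proposed patch: the function names and the state layout are invented here, and the first variant handles only byte order and item size, not differing floating-point formats.

```python
import sys
from array import array

def dump_portable(arr):
    # Hypothetical state: raw bytes tagged with the writer's layout.
    return (arr.typecode, arr.itemsize, sys.byteorder, arr.tobytes())

def load_portable(state):
    typecode, itemsize, byteorder, raw = state
    out = array(typecode)
    if out.itemsize != itemsize:
        # Item sizes differ (e.g. wchar_t or long); raw bytes are unusable.
        raise ValueError("item size differs between writer and reader")
    out.frombytes(raw)
    if byteorder != sys.byteorder:
        out.byteswap()          # convert to the reader's byte order
    return out

def dump_as_list(arr):
    # The "punt" option: store plain Python objects instead of raw bytes.
    return (arr.typecode, arr.tolist())

def load_from_list(state):
    typecode, items = state
    return array(typecode, items)
```

The tagged form keeps the pickle compact but has to know about every layout difference, while the list form is larger and slower but leaves portability to the pickling of ordinary ints, floats and strings.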
Date | User | Action | Args
2008-03-18 14:38:12 | hniksic | set | recipients: + hniksic
2008-03-18 14:38:12 | hniksic | set | messageid: <1205851091.96.0.575523128038.issue2389@psf.upfronthosting.co.za>
2008-03-18 14:38:08 | hniksic | link | issue2389 messages
2008-03-18 14:38:07 | hniksic | create |