
Author pobrien
Date 2002-12-18.15:23:33
Content
Please see the previous bug report for background 
details: 
 
[ python-Bugs-654866 ] pickle and cPickle not 
equivalent 
 
The basic problem is that in certain rather rare (but not 
*that* rare either, imo) situations cPickle produces a 
pickle file that cannot reconstruct the same class 
instance identities as the pickle file produced by pickle. 
In addition, this appears to affect the pickling of 
old-style class instances, but not new-style class 
instances. So changing a class from old-style to 
new-style can change what gets pickled by cPickle. 
 
Here are items from the current docs that are candidates 
for clarification. My commentary appears within brackets: 
 
The data streams the two modules produce are 
guaranteed to be interchangeable. [Depends on the 
definition of interchangeable. I'd like to see something 
about reconstructability as well.] 
 
The pickle module keeps track of the objects it has 
already serialized, so that later references to the same 
object won't be serialized again. [Not true for cPickle, 
which apparently only keeps track of an object if its 
refcount is greater than 1. Note that this statement has 
never been true of instances of simple object types, like 
int and string.] 
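
To make the expectation concrete, here is a minimal sketch of the memo behavior the docs describe. It assumes Python 3's pickle module (cPickle exists only in Python 2), so it does not reproduce the cPickle refcount behavior reported here. The list named shared is an illustrative stand-in for a class instance.

```python
import pickle

# A mutable list stands in for a class instance (illustrative assumption).
shared = ["payload"]
container = [shared, shared]  # two references to one object

restored = pickle.loads(pickle.dumps(container))

# Within a single dump, the memo preserves the shared identity.
identity_preserved = restored[0] is restored[1]

# Equal but distinct objects are not merged into one.
distinct = pickle.loads(pickle.dumps([["payload"], ["payload"]]))
stays_distinct = distinct[0] is not distinct[1]

print(identity_preserved, stays_distinct)
```

An application that depends on object identity relies on the first result holding for every object it pickles, which is the guarantee in question for cPickle.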
 
If the same object is pickled by multiple dump() calls, the 
load() will all yield references to the same object. 
[Depends on whether you consider an object pickled as 
part of a container, and later pickled independently, as 
pickling the same object. If you expect that load() will 
yield references to the same object (and why wouldn't 
you? That expectation is why this disturbs me), then you 
need to be aware of the situations in which cPickle 
decides not to keep track.] 
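
The container-then-independent case can be sketched as follows. This assumes Python 3's pickle.Pickler and pickle.Unpickler, each reused across calls so that its memo persists. The names item and container are hypothetical, and the sketch shows the documented behavior rather than the cPickle discrepancy.

```python
import io
import pickle

item = ["payload"]             # stand-in for a class instance
container = {"member": item}   # the item is first pickled inside a container

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
pickler.dump(container)        # first dump() call: item pickled as a member
pickler.dump(item)             # second dump() call: item pickled on its own

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
loaded_container = unpickler.load()
loaded_item = unpickler.load()

# With one Pickler and one Unpickler, both loads refer to the same object.
same_object = loaded_container["member"] is loaded_item

# Independent dumps()/loads() calls share no memo, so identity is lost.
copy_a = pickle.loads(pickle.dumps(container))
copy_b = pickle.loads(pickle.dumps(item))
identity_lost = copy_a["member"] is not copy_b

print(same_object, identity_lost)
```

The report's concern is that cPickle could behave like the second half even when a single pickler is used, if it decides not to memoize the object.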
 
The pickle data stream produced by pickle and cPickle 
are identical, so it is possible to use pickle and cPickle 
interchangeably with existing pickles. [This statement is 
part true and part false. The pickle data streams are not 
identical - they are often cosmetically different and 
occasionally substantially different. And this isn't really 
the reason the data streams are interchangeable. That 
has to do with the structure of the data stream, not the 
content of the data stream. We need to make it clear that 
the content isn't guaranteed to be identical, even though 
the structure of existing pickles can be read by either 
pickle or cPickle.] 
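
The distinction between stream content and stream structure can be illustrated with a small sketch. Since cPickle is not available in Python 3, two pickle protocols are used here as a stand-in for two producers; that substitution is an assumption made for illustration only.

```python
import pickle

data = {"name": "example", "values": [1, 2, 3]}

# Two different encodings of the same object (protocols stand in for
# the two modules, which is an illustrative assumption).
stream_text = pickle.dumps(data, protocol=0)
stream_binary = pickle.dumps(data, protocol=2)

# The byte content differs...
streams_differ = stream_text != stream_binary

# ...yet either stream can be read back into an equal object.
both_round_trip = pickle.loads(stream_text) == data == pickle.loads(stream_binary)

print(streams_differ, both_round_trip)
```

Readable by both sides does not imply byte-identical output, and byte-level comparison of pickles is therefore not a safe equality test.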
 
There are additional minor differences in API between 
cPickle and pickle, however for most applications, they 
are interchangeable. More documentation is provided in 
the pickle module documentation, which includes a list of 
the documented differences. [Applications that care 
about object identity will want to be aware of the 
limitation of the cPickle memoization capability and how it 
differs from the pickle version.] 
 
Footnotes 
 
... pickles (footnote 3.13) 
Since the pickle data format is actually a tiny 
stack-oriented programming language, and some freedom 
is taken in the encodings of certain objects, it is possible 
that the two modules produce different data streams for 
the same input objects. However it is guaranteed that 
they will always be able to read each other's data 
streams. [Again, readability is not good enough for 
applications that expect object reconstruction 
equivalence.] 
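
One way to check whether a given stream actually records a shared reference is to inspect its opcodes. The sketch below assumes Python 3's pickletools module and protocol 2. The list named shared is again a hypothetical stand-in for a class instance.

```python
import pickle
import pickletools

shared = ["payload"]
stream = pickle.dumps([shared, shared], protocol=2)

# genops() yields (opcode, argument, position) for each instruction
# in the stack-oriented pickle program.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(stream)]

# The second reference to `shared` is encoded as a memo lookup rather
# than as a second copy of the list.
uses_memo_lookup = "BINGET" in opcode_names

print(opcode_names)
```

A stream in which the pickler skipped memoizing the object would contain no such lookup for it, and that is the difference that affects reconstruction.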
 
History
Date User Action Args
2007-08-23 14:09:27 admin link issue655802 messages
2007-08-23 14:09:27 admin create