Title: cPickle.dumps change after for
Type:
Stage:
Components: Library (Lib)
Versions:
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: arigo, marcoberi, mwh, tim.peters
Priority: normal
Keywords:

Created on 2002-11-19 11:42 by marcoberi, last changed 2004-01-12 12:30 by arigo. This issue is now closed.

Messages (9)
msg13394 - (view) Author: Marco Beri (marcoberi) Date: 2002-11-19 11:42
try this program:

import cPickle
print cPickle.dumps(a)
for x in a:
    pass

It prints:

Why on earth does dumps return a different string?
And after just a for loop with a pass in it...
msg13395 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 12:00
>>> a=([{}])
>>> cPickle.dumps(a)

you're only printing the first pickle...
msg13396 - (view) Author: Marco Beri (marcoberi) Date: 2002-11-19 12:07
I forgot to print the second dumps but the bug remains:
Before the for:
After the for:
msg13397 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 12:56
You're right, sorry.
msg13398 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 13:03
OK, I think I know what's happening.

Look at this:

>>> a = {}       
>>> cPickle.dumps(a)
>>> cPickle.dumps({})

the only difference is in the number of references to the
object getting pickled.  I don't know why that matters, but
it seems to.

Why does this matter to you?
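
A related effect is easy to reproduce with the plain pickle module. This is only a sketch, and it is not necessarily the same mechanism as the cPickle reference-count behavior described above: it assumes nothing beyond pickle.dumps and pickle.loads, and the variable names are made up for illustration. Two objects that compare equal can still produce different pickle strings when one of them contains a shared reference, yet both round-trip correctly.

```python
import pickle

inner = [1, 2]
shared = [inner, inner]      # the same list object appears twice
copies = [[1, 2], [1, 2]]    # two distinct but equal list objects

# The two containers compare equal...
equal_objects = (shared == copies)

# ...but the pickler remembers objects it has already written (by identity),
# so the second occurrence of `inner` is written as a back-reference.
equal_pickles = (pickle.dumps(shared) == pickle.dumps(copies))

# Both pickles still unpickle to equal objects, which is all pickle promises.
shared_back = pickle.loads(pickle.dumps(shared))
copies_back = pickle.loads(pickle.dumps(copies))
round_trip_ok = (shared_back == shared) and (copies_back == copies)

print(equal_objects, equal_pickles, round_trip_ok)
```

Here equal_objects and round_trip_ok are true while equal_pickles is false, so the pickle string says something about object identity, not just about value.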
msg13399 - (view) Author: Marco Beri (marcoberi) Date: 2002-11-19 13:56
I would like to use hash(cPickle.dumps(a)) to get a unique
hash even for objects such as dicts, which aren't hashable.
msg13400 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2002-11-19 14:35
Well, that's asking for something you're not promised.  It
takes some imagination to categorize this as a bug, and I'm
not going to put any of my time into fixing it.

It seems using pickle in place of cPickle *may* work, fwiw.
msg13401 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-11-19 16:19
Closed as Not-A-Bug.  The internals of pickle strings aren't 
guaranteed, just that "they work" when unpickled again, and 
these do.  If you want a hash code for a dict, don't dare use 
pickle for this either, even if it appears "to work":  it doesn't.  
The order in which dict keys are enumerated isn't defined 
either, and can and does vary across releases, and even 
across program runs.

So a reliable hash code for a dict needs to be independent of 
key iteration order, and no version of pickle spends time 
trying to force that issue (waste of time -- it isn't needed for 
pickling or unpickling).
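
One way to get a hash that does not depend on key iteration order is sketched below. The helper name dict_hash is hypothetical, and the sketch assumes a flat dict whose keys and values are all hashable; nested dicts or lists would need to be converted to a canonical form first.

```python
import pickle

def dict_hash(d):
    """Hypothetical helper: order-independent hash of a flat dict.

    Assumes every key and every value is itself hashable.
    A frozenset has no ordering, so iteration order cannot leak in.
    """
    return hash(frozenset(d.items()))

d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}    # same contents, built in a different order

same_dict = (d1 == d2)
same_hash = (dict_hash(d1) == dict_hash(d2))

# For comparison only: the pickle strings are free to differ, because
# pickling writes the items in whatever order the dict iterates them.
same_pickle = (pickle.dumps(d1) == pickle.dumps(d2))

print(same_dict, same_hash, same_pickle)
```

The two dicts compare equal and get the same dict_hash value. Whether the pickle strings match depends on the interpreter's dict ordering; on versions where dicts preserve insertion order they differ here. Note also that hash() of strings may vary from one interpreter run to the next, so a value like this is only suitable for use within a single process.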
msg13402 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-01-12 12:30
Someone else ran into the problem of unexpectedly varying
cPickle strings:

Maybe this behavior should be documented.
Date                 User       Action  Args
2002-11-19 11:42:57  marcoberi  create