Title: cPickle can misread data type
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.6
Status: closed Resolution: wont fix
Assigned To: alexandre.vassalotti Nosy List: ac.james, alexandre.vassalotti, georg.brandl, mark.dickinson
Priority: normal Keywords: patch

Created on 2009-06-15 23:19 by ac.james, last changed 2022-04-11 14:56 by admin. This issue is now closed.

File name Uploaded Description
ac.james, 2009-06-22 21:01 test case
issue6290.patch mark.dickinson, 2010-07-12 15:35
Messages (9)
msg89418 - (view) Author: Alex James (ac.james) Date: 2009-06-15 23:19
When using cPickle to pickle and unpickle an object instance whose
__dict__ contains a dictionary of NumPy arrays (on a windows32 system),
some of the array elements are read back with the wrong type, raising
ValueError: could not convert string to float.

On UNIX platforms this error does not occur, and the data is read out in
the correct type every time.
Forcing the caller to use the pickle module instead removed the issue.

Statements in the documentation about the imprecision of cPickle (such as
issue1536, 655802), or its deprecation (now I can't find where that was
mentioned), would help.  By contrast, the current state of the documentation
implies that cPickle is better overall, and thus should be used preferentially.
msg89421 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2009-06-16 01:24
Could you provide a test case? The behaviour you are describing sounds
like a bug in cPickle.
msg89612 - (view) Author: Alex James (ac.james) Date: 2009-06-22 21:01
I have now pinpointed the error to a list of infinities (see attached).
When using pickle to read the cPickle'd data we get a different, and
more informative, error:
ValueError: invalid literal for float(): 1.#INF
msg89619 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2009-06-23 01:24
Thanks for the test case. I will take a look.
msg89620 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2009-06-23 01:52
Could you give me the output of this?

  import cPickle
  print repr(cPickle.dumps([float('+inf'), float('-inf'), float('nan')]))
  print [float('+inf'), float('-inf'), float('nan')]

By the way, are you sure this bug occurs on Python 2.6? Python 2.6 uses
a platform-independent float-to-string converter (i.e.,
PyOS_double_to_string), which shouldn't output stuff like "1.#INF".

Also, can you verify that the bug does not occur with pickle protocol 1
and over?
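
For readers wondering why the protocol matters here: protocol 0 writes a float as decimal text, so it depends on the float-to-string conversion, while protocols 1 and over write the raw 8-byte IEEE 754 value. The following is only a rough sketch of that difference, using the pure-Python pickle module and an arbitrary sample value (1.5) chosen for illustration; it is not taken from the attached test case.

```python
import pickle
import struct

# Arbitrary sample value, chosen only for illustration.
value = 1.5

# Protocol 0 is the text protocol: the float is stored as the FLOAT
# opcode 'F' followed by its decimal string and a newline.
text_pickle = pickle.dumps(value, protocol=0)

# Protocol 2 is a binary protocol: the float is stored as the BINFLOAT
# opcode 'G' followed by 8 bytes, big-endian IEEE 754.
binary_pickle = pickle.dumps(value, protocol=2)

print(repr(text_pickle))
print(repr(binary_pickle))

# The binary form never goes through a string conversion.
print(struct.pack(">d", value) in binary_pickle)
```

Because the binary form carries the bytes of the double directly, a platform-specific spelling of infinity such as "1.#INF" cannot appear in it.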
msg89638 - (view) Author: Alex James (ac.james) Date: 2009-06-23 18:40
Your test prints:
[inf, -inf, nan]

My installation is Python 2.6.2 as currently distributed.

Specifying protocol 1 or 2 does circumvent the error.  
Thank you.
msg110094 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-12 15:20
I think this can be closed.  It should no longer be a problem in Python 2.7 or Python 3.x, and there's a workaround (use protocol 1 or 2) for Python 2.6.

In theory, it *could* still be fixed for Python 2.6.6, but changing the pickle output in a bugfix release seems like it might be dangerous.
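
The workaround mentioned above (use protocol 1 or 2) might look roughly like the sketch below. The helper name roundtrip is hypothetical and only for illustration, and the list is the one from the earlier request for output rather than the reporter's NumPy data.

```python
import math
import pickle

def roundtrip(values, protocol=2):
    """Pickle and unpickle values with a binary protocol (hypothetical helper)."""
    return pickle.loads(pickle.dumps(values, protocol))

# Same special values as in the snippet earlier in this thread.
special = [float("inf"), float("-inf"), float("nan")]
restored = roundtrip(special)

# nan never compares equal to itself, so check it with math.isnan.
print(restored[0] == float("inf"))
print(restored[1] == float("-inf"))
print(math.isnan(restored[2]))
```

Passing the protocol explicitly at every dumps/dump call site is the point; the default protocol in Python 2 is the text protocol 0.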
msg110097 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-12 15:35
However, here's a patch.  I haven't tested it on Windows.
msg112952 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2010-08-05 07:23
It is too late for 2.6.6 now that it is released.
