classification
Title: pickle can pickle the wrong function
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 2.7
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: alexandre.vassalotti
Nosy List: alexandre.vassalotti, amaury.forgeotdarc, barry, belopolsky, benjamin.peterson, mark.dickinson, nnorwitz, pitrou, r.david.murray, tim.peters
Priority: normal
Keywords:

Created on 2008-08-24 07:01 by nnorwitz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (21)
msg71830 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-24 07:01
test_pickletools fails sporadically on at least two platforms I've seen.

http://www.python.org/dev/buildbot/all/x86%20gentoo%20trunk/builds/4120/step-test/0
http://www.python.org/dev/buildbot/all/ppc%20Debian%20unstable%20trunk/builds/1908/step-test/0

File "/home/buildslave/python-trunk/trunk.norwitz-x86/build/Lib/pickletools.py", line ?, in pickletools.__test__.disassembler_test
Failed example:
    dis(pickle.dumps(random.random, 0))
Expected:
        0: c    GLOBAL     'random random'
       15: p    PUT        0
       18: .    STOP
    highest protocol among opcodes = 0
Got:
        0: c    GLOBAL     'bsddb.test.test_thread random'
       31: p    PUT        0
       34: .    STOP
    highest protocol among opcodes = 0
**********************************************************************
1 items had failures:
   1 of  25 in pickletools.__test__.disassembler_test
msg71898 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-24 23:04
The valgrind errors below are possibly related.

Conditional jump or move depends on uninitialised value(s)
   PyUnicodeUCS2_EncodeUTF8 (unicodeobject.c:2216)
   _PyUnicode_AsString (unicodeobject.c:1417)
   save (_pickle.c:930)
   Pickler_dump (_pickle.c:2292)

Conditional jump or move depends on uninitialised value(s)
   PyUnicodeUCS2_EncodeUTF8 (unicodeobject.c:2220)
   _PyUnicode_AsString (unicodeobject.c:1417)
   save (_pickle.c:930)
   Pickler_dump (_pickle.c:2292)

Conditional jump or move depends on uninitialised value(s)
   PyUnicodeUCS2_EncodeUTF8 (unicodeobject.c:2227)
   _PyUnicode_AsString (unicodeobject.c:1417)
   save (_pickle.c:930)
   Pickler_dump (_pickle.c:2292)

Conditional jump or move depends on uninitialised value(s)
   PyUnicodeUCS2_EncodeUTF8 (unicodeobject.c:2229)
   _PyUnicode_AsString (unicodeobject.c:1417)
   save (_pickle.c:930)
   Pickler_dump (_pickle.c:2292)

Conditional jump or move depends on uninitialised value(s)
   PyUnicodeUCS2_EncodeUTF8 (unicodeobject.c:2233)
   _PyUnicode_AsString (unicodeobject.c:1417)
   save (_pickle.c:930)
   Pickler_dump (_pickle.c:2292)
msg71904 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-24 23:50
Indeed.  The problem was an incorrect conversion of str -> unicode,
instead of converting to bytes.  On getting the buffer from unicode, it
tried to read data which was uninitialized.

Hmmm, this fix is for 3.0 only, but the problem is happening in 2.6. 
Leaving open.

Committed revision 66021.
msg71910 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-08-25 06:39
It seems that if the tests are run in this order:

./python -E -tt ./Lib/test/regrtest.py -u all test_xmlrpc test_ctypes test_json test_bsddb3 test_pickletools

The error will trigger consistently.  That is in 2.6 with a debug build
on a dual cpu box.  A debug build of 3.0 on the same machine did not
fail, though I don't know if 3.0 has this problem. I was unable to prune
the list further.  The 3 tests (xmlrpc, ctypes and json) can be run in
any order prior to bsddb3 and then pickletools.
msg72673 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-09-06 17:24
Neal, can you verify that this is still a problem now that bsddb has
been removed?  The tests, run in the order of the last comment, succeed
for me in both 2.6 and 3.0, debug build or not, on both my single
processor Ubuntu box and dual core Mac.
msg73061 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-09-11 22:02
I can reproduce this on 2.6.
msg73081 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-12 11:34
The explanation is actually simple, do not blame bsddb :-)

random.random is a built-in method, so its __module__ attribute is not
set. To find it, pickle.whichmodule loops through all of sys.modules and
picks the first module that contains the function.

I reproduced the problem with a simple file:
    # someModule.py
    from random import random

then, start python and:
>>> import someModule, random, pickle
>>> pickle.dumps(random.random, 0)
'csomeModule\nrandom\np0\n.'

You may have to change the name of "someModule", to be sure that it
appears before "random" in sys.modules. Tested on Windows and x86_64 Linux.

To correct this, one direction would be to search only built-in or
extension modules; for bound methods (random.random.__self__ is not
None), try to dump separately the instance and the method name (but
getattr(random._inst, 'random') is not random.random).
Or simply change the test...
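
The lookup described above can be sketched as follows. This is a simplified illustration and not the actual pickle.whichmodule source: the helper name first_module_exposing and the stand-in module are hypothetical, and the module table is passed in explicitly so the result does not depend on the real sys.modules ordering.

```python
import random
import types


def first_module_exposing(func, funcname, modules):
    """Return the name of the first module in `modules` whose attribute
    `funcname` is the very same object as `func` (hypothetical helper)."""
    for name, module in modules.items():
        if module is None:
            continue  # skip placeholder entries
        if getattr(module, funcname, None) is func:
            return name
    return None


# Hypothetical stand-in for a module that does `from random import random`.
some_module = types.ModuleType("someModule")
some_module.random = random.random

# Iteration order decides the answer: a re-exporting module wins if it comes first.
print(first_module_exposing(random.random, "random",
                            {"someModule": some_module, "random": random}))
print(first_module_exposing(random.random, "random",
                            {"random": random, "someModule": some_module}))
```

With the re-exporting module listed first, the sketch reports it instead of "random", which matches the 'csomeModule\nrandom' pickle shown above.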
msg73082 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-12 12:05
I'm not sure that pickling random.random is a good idea; did you try to
pickle the random.seed function?
Their definitions look very similar (at the end of random.py:
   _inst = Random()
   seed = _inst.seed
   random = _inst.random
) but Random.seed is a python method, whereas Random.random is inherited
from _randommodule.c.
msg73126 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2008-09-12 19:28
No thought went into picking random.random in the test -- it was just a
random ;-) choice.  Amaury's analysis of the source of non-determinism
is on target, and the easiest fix is to pick a coded-in-Python function
to pickle instead.  I suggest, e.g., changing the sporadically failing
doctest to:

>>> import pickletools
>>> dis(pickle.dumps(pickletools.dis, 0))
    0: c    GLOBAL     'pickletools dis'
   17: p    PUT        0
   20: .    STOP
highest protocol among opcodes = 0
msg73138 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-12 20:21
But it still means pickling a function/method defined in a builtin
extension module can give wrong results. Doesn't that deserve to be fixed?
msg73142 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2008-09-12 21:40
Amaury, yes, it would be nice if pickle were more reliable here.  But
that should be the topic of a different tracker issue.  Despite the
Title, this issue is really about sporadic pickletools.py failures, and
the pickletools tests are trying to test pickletools, not pickle.py (or
cPickle).  Changing the test as suggested makes it reliably test what
it's trying to test (namely that pickletools.dis() produces sensible
output for pickle's GLOBAL opcode).  Whether pickle/cPickle should do a
better job of building GLOBAL opcodes in the first place is a distinct
issue.

In any case, since pickle/cPickle have worked this way forever, and the
only known bad consequence to date is accidental sporadic test failures
in pickletools.py, the underlying pickle/cPickle issue shouldn't be a
release blocker regardless.
msg73165 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2008-09-13 04:14
BTW, note that the Title of this issue is misleading: 
pickle.whichmodule() uses object identity ("is":

    if ...  getattr(module, funcname, None) is func:

) to determine whether the given function object is supplied by a
module, so it's /not/ the case that a "wrong" function can be pickled. 
The worst that can happen is that the correct function is pickled but
obtained from a possibly surprising module.  For example, random.random
can't be confused with any other function named "random".

I expect this is why nobody has ever complained about it:  unless you're
looking at the strings embedded in the pickle GLOBAL opcode, it's
unlikely to have a visible consequence.

It would still be nice if pickle could identify "the most natural"
module for a given function, but hard to make a case that doing so would
be much more than /just/ "nice".
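
A small sketch of the identity argument, assuming a hypothetical module named alias_module_sketch that only re-exports random.random. The pickle bytes are written by hand to mimic a GLOBAL opcode that names the "surprising" module.

```python
import pickle
import random
import sys
import types

# Hypothetical module that merely re-exports random.random.
alias = types.ModuleType("alias_module_sketch")
alias.random = random.random
sys.modules[alias.__name__] = alias

# Hand-written protocol-0 pickle: GLOBAL 'alias_module_sketch random', then STOP.
payload = b"calias_module_sketch\nrandom\n."
restored = pickle.loads(payload)

# The object reached through the alias is the very same function object.
same_object = restored is random.random
print(same_object)

del sys.modules[alias.__name__]  # clean up the fake entry
```

Because the module is only a route to the attribute, loading through the alias gives back the original function as long as the alias module can be imported.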
msg73174 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-13 10:59
> I expect this is why nobody has ever complained about it:  unless you're
> looking at the strings embedded in the pickle GLOBAL opcode, it's
> unlikely to have a visible consequence.

Well, it may have a consequence if pickle picks the "random" function
from a third-party module named "foobar", and you give the pickle to a
friend and expect it to work for him while he hasn't installed module
"foobar".
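
A minimal sketch of that failure mode. The module name foobar_not_installed_sketch is hypothetical and is assumed to be absent on the machine that loads the pickle.

```python
import pickle

# Protocol-0 pickle whose GLOBAL opcode names a module that is assumed
# not to be installed on the receiving side (hypothetical name).
payload = b"cfoobar_not_installed_sketch\nrandom\n."

try:
    pickle.loads(payload)
    outcome = "loaded"
except ImportError:
    # Unpickling has to import the named module before it can look up "random".
    outcome = "ImportError"
print(outcome)
```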
msg80479 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-01-24 21:13
I just committed Tim's suggested change in r68906.  This seemed a
no-brainer, regardless of what should be done about pickle.whichmodule.  One
fewer sporadic buildbot failure sounds like a good thing to me.

(I hadn't noticed the pickletools failures until just after I committed a 
pickle module change, so I was rather thrown until Antoine directed me 
here...)
msg105539 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-05-11 20:45
'works for me' contradicts 'open', so I unset that
msg110461 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-16 16:08
Revision 68903 was merged in py3k in r68908.  It looks like a similar issue shows up in test_random:

======================================================================
ERROR: test_pickling (test.test_random.MersenneTwister_TestBasicOps)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/slave/py-build/3.1.norwitz-amd64/build/Lib/test/test_random.py", line 107, in test_pickling
    state = pickle.dumps(self.gen)
  File "/home/buildbot/slave/py-build/3.1.norwitz-amd64/build/Lib/pickle.py", line 1358, in dumps
    Pickler(f, protocol, fix_imports=fix_imports).dump(obj)
_pickle.PicklingError: Can't pickle <class 'random.Random'>: it's not the same object as random.Random

See http://www.python.org/dev/buildbot/all/builders/amd64%20gentoo%203.1/builds/819/steps/test/logs/stdio
msg110462 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-16 16:24
> Revision 68903 was merged in py3k in r68908.  It looks like a similar issue shows up in test_random:
> 
> ======================================================================
> ERROR: test_pickling (test.test_random.MersenneTwister_TestBasicOps)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/buildbot/slave/py-build/3.1.norwitz-amd64/build/Lib/test/test_random.py", line 107, in test_pickling
>     state = pickle.dumps(self.gen)
>   File "/home/buildbot/slave/py-build/3.1.norwitz-amd64/build/Lib/pickle.py", line 1358, in dumps
>     Pickler(f, protocol, fix_imports=fix_imports).dump(obj)
> _pickle.PicklingError: Can't pickle <class 'random.Random'>: it's not the same object as random.Random

Actually, this might have to do with the fix I committed to
test_threaded_import in r82885.
In order for test_threaded_import to work, we have to unload the "Guinea
pig" module before importing it from several threads at once. For
whatever reason, test_threaded_import uses random as its Guinea pig
module, which means random gets unloaded and reimported again.

But at this point, I must admit I don't even understand the failure
message.
msg110470 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-16 18:39
It should be noted that, contrary to Amaury's suggestion, pickling random.seed fails under 3.x:

>>> pickle.dumps(random.seed)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/__svn__/Lib/pickle.py", line 1314, in dumps
    Pickler(f, protocol, fix_imports=fix_imports).dump(obj)
_pickle.PicklingError: Can't pickle <class 'method'>: attribute lookup builtins.method failed

Furthermore, the original problem can also be reproduced under 3.x, using Amaury's trick:

>>> pickle.dumps(random.random)
b'\x80\x03crandom\nrandom\nq\x00.'
>>> list(sys.modules.values())[0].random = random.random
>>> pickle.dumps(random.random)
b'\x80\x03cheapq\nrandom\nq\x00.'

I think a possible heuristic in whichmodule() would be, if __module__ is not found or None, to look for a __module__ attribute on __self__:

>>> random.random.__module__
>>> random.random.__self__.__module__
'random'
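
The heuristic could look roughly like this. It is only a sketch of the idea, not a patch to whichmodule(); the helper name guess_module_name is hypothetical.

```python
import pickletools
import random


def guess_module_name(obj):
    """Hypothetical helper: prefer obj.__module__, else fall back to the
    __module__ of the object that obj is bound to (obj.__self__)."""
    module_name = getattr(obj, "__module__", None)
    if module_name is None:
        owner = getattr(obj, "__self__", None)
        if owner is not None:
            module_name = getattr(owner, "__module__", None)
    return module_name


print(guess_module_name(random.random))    # bound built-in method
print(guess_module_name(pickletools.dis))  # plain Python function
```

If neither attribute is available the helper returns None, at which point a caller would still need some other fallback such as the sys.modules scan.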
msg110472 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-16 19:17
Antoine's fix in r82919 / r82920 fixes the test_random failure for me.  (Before the fix, 

./python.exe ./Lib/test/regrtest.py test___all__ test_threaded_import test_random

was enough to produce the failure.)
msg123918 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-14 02:16
The randomly failing tests seem to have been the high priority issue.  The remaining, eponymous issue seems to be of rather lower priority, so I'm setting it to normal.  Although Tim wanted a separate issue for the pickling problem, I think there's too much useful info about the underlying problem in this issue for it to make sense to open a new one.
msg204957 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2013-12-01 20:29
This was fixed in 3.4 with the introduction of method pickling. I don't think it would be appropriate to backport this to 2.7. Thus, I am closing this as a won't fix for 2.x.
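
A quick way to observe the 3.4+ behavior, assuming that a built-in bound method is pickled there as a getattr call on its instance rather than through a module search:

```python
import pickle
import random

# On Python 3.4 and later this is expected to pickle the Random instance
# behind random.random together with the method name.
data = pickle.dumps(random.random)
restored = pickle.loads(data)

# The result is a bound method of a Random instance rebuilt from the pickled state.
print(type(restored.__self__).__name__)
print(0.0 <= restored() < 1.0)
```

Note that under this assumption the restored callable belongs to a copy of the generator, so it is not the same object as random.random.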
History
Date User Action Args
2022-04-11 14:56:38  admin  set  github: 47907
2013-12-01 20:29:42  alexandre.vassalotti  set  status: open -> closed
    versions: - Python 3.1, Python 3.2
    messages: + msg204957
    assignee: alexandre.vassalotti
    resolution: wont fix
    stage: needs patch -> resolved
2010-12-14 02:16:12  r.david.murray  set  priority: high -> normal
    type: behavior
    assignee: nnorwitz -> (no value)
    versions: + Python 3.1, Python 3.2, - Python 2.6
    nosy: + r.david.murray
    messages: + msg123918
    stage: needs patch
2010-07-16 20:02:14  terry.reedy  set  nosy: - terry.reedy
2010-07-16 19:17:46  mark.dickinson  set  messages: + msg110472
2010-07-16 18:39:24  pitrou  set  messages: + msg110470
2010-07-16 16:24:03  pitrou  set  messages: + msg110462
2010-07-16 16:08:52  belopolsky  set  nosy: + belopolsky
    messages: + msg110461
2010-05-11 20:45:12  terry.reedy  set  versions: + Python 2.7
    nosy: + terry.reedy
    messages: + msg105539
    resolution: works for me -> (no value)
2009-01-24 21:13:20  mark.dickinson  set  messages: + msg80479
2009-01-24 20:00:19  mark.dickinson  set  nosy: + mark.dickinson
2008-09-13 10:59:09  pitrou  set  messages: + msg73174
2008-09-13 04:14:46  tim.peters  set  messages: + msg73165
2008-09-12 23:03:03  barry  set  priority: release blocker -> high
2008-09-12 21:40:25  tim.peters  set  messages: + msg73142
2008-09-12 20:21:02  pitrou  set  nosy: + pitrou
    messages: + msg73138
2008-09-12 20:19:16  alexandre.vassalotti  set  nosy: + alexandre.vassalotti
2008-09-12 19:28:54  tim.peters  set  nosy: + tim.peters
    messages: + msg73126
2008-09-12 12:05:24  amaury.forgeotdarc  set  messages: + msg73082
2008-09-12 11:34:58  amaury.forgeotdarc  set  nosy: + amaury.forgeotdarc
    messages: + msg73081
2008-09-11 22:02:42  benjamin.peterson  set  nosy: + benjamin.peterson
    messages: + msg73061
2008-09-09 13:14:36  barry  set  priority: deferred blocker -> release blocker
2008-09-06 17:24:23  barry  set  priority: release blocker -> deferred blocker
    resolution: works for me
    messages: + msg72673
    nosy: + barry
2008-08-25 06:40:00  nnorwitz  set  messages: + msg71910
2008-08-24 23:50:25  nnorwitz  set  assignee: nnorwitz
    messages: + msg71904
2008-08-24 23:04:16  nnorwitz  set  messages: + msg71898
2008-08-24 07:01:47  nnorwitz  create