Pickle breakage with reduction of recursive structures #41150
Comments
Fix problems related to reduce cycles during pickling. A

This shouldn't change any semantics. That reduce shouldn't
Possible improvement: If we want to support reducing of real
See also: http://mail.python.
It would be very good if someone familiar with pickle
IMO, this is more of a feature request than a bug. Also, it

Triggering assertions (pickle) and producing incorrect output
Patch has test, can someone try it? I think it might be fixed already.

The issue is still present in py3k. Attaching an updated patch with tests only. Is this the same as bpo-998998?
As I explained in msg110617 under bpo-9269, it is possible that we can do better than simply detect reduce cycles and bail out. I am torn between two options:
My only concern about this patch is that it is likely to affect performance, because pickling will have to maintain the "reducing_now" dictionary in parallel with the memo dictionary.
To recap, the issue is that pickle doesn't handle recursion via reduce arguments (i.e. arguments to the constructor function, as returned in the 2nd element of the tuple from __reduce__). This leads to two kinds of effects:

class C:
    def __init__(self, x=None):
        self.x = x if x is not None else self
    def __reduce__(self):
        return C, (self.x,)

A. Recursion error:

>>> pickle.dumps(C())
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    pickle.dumps(C())
RuntimeError: maximum recursion depth exceeded while calling a Python object

This cannot be helped with the current reduce protocol. The error message may be improved, but that's about it.

B. Duplication of the object when unpickling:
>>> c = C([])
>>> c.x.append(c)
>>> c.x[0] is c
True
>>> c2 = pickle.loads(pickle.dumps(c))
>>> c2.x[0] is c2
False
This happens because the list (or another recursion-friendly type) inside the problematic object handles the recursion, but we still get an outer object that is a duplicate of the inner one rather than the same object.
This can be solved the same way as for tuple:
>>> t=([],1,2)
>>> t[0].append(t)
>>> t2 = pickle.loads(pickle.dumps(t))
>>> t2[0][0] is t2
True
>>> pickletools.dis(pickle.dumps(t))
0: \x80 PROTO 3
2: ] EMPTY_LIST
3: q BINPUT 0
5: h BINGET 0
7: K BININT1 1
9: K BININT1 2
11: \x87 TUPLE3
12: q BINPUT 1
14: a APPEND
15: K BININT1 1
17: K BININT1 2
19: 0 POP
20: 0 POP
21: 0 POP
22: h BINGET 1
24: .    STOP

After pickling its elements, the tuple checks whether it ended up in the memo. If it did, this means it was pickled by one of its own elements, so it POPs all the elements from the stack and fetches itself via GET. This is somewhat inefficient, but probably the best it can do. I suggest we do 3 things:
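As an aside, the POP-and-GET behaviour just described can be observed directly in the opcode stream. The following check is my own (standard library only, using protocol 3 to match the disassembly above), not part of any patch here:

```python
import pickle
import pickletools

# The self-referential tuple from the example above.
t = ([], 1, 2)
t[0].append(t)

data = pickle.dumps(t, protocol=3)

# Collect opcode names: the tuple's elements are pickled a second time
# on the way out, then discarded with POP before the memoized tuple is
# fetched with BINGET.
ops = [op.name for op, arg, pos in pickletools.genops(data)]
print(ops.count('POP'))   # one POP per discarded element
print('TUPLE3' in ops)    # the tuple itself was built once, inside the list

# Identity survives the round-trip despite the duplication in the stream.
t2 = pickle.loads(data)
print(t2[0][0] is t2)
```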
Problem "B" has been resolved, but problem "A" is still there.

Python 3.4.1 (default, Jun 29 2014, 15:26:46)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
... def __init__(self, x=None):
... self.x = x if x is not None else self
... def __reduce__(self):
... return C, (self.x,)
...
>>> import pickle
>>> pickle.dumps(C())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded while calling a Python object
>>> c = C([])
>>> c.x.append(c)
>>> c.x[0] is c
True
>>> c2 = pickle.loads(pickle.dumps(c))
>>> c2.x[0] is c2
True
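For what it's worth, a class author can sidestep problem "A" with the documented reduce protocol: move the self-reference out of the constructor arguments and into the state (the optional third element of the __reduce__ tuple), which pickle applies only after the new object has been memoized. A minimal sketch, my example rather than anything from the patch:

```python
import pickle

class C:
    # Same class as above, but __reduce__ defers the self-reference to
    # the state, which is applied via BUILD after the object is memoized.
    def __init__(self, x=None):
        self.x = x if x is not None else self

    def __reduce__(self):
        return C, (), self.__dict__

c2 = pickle.loads(pickle.dumps(C()))
print(c2.x is c2)  # the cycle is reconstructed correctly
```

Because the state dictionary is pickled after the object itself has entered the memo, the inner reference is emitted as a GET instead of recursing.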
Re msg110868: it's impossible to resolve a __reduce__ loop that involves only immutable intermediate objects (including none at all):

class Direct:
    def __reduce__(self): return id,(self,)  # obviously impossible

class Indirect:
    # Can't create either the new object or the tuple first!
    def __reduce__(self): return id,((self,),)

With pre-existing mutable objects, the same trick as for tuples certainly could be applied:

class Mutable:
    def __init__(self): self.bracketed=[self]
    # Create list, REDUCE, then populate list
    def __reduce__(self): return id,(self.bracketed,)

The way an analog to the tuple implementation would deal with this would be something like

id  # the dummy "constructor" in this example

If the list were created directly in __reduce__, you'd still recurse infinitely trying to create each new list object. On the other hand, even complicated cases like this should work:

class Switch:
    def __reduce__(self): return id,(self.lst,)

a=Switch(); b=Switch()
a.lst=[b,a]; b.lst=[a,b]

where the pickling of, say, a would look something like

id

Perhaps a technique for distinguishing these cases is to look for new objects having been created since the last time we visited an object. If there are none, we're in the immutable case and we lose. If there are yet more created between the second and third visits, on the other hand, we're generating more objects from __reduce__ every time and should again abort.
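The heuristic in the last paragraph can be sketched against the pure-Python pickler (`pickle._Pickler`). The subclass name, the `_visit` map, and the memo-growth check below are illustrative inventions of mine, not code from this issue's patch:

```python
import io
import pickle

class VisitCountingPickler(pickle._Pickler):
    """Toy sketch: abort when an object is revisited without any new
    object having been memoized since its previous visit (which is the
    hopeless immutable-cycle case)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._visit = {}  # id(obj) -> len(memo) at the previous visit

    def save(self, obj, save_persistent_id=True):
        oid = id(obj)
        if oid not in self.memo:
            last = self._visit.get(oid)
            if last is not None and last == len(self.memo):
                raise pickle.PicklingError(
                    "unresolvable reduce cycle while pickling %r" % (obj,))
            self._visit[oid] = len(self.memo)
        super().save(obj, save_persistent_id)

# The hopeless case from earlier in the thread becomes a clean error:
class C:
    def __init__(self, x=None):
        self.x = x if x is not None else self
    def __reduce__(self):
        return C, (self.x,)

try:
    VisitCountingPickler(io.BytesIO()).dump(C())
    outcome = "pickled"
except pickle.PicklingError:
    outcome = "cycle detected"
print(outcome)

# The recoverable case (recursion through a mutable list) still works,
# because the list is memoized between the first and second visits:
c = C([])
c.x.append(c)
buf = io.BytesIO()
VisitCountingPickler(buf).dump(c)
c2 = pickle.loads(buf.getvalue())
print(c2.x[0] is c2)
```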
Davis, the possible cases (i.e. recursion via appropriate mutable objects) are already supported, and the impossible cases are, well, impossible. The only open issue is whether to implement better error handling for infinite-recursion problems, instead of relying on the built-in maximum recursion depth.
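As a concrete check of the supported case, here is the earlier Mutable example rewritten with a real constructor in place of the dummy `id` (my adaptation; it round-trips on interpreters carrying this issue's fix, i.e. Python 3.4+):

```python
import pickle

class Mutable:
    def __init__(self, bracketed=None):
        self.bracketed = bracketed if bracketed is not None else [self]

    def __reduce__(self):
        # The cycle passes through a mutable list, which pickle can
        # create empty, run REDUCE against, and populate afterwards.
        return Mutable, (self.bracketed,)

m2 = pickle.loads(pickle.dumps(Mutable()))
print(m2.bracketed[0] is m2)  # the cycle survives the round-trip
```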