As discussed in the Discourse thread https://discuss.python.org/t/sixth-element-of-tuple-from-reduce-inconsistency-between-pickle-and-copy/12902, where Guido suggested opening this issue.
Both the pickle and copy modules of the standard library use a class's __reduce__() method to customize their pickle/copy process. They seem to have a consistent view of the first 5 elements of the returned tuple,
(func, args, state, listiter, dictiter), but the 6th element differs. For pickle it is state_setter, a callable with signature state_setter(obj, state) -> None, whereas for copy it is deepcopy, with signature deepcopy(arg: T, memo) -> T.
This seems to be unintentional, since the pickle documentation states:
> As we shall see, pickle does not use directly the methods described
> above. In fact, these methods are part of the copy protocol which
> implements the __reduce__() special method. The copy protocol provides
> a unified interface for retrieving the data necessary for pickling
> and copying objects
It seems that a class whose __reduce__() returns all 6 elements would have to do something very awkward, such as examining its call stack to determine whether it is being called in a pickle or a copy context, in order to return the appropriate callable. (Naively providing the same callable in both contexts would cause errors in one or the other.)
I attach a test file which defines two classes whose __reduce__() returns a 6-element tuple. One class, Pickleable, can be duplicated via pickling but not deepcopied. The converse is true for the Copyable class.
Other than the 6th element of the tuple returned from __reduce__(), the classes are identical.
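The attached file is not reproduced here. As a rough illustration of the Pickleable half only, here is a minimal sketch, assuming Python 3.8 or later (where pickle accepts a 6th element). The names set_state and try_call are made up for this example, and the exact exception raised by copy.deepcopy may differ between Python versions.

```python
import copy
import pickle


def set_state(obj, state):
    """pickle-style 6th element: state_setter(obj, state) -> None."""
    obj.__dict__.update(state)


class Pickleable:
    def __init__(self, value=None):
        self.value = value

    def __reduce__(self):
        # (func, args, state, listiter, dictiter, 6th element)
        # The 6th element follows pickle's convention, not copy's.
        return (Pickleable, (), {"value": self.value}, None, None, set_state)


def try_call(fn, obj):
    """Hypothetical helper: return fn(obj), or the exception it raised."""
    try:
        return fn(obj)
    except Exception as exc:
        return exc


original = Pickleable(42)

# pickle calls set_state(new_obj, state) when loading, so this round-trips.
restored = pickle.loads(pickle.dumps(original))
print("pickle round trip:", restored.value)

# copy does not interpret the 6th element as a state setter, so this fails.
outcome = try_call(copy.deepcopy, original)
print("deepcopy outcome:", type(outcome).__name__)
```

Swapping set_state for a callable with the deepcopy(arg, memo) signature gives the Copyable variant described above, at the cost of pickle then calling that callable as if it were a state setter.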
Guido dug into the history and found that:
> it looks like these are independent developments:
> the 6th arg for deepcopy was added 6 years ago via Issue 26167: Improve copy.copy speed for built-in types (list/set/dict) - Python tracker
> the 6th arg for pickle was added 3 years ago via Issue 35900: Add pickler hook for the user to customize the serialization of user defined functions and types. - Python tracker
> I’m guessing the folks doing the latter weren’t aware that deepcopy already uses the 6th arg. Sorting this out will be painful.