classification
Title: python doc does not say that the state kwarg in Pickler.save_reduce can be a tuple (and not only a dict)
Type: Stage: resolved
Components: Documentation Versions: Python 3.8
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: alexandre.vassalotti, docs@python, pierreglaser, pitrou, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2019-02-07 17:55 by pierreglaser, last changed 2019-05-18 12:39 by pierreglaser. This issue is now closed.

Files
File name Uploaded Description Edit
test_slots.py pierreglaser, 2019-02-08 09:37
Pull Requests
URL Status Linked Edit
PR 11955 closed pierreglaser, 2019-02-20 15:24
Messages (9)
msg335031 - (view) Author: Pierre Glaser (pierreglaser) * Date: 2019-02-07 17:55
Hello all,
This 16-year-old commit (*) allows an object's state to be updated
using its slots instead of its __dict__ at unpickling time. To use this
functionality, the state keyword argument of Pickler.save_reduce (which maps to
the third item of the tuple returned by __reduce__) should be a length-2 tuple.
As far as I can tell, this is not mentioned in the documentation (**). I suggest updating the docs. What do you think?

(*) https://github.com/python/cpython/commit/ac5b5d2e8b849c499d323b0263ace22e56b4f0d9
(**) https://docs.python.org/3.8/library/pickle.html#object.__reduce__
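
For illustration, a minimal sketch of what such a length-2 state looks like when returned from __reduce__. The class `Point` and its attribute names are hypothetical and chosen only for this example; the first tuple item stands for the __dict__ state and the second for the slots state.

```python
import pickle


class Point:
    """Hypothetical slotted class used only for this sketch."""

    __slots__ = ("x", "y")

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Third item is the state. Here it is a length-2 tuple:
        # (state for __dict__, state for slots). The instance has no
        # __dict__, so the first item is None.
        return (Point, (), (None, {"x": self.x, "y": self.y}))


# Point() is called with no arguments at load time, then the default
# state-updating procedure sets each slot with setattr().
restored = pickle.loads(pickle.dumps(Point(1, 2)))
```

Since `Point` defines no __setstate__, the slot values are restored by the default procedure in load_build.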
msg335033 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-07 18:18
Does it still work?  With both the C and Python pickler?
Can you post an example?
msg335041 - (view) Author: Pierre Glaser (pierreglaser) * Date: 2019-02-07 20:59
It turns out that both pickle and _pickle implement this feature, but the behavior is inconsistent.

- As a reminder, instances of slotted classes do not have a __dict__ attribute (1).
- On the other hand, when pickling slotted class instances, __getstate__ can return a tuple of 2 dicts. The first dict represents the __dict__ attribute. Because of (1), this first item should simply be a sentinel value. In pickle, the condition is that it evaluates to False, but in _pickle, it must be exactly None.
- Finally, the second dict in state contains the slotted attributes.

Here are the lines in the two files causing the inconsistent behavior:
https://github.com/python/cpython/blob/master/Modules/_pickle.c#L6236
https://github.com/python/cpython/blob/master/Lib/pickle.py#L1549

I included an example that illustrates it.
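
A minimal sketch of the pattern described in this message (this is not the attached test_slots.py; the class `Slotted` is hypothetical). It uses None as the first item, which is the sentinel both implementations are described as accepting, and loads the same payload with each unpickler. Note that `pickle._loads` is the private pure-Python loader, an implementation detail rather than a public API.

```python
import pickle


class Slotted:
    """Hypothetical slotted class used only for this sketch."""

    __slots__ = ("x",)

    def __getstate__(self):
        # (state for __dict__, state for slots). None is used as the
        # sentinel for the missing __dict__.
        return (None, {"x": self.x})


obj = Slotted()
obj.x = 5
payload = pickle.dumps(obj, protocol=2)

# pickle.loads uses the C implementation (_pickle) when it is available.
from_c = pickle.loads(payload)

# pickle._loads is the private pure-Python implementation.
from_py = pickle._loads(payload)
```

Replacing None with another falsy value such as an empty dict is where the two implementations are reported above to differ, so None looks like the safer choice for the sentinel.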
msg335043 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-07 22:21
You can have both a dict and slots by subclassing:

>>> class A:
...     __slots__ = ('x',)
...
>>> class B(A): pass
...
>>> b = B()
>>> b.x = 5
>>> b.y = 6
>>> b.__dict__
{'y': 6}
>>> A.x
<member 'x' of 'A' objects>
>>> A.x.__get__(b)
5
msg335044 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-02-07 22:23
Interestingly, you can also put an instance dict in slots:

>>> class A:
...     __slots__ = ['x', '__dict__']
...
>>> a = A()
>>> a.x = 5
>>> a.y = 6
>>> a.__dict__
{'y': 6}
>>> a.x
5
msg335066 - (view) Author: Pierre Glaser (pierreglaser) * Date: 2019-02-08 09:34
Thanks Antoine and Raymond for the feedback.

Indeed, a subclass of a slotted class can have a dict. I extended the script to pickle and unpickle instances of such subclasses using a __getstate__ method that returns a length-2 tuple, and made sure their attributes were properly restored.

Apart from the different checks on state carried out in the C load_build and the Python load_build, AFAICT, it seems like this feature works :)
msg336102 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-20 15:31
A slotted class will also have a dict when it inherits from a non-slotted class. This is why the base class of a slotted class should define __slots__ too if you do not want an instance dict.

__getstate__ and __setstate__ for slotted classes are described in PEP 307. Unfortunately this was not copied to the module documentation.
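
A short sketch of the inheritance point above. All class names here are hypothetical and exist only for this example.

```python
class PlainBase:
    """Non-slotted base: its instances get a __dict__."""


class SlottedChild(PlainBase):
    # Slots are added, but the __dict__ is inherited from PlainBase.
    __slots__ = ("x",)


class SlottedBase:
    # An empty __slots__ keeps the base free of an instance dict.
    __slots__ = ()


class FullySlotted(SlottedBase):
    __slots__ = ("x",)


child = SlottedChild()
child.x = 1  # stored in the slot
child.y = 2  # stored in the inherited __dict__

child_has_dict = hasattr(child, "__dict__")
full_has_dict = hasattr(FullySlotted(), "__dict__")
```

Here `child_has_dict` is true and `full_has_dict` is false, because only the second hierarchy defines __slots__ at every level.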
msg336105 - (view) Author: Pierre Glaser (pierreglaser) * Date: 2019-02-20 15:41
I added a PR with a small patch to document this behavior and reconcile _pickle.c and pickle.py.

Some explanations on why I am pushing this forward:
 
Pickling instances of classes/subclasses with slots is done natively for pickle protocol >= 2. Mentioning this behavior in the docs should *not* make the user worry about implementing custom __getstate__ methods just to preserve slots.

Here is the reason why I think this functionality (allowing state and slotstate) is worth documenting: pickle gives us a lot of flexibility for reducing thanks to the dispatch_table. But unpickling is rather rigid: if no __setstate__ exists, we have to deal with the default state updating procedure present in load_build. Might as well document all of it ;)
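
As a rough illustration of the native behavior mentioned above, the sketch below reuses the classes A and B from Antoine's example and inspects the state that the default __reduce_ex__ produces for protocol 2. No custom __getstate__ is defined; the variable names are only for this example.

```python
import pickle


class A:
    __slots__ = ("x",)


class B(A):
    # No __slots__ here, so instances of B also have a __dict__.
    pass


a = A()
a.x = 5

b = B()
b.x = 5
b.y = 6

# The third item of the reduce value is the state passed to load_build.
state_a = a.__reduce_ex__(2)[2]
state_b = b.__reduce_ex__(2)[2]

# Round trip without any custom __getstate__ or __setstate__.
a2 = pickle.loads(pickle.dumps(a, protocol=2))
b2 = pickle.loads(pickle.dumps(b, protocol=2))
```

In both cases the default state is already a (state, slotstate) pair, so the slot values survive the round trip without any user code.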
msg336106 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-20 15:46
See also issue26579, which proposes adding a function or method that provides the standard implementation of __getstate__, to be used consistently in custom __getstate__ implementations. I am not sure about the API yet.
History
Date User Action Args
2019-05-18 12:39:24 pierreglaser set status: open -> closed
stage: patch review -> resolved
2019-02-20 15:46:30 serhiy.storchaka set messages: + msg336106
2019-02-20 15:41:42 pierreglaser set messages: + msg336105
2019-02-20 15:31:31 serhiy.storchaka set messages: + msg336102
2019-02-20 15:24:05 pierreglaser set keywords: + patch
stage: patch review
pull_requests: + pull_request11981
2019-02-08 09:37:11 pierreglaser set files: + test_slots.py
2019-02-08 09:34:50 pierreglaser set files: - test_slots.py
2019-02-08 09:34:32 pierreglaser set messages: + msg335066
2019-02-07 22:23:56 rhettinger set nosy: + rhettinger
messages: + msg335044
2019-02-07 22:21:34 pitrou set messages: + msg335043
2019-02-07 21:01:59 pierreglaser set files: + test_slots.py
2019-02-07 21:01:38 pierreglaser set files: - test_slots.py
2019-02-07 21:00:57 pierreglaser set files: + test_slots.py
2019-02-07 21:00:28 pierreglaser set files: - test_slots.py
2019-02-07 20:59:04 pierreglaser set files: + test_slots.py
messages: + msg335041
2019-02-07 18:18:55 pitrou set messages: + msg335033
2019-02-07 17:55:22 pierreglaser create