classification
Title: pickle and _pickle accelerator have different behavior when unpickling an object with falsy __getstate__ return
Type:
Stage:
Components: Documentation
Versions: Python 3.6, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: ZackerySpytz, docs@python, josh.r
Priority: normal
Keywords:

Created on 2016-04-05 15:01 by josh.r, last changed 2019-07-22 23:27 by ZackerySpytz.

Messages (2)
msg262908 - Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-04-05 15:01
According to a note on the pickle docs ( https://docs.python.org/3/library/pickle.html#object.__getstate__ ): "If __getstate__() returns a false value, the __setstate__() method will not be called upon unpickling."

The phrasing is a little odd (since according to the __setstate__ docs, there is a behavior for classes without __setstate__ where it just assigns the contents of the pickled state dict to the __dict__ of the object), but to me, this means that any falsy value should prevent any __setstate__-like behavior.

But this is not how it works. Both the C accelerator and Python code treat None specially (they don't pickle state at all if it's None), which prevents __setstate__ or the __setstate__-like fallback from being executed.
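A quick way to see the None special case is to look at the opcodes in the pickle itself. The sketch below is only illustrative: `Stateful` and `opcode_names` are made-up names, and it assumes pickle protocol 2 or higher (the default on Python 3), where the state returned by `__getstate__` reaches the pickler unchanged. A `BUILD` opcode is what triggers `__setstate__` (or the fallback) on load.

```python
import pickle
import pickletools


class Stateful:
    """Hypothetical class whose __getstate__ returns whatever it was given."""

    def __init__(self, state):
        self._state = state

    def __getstate__(self):
        return self._state


def opcode_names(data):
    # pickletools.genops yields (opcode, argument, position) triples.
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


none_ops = opcode_names(pickle.dumps(Stateful(None)))
zero_ops = opcode_names(pickle.dumps(Stateful(0)))

print("BUILD" in none_ops)  # False: a None state is not pickled at all
print("BUILD" in zero_ops)  # True: a falsy non-None state is still pickled
```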

But if it's any other falsy value, the behaviors differ, and diverge from the docs. Specifically, on load of a pickle with a non-None falsy state (say, False itself, or 0, or () or []):

Without __setstate__:
Pure Python pickle: does not execute the fallback code and loads without error (it just stored state it will never use), matching the spirit of the docs
C accelerated _pickle: fails on anything but the empty dict with "UnpicklingError: state is not a dictionary", violating the spirit of the docs
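A rough reproduction of the no-`__setstate__` case, for anyone who wants to check their own interpreter. `NoSetstate` and `try_roundtrip` are hypothetical names; `pickle._dumps` and `pickle._loads` are the private pure-Python implementations in CPython's `pickle.py`, so this relies on implementation details and may behave differently on other interpreters or versions.

```python
import pickle


class NoSetstate:
    """Hypothetical class: custom __getstate__, no __setstate__."""

    def __init__(self, state):
        self._state = state

    def __getstate__(self):
        return self._state


def try_roundtrip(dumps, loads, state):
    """Pickle and unpickle an instance, reporting success or the load error."""
    data = dumps(NoSetstate(state))
    try:
        loads(data)
    except pickle.UnpicklingError as exc:
        return "UnpicklingError: {}".format(exc)
    return "ok"


for state in (False, 0, (), [], {}):
    pure = try_roundtrip(pickle._dumps, pickle._loads, state)
    accelerated = try_roundtrip(pickle.dumps, pickle.loads, state)
    print("{!r:>6}  pure Python: {}  accelerated: {}".format(state, pure, accelerated))
```

On a CPython build with the `_pickle` accelerator, I would expect the pure-Python column to read "ok" throughout and the accelerated column to report the error for everything except `{}`.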

With __setstate__:
Both versions call __setstate__ even though the documentation explicitly says they will not.
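The `__setstate__` case can be checked the same way. This is a minimal sketch with made-up names (`WithSetstate`, `IMPLEMENTATIONS`, `received_after_roundtrip`); it again uses the private `pickle._dumps`/`pickle._loads` for the pure-Python path and assumes protocol 2 or higher.

```python
import pickle


class WithSetstate:
    """Hypothetical class that records what __setstate__ received."""

    received = "never called"  # class-level default, seen if __setstate__ is skipped

    def __init__(self, state):
        self._state = state

    def __getstate__(self):
        return self._state

    def __setstate__(self, state):
        self.received = ("called with", state)


# (dumps, loads) pairs; the underscore versions are the pure-Python code.
IMPLEMENTATIONS = {
    "accelerated": (pickle.dumps, pickle.loads),
    "pure Python": (pickle._dumps, pickle._loads),
}


def received_after_roundtrip(impl, state):
    dumps, loads = IMPLEMENTATIONS[impl]
    return loads(dumps(WithSetstate(state))).received


for impl in IMPLEMENTATIONS:
    for state in (None, False, 0, (), [], {}):
        print(impl, repr(state), "->", received_after_roundtrip(impl, state))
```

Only the None state should leave `received` at "never called"; every other falsy value is handed to `__setstate__` by both implementations.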

Seems like if nothing else, the docs should agree with the code, and the C and Python modules should agree on behavior.

I would not be at all surprised if outside code depends on being able to pickle falsy state and have its __setstate__ receive the falsy state (if nothing else, when the state is a container or number, being empty or 0 would be reasonable; failing to call __setstate__ in that case would be surprising). So it's probably not a good idea to make the implementation match the docs.

My proposal would be that at pickle time, if the class lacks __setstate__, treat any falsy return value as None. This means:

1. pickles are smaller (no storing junk that the default __setstate__-like behavior can't use)
2. pickles are valid (no UnpicklingError from the default __setstate__-like behavior)
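To make the proposal concrete, here is a sketch of the same normalization done per class rather than inside pickle. This is not how pickle works today and not a patch; `FalsyStateAsNone` and `EmptyTupleState` are hypothetical names, and the mixin simply rewrites the state slot of the tuple returned by `object.__reduce_ex__`.

```python
import pickle
import pickletools


class FalsyStateAsNone:
    """Hypothetical mixin: replace falsy state with None if there is no __setstate__."""

    def __reduce_ex__(self, protocol):
        reduced = list(super().__reduce_ex__(protocol))
        lacks_setstate = not hasattr(type(self), "__setstate__")
        # Index 2 of the reduce tuple, when present, is the object's state.
        if len(reduced) > 2 and lacks_setstate and not reduced[2]:
            reduced[2] = None  # a None state is not written to the pickle
        return tuple(reduced)


class EmptyTupleState(FalsyStateAsNone):
    def __getstate__(self):
        return ()  # falsy, non-None, and not a dict


data = pickle.dumps(EmptyTupleState())
restored = pickle.loads(data)  # loads with the accelerator as well
has_build = any(op.name == "BUILD" for op, _arg, _pos in pickletools.genops(data))
print(type(restored).__name__, has_build)
```

With the state normalized at pickle time, no `BUILD` opcode is emitted, so neither unpickler ever reaches the default `__setstate__`-like code.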

The docs would also have to change, to indicate that, if defined, __setstate__ will be called even if __getstate__ returned a falsy (but not None) value.

Downside is the description of what happens is a little complex, since the behavior for non-None falsy values differs depending on the presence of a real __setstate__. Upside is that any code depending on the current behavior of falsy state being passed to __setstate__ keeps working, CPython and other interpreters will match behavior, and classes without __setstate__ will have smaller pickles.
msg348309 - Author: Zackery Spytz (ZackerySpytz) * (Python triager) Date: 2019-07-22 23:27
Josh, would you consider creating a pull request for this issue?
History
Date User Action Args
2019-07-22 23:27:06 ZackerySpytz set nosy: + ZackerySpytz; messages: + msg348309
2016-04-05 15:01:46 josh.r create