This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: unpickling vs. __getattr__
Type: behavior Stage:
Components: Documentation, Library (Lib) Versions: Python 3.0, Python 3.1, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: georg.brandl Nosy List: georg.brandl, ggenellina, mwm
Priority: normal Keywords: patch

Created on 2009-02-25 21:26 by mwm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pickletest.py mwm, 2009-02-25 21:26 Simple python 2.5/2.6/3.0 example showing problem.
pp mwm, 2009-03-10 05:58
pickle.diff ggenellina, 2009-03-14 01:50 Doc update
Messages (11)
msg82721 - (view) Author: Mike Meyer (mwm) Date: 2009-02-25 21:26
The attached short file causes all of python 2.5, 2.6 and 3.0 to drop
into infinite recursion trying to unpickle the created object, but the
2.5 version has the cleanest The problem appears to be that the unpickle
code (C in 3.0; Python in 2.5 - I assume the others are similar) uses
getattr to try and get the __setstate__ method (if any) of the opject
being built before it's dictionary is populated. This causes the
invocation of the __getattr__ method, which tries to use the
non-existent "attrs" attribute, thus starting the recursion.

A simple workaround is to provide this:

  def __setstate__(self, data):
      self.__dict__.update(data)

but methinks that either 1) the interpreter shouldn't be falling into
the infinite loop in this case, or 2) the pickle code should be using
hasattr to check for the attribute (which won't trigger this problem)
before trying getattr on it.
msg82889 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-02-28 05:55
The __getattr__ method, as written, assumes that an attribute 'args' 
already exists. That's unsafe - a more robust approach would be:

    def __getattr__(self, name):
        if 'attrs' in self.__dict__ and name in self.attrs:
            return self.attrs[name]
        raise AttributeError(name)

The suggestion of using hasattr in pickle code doesn't work, because 
hasattr is implemented using getattr.

I could not reproduce it with 2.6, nor the trunk -- there is another 
recursion problem involving __subclasscheck__ but unpickling is 
unaffected. 3.0 does show this problem. FWIW, 2.2 did not.
msg82941 - (view) Author: Mike Meyer (mwm) Date: 2009-02-28 19:31
The args attribute gets created by __init__ and nothing in the class
removes it. I don't think it's unreasonable for the class to expect the
attribute to not vanish on it. Possibly it should be spelled __args (or
declared private :-), but neither of those would make a difference in
this case.

Any feature access can be made more robust by checking for it in
__dict__ first, but such a practice is neither practical nor pragmatic.
It may be different for __getargs__, in which case the bug is in the
documentation for __getargs__, which should mention this issue.
msg82964 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-03-01 15:09
Perhaps this should be made more clear in the 
documentation for the pickle module. Probably here:

http://docs.python.org/library/pickle.html#the-pickle-
protocol

Could you come with some enhancements?

(Note that it already states that __init__ is not 
called when unpickling)
msg83413 - (view) Author: Mike Meyer (mwm) Date: 2009-03-10 05:32
I don't believe in documenting bugs instead of fixing them. If this bug
is going to stay in the code, I can either fix my install of Python to
have non-broken Pickle modules, or I can fix every third-party libraries
objects I use whose authors didn't worry about pickling them enough to
find this. The fornmer is almost certainly easier, given a little time.
Probably even easier than agreeing on proper document wording.

The __init__ case is different - there isn't a common use case (this one
comes from the standard library) for __init__ that breaks pickling, and
the problems tend to manifest in the resulting pickles, not in the
unpickling process.
msg83415 - (view) Author: Mike Meyer (mwm) Date: 2009-03-10 05:58
QAD patch for 2.6 pickle.py to fix this bug. Passes the 2.6 pickle unit
tests.
msg83542 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-03-13 23:11
Are you sure you uploaded the right patch? I've not tested it, but I 
don't think this actually fixes the reported bug.

__setstate__ is *very* unlikely to be found in the instance's dict, so 
you end up not calling __setstate__ at all.

If no test failed, this may indicate bad test coverage.
msg83544 - (view) Author: Mike Meyer (mwm) Date: 2009-03-13 23:30
True. But hat's why it was a QAD hack - all I did was make sure it didn't blow up on the 
test code, and that it passed the provided unit tests.

It really needs to walk the class tree. So something like  hasattr(inst.__class__, 
'__setstate__') is more likely to be correct.

And yes, the pickle unit tests apparently don't test the __*state__ routines, but that's a 
different bug.
msg83550 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-03-14 00:28
I think that trying to emulate all "getattr" details in the middle of 
unpickling (but omiting calls to __getattr__) isn't the right thing to 
do. What if in some cases __getattr__ (or __getattribute__) *does* the 
right thing, but pickle doesn't call it? Other people would be 
rightfully upset.

There should be a warning note about __getattr__ in the pickle 
documentation, and people should be encouraged to write __getattr__ in 
a more robust way, if class instances are meant to be pickled.
msg83569 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-03-14 01:50
This doc update should warn people about the need to implement 
__getinitargs__ / __getnewargs__ when the instance relies on some 
internal invariants that must be preserved, and reiterates the 
important fact that __init__ / __new__ are NOT normally invoked.

The patch is against trunk; wording for 3.x should omit references to 
__getinitargs__ and __init__.
msg85503 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-04-05 14:40
Updated docs in r71240.
History
Date User Action Args
2022-04-11 14:56:46adminsetgithub: 49620
2009-04-05 14:40:15georg.brandlsetstatus: open -> closed
resolution: fixed
messages: + msg85503
2009-03-14 01:50:43ggenellinasetfiles: + pickle.diff

assignee: georg.brandl
components: + Documentation, Library (Lib)
versions: + Python 3.1, Python 2.7, - Python 2.5
keywords: + patch
nosy: + georg.brandl

messages: + msg83569
2009-03-14 00:28:06ggenellinasetmessages: + msg83550
2009-03-13 23:30:34mwmsetmessages: + msg83544
2009-03-13 23:11:01ggenellinasetmessages: + msg83542
2009-03-10 05:58:29mwmsetfiles: + pp

messages: + msg83415
2009-03-10 05:32:56mwmsetmessages: + msg83413
2009-03-01 15:09:59ggenellinasetmessages: + msg82964
2009-02-28 19:31:28mwmsetmessages: + msg82941
2009-02-28 05:55:27ggenellinasetnosy: + ggenellina
messages: + msg82889
2009-02-25 21:26:14mwmcreate