classification
Title: Proto 2 pickle vs dict subclass
Type:
Stage:
Components: Extension Modules
Versions: Python 2.4, Python 2.3, Python 2.5
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: akuchling, terry.reedy, tim.peters
Priority: normal
Keywords:

Created on 2003-10-20 14:28 by tim.peters, last changed 2008-12-05 20:35 by benjamin.peterson. This issue is now closed.

Messages (3)
msg60413 - Author: Tim Peters (tim.peters) (Python committer) Date: 2003-10-20 14:28
From c.l.py:

"""
From: Jimmy Retzlaff
Sent: Thursday, October 16, 2003 1:56 AM
To: python-list@python.org
Subject: Pickle dict subclass instances using new protocol in PEP 307


I have a subclass of dict that acts kind of like Windows' 
file systems - keys are case insensitive but case 
preserving (keys are assumed to be strings, or at least 
they have to support .lower()). It's worked well for quite 
a while - it used to inherit from UserDict and it has 
inherited from dict since that became possible.

I just tried to pickle an instance of this class for the first 
time using Python 2.3.2 on Windows. If I use protocols 0 
(text) or 1 (binary) everything works great. If I use 
protocol 2 (PEP 307) then I have a problem when loading 
my pickle. Here is a small sample to illustrate:

######

import pickle

class myDict(dict):
    def __init__(self, *args, **kwargs):
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __getstate__(self):
        print '__getstate__ returning', (self.copy(), self.x)
        return (self.copy(), self.x)

    def __setstate__(self, (d, x)):
        print '__setstate__'
        print '    object already in state:', self
        print '    x already in self:', 'x' in dir(self)
        self.x = x
        self.update(d)

    def __setitem__(self, key, value):
        print '__setitem__', (key, value), '- self.x exists:', hasattr(self, 'x')
        dict.__setitem__(self, key, value)


d = myDict()
d['key'] = 'value'

protocols = [(0, 'Text'), (1, 'Binary'), (2, 'PEP 307')]
for protocol, description in protocols:
    print '--------------------------------------'
    print 'Pickling with Protocol %s (%s)' % (protocol, description)
    pickle.dump(d, file('test.pickle', 'wb'), protocol)
    del d
    print 'Unpickling'
    d = pickle.load(file('test.pickle', 'rb'))

######

When run it prints:

__setitem__ ('key', 'value') - self.x exists: True
--------------------------------------
Pickling with Protocol 0 (Text)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
    object already in state: {'key': 'value'}
    x already in self: False
--------------------------------------
Pickling with Protocol 1 (Binary)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setstate__
    object already in state: {'key': 'value'}
    x already in self: False
--------------------------------------
Pickling with Protocol 2 (PEP 307)
__getstate__ returning ({'key': 'value'}, 1)
Unpickling
__setitem__ ('key', 'value') - self.x exists: False
__setstate__
    object already in state: {'key': 'value'}
    x already in self: False


The problem I'm having stems from the fact that the 
subclass' __setitem__ is called before __setstate__ 
when loading a protocol 2 pickle (the subclass' 
__setitem__ is not called at all with protocols 0 or 1). If 
I don't define __get/setstate__ then I have the same 
problem in that the subclass' __setitem__ is called 
before the subclass' instance variables are created by 
the pickle mechanism. I need to access one of those 
instance variables in my __setitem__.

I suppose my question is one of practicality. I'd like my 
class instances to work with all pickle protocols. Am I 
getting too fancy trying to inherit from dict? Should I go 
back to UserDict or maybe to DictMixin? Should I submit 
a bug report on this, or am I getting too close to 
internals to expect a certain behavior across pickle 
protocols?
"""
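The ordering described in the report can be checked with a short script. What follows is a minimal sketch in Python 3 syntax (the report itself is Python 2.3), and it assumes the CPython pickle implementation. `RecordingDict`, `events`, and `unpickle_events` are hypothetical names used only for illustration; they do not come from the original post.

```python
import pickle


class RecordingDict(dict):
    """Hypothetical dict subclass that logs hook calls made during unpickling."""

    events = []  # class-level log, shared by all instances (illustration only)

    def __init__(self, *args, **kwargs):
        self.x = 1
        dict.__init__(self, *args, **kwargs)

    def __setitem__(self, key, value):
        # Record whether the instance attribute exists when the hook runs.
        RecordingDict.events.append(("setitem", hasattr(self, "x")))
        dict.__setitem__(self, key, value)

    def __setstate__(self, state):
        RecordingDict.events.append(("setstate", hasattr(self, "x")))
        self.__dict__.update(state)


def unpickle_events(protocol):
    """Round-trip one instance and return it with the hooks seen while loading."""
    d = RecordingDict()
    d["key"] = "value"
    data = pickle.dumps(d, protocol)
    RecordingDict.events.clear()  # drop events logged before loading
    restored = pickle.loads(data)
    return restored, list(RecordingDict.events)


if __name__ == "__main__":
    for protocol in (0, 1, 2):
        restored, events = unpickle_events(protocol)
        print(protocol, events, restored, restored.x)
```

Under those assumptions, protocol 2 should log a `setitem` entry (with the attribute missing) before `setstate`, while protocols 0 and 1 should log only `setstate`, which matches the behavior reported above.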
msg60414 - Author: A.M. Kuchling (akuchling) (Python committer) Date: 2004-06-05 19:45
Bug #964868 is a duplicate of this one.
msg77047 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2008-12-05 17:55
James Stroud ran into this same issue with 2.5.  Here is his 'ugly fix'
for working with protocol 2 only.

class DictPlus(dict):
  def __init__(self, *args, **kwargs):
    self.extra_thing = ExtraThingClass()
    dict.__init__(self, *args, **kwargs)
  def __setitem__(self, k, v):
    try:
      do_something_with(self.extra_thing, k, v)
    except AttributeError:
      self.extra_thing = ExtraThingClass()
      do_something_with(self.extra_thing, k, v)
    dict.__setitem__(self, k, v)
  def __setstate__(self, adict):
    pass

Can this be closed as "won't fix", since there seems to be nothing to fix?
This issue of working with all protocols would seem dead by now, and for
protocol 2, it is a 'gotcha' that can be avoided with knowledge.
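One way to avoid the gotcha without the try/except is to create the attribute in `__new__` rather than `__init__`, since protocol 2 unpickling calls the class's `__new__` but not its `__init__`. The sketch below is an assumption-laden illustration in Python 3 syntax, not a fix proposed in this thread; `CountingDict`, `writes`, and `roundtrip` are hypothetical names.

```python
import pickle


class CountingDict(dict):
    """Hypothetical dict subclass whose __setitem__ needs an instance attribute."""

    def __new__(cls, *args, **kwargs):
        # Runs for normal construction and for protocol 2+ unpickling,
        # so the attribute exists before any __setitem__ call.
        self = super().__new__(cls, *args, **kwargs)
        self.writes = 0
        return self

    def __setitem__(self, key, value):
        self.writes += 1
        super().__setitem__(key, value)


def roundtrip(protocol):
    """Pickle and unpickle one instance with the given protocol."""
    d = CountingDict()
    d["key"] = "value"
    return pickle.loads(pickle.dumps(d, protocol))


if __name__ == "__main__":
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
        restored = roundtrip(protocol)
        print(protocol, restored, restored.writes)
```

With protocols 0 and 1 the items are restored without going through the subclass `__setitem__`, and the saved instance `__dict__` supplies `writes`; with protocol 2 and later, `__setitem__` runs against the attribute created in `__new__`, and the saved state is applied afterwards.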
History
Date                 User               Action  Args
2008-12-05 20:35:56  benjamin.peterson  set     status: open -> closed; resolution: wont fix
2008-12-05 17:55:27  terry.reedy        set     nosy: + terry.reedy; messages: + msg77047; versions: + Python 2.5, Python 2.4
2003-10-20 14:28:10  tim.peters         create