
Author alexandre.vassalotti
Recipients Arfrever, alexandre.vassalotti, asvetlov, neologix, pitrou, rhettinger, serhiy.storchaka
Date 2013-05-03.17:37:22
Message-id <1367602642.59.0.966054663456.issue17810@psf.upfronthosting.co.za>
Content
I am currently fleshing out an improved implementation for the reduce protocol version 4. One thing I am curious about is whether we should keep the special cases we currently have there for dict and list subclasses.

I recall Raymond expressed disagreement in #msg83098 about this behavior. I agree that having __setitem__ called before __init__ makes it harder for dict and list subclasses to support pickling. To take advantage of the special case, subclasses need to do their required initialization in the __new__ method.
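To make the problem concrete, here is a minimal sketch that replays the default reduce value by hand, in the order described above (items first, state afterwards). BoundedDict is a hypothetical subclass invented for illustration; the manual replay only approximates what the unpickler does and is not the pickle implementation itself.

```python
class BoundedDict(dict):
    """Hypothetical dict subclass whose __setitem__ relies on instance state."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit

    def __setitem__(self, key, value):
        # Reads self.limit, which only exists once the instance state is set.
        if key not in self and len(self) >= self.limit:
            raise ValueError("BoundedDict is full")
        super().__setitem__(key, value)


original = BoundedDict(2)
original["a"] = 1

# Protocol 2 reduce value: (callable, args, state, listitems, dictitems).
func, args, state, listitems, dictitems = original.__reduce_ex__(2)

# Recreate the object the way NEWOBJ does: __new__ runs, __init__ does not.
clone = func(*args)

error = None
try:
    # Items are replayed before the state is restored, so self.limit is missing.
    for key, value in dictitems:
        clone[key] = value
except AttributeError as exc:
    error = exc

print(type(error).__name__, error)
```

The items never reach the clone because __setitem__ fails before the saved state (here, the limit attribute) has been applied.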

On the other hand, it does decrease the memory requirements for unpickling such subclasses: we can build the object in place instead of building an intermediary list or dict. Reading PEP 307 confirms that this was indeed the original intention.

One possible solution, other than removing the special case completely, is to make sure we initialize the object (using the BUILD opcode) before we call __setitem__ or append on it. This would be a simple change that would solve the initialization issue. However, I would still feel uneasy about the default object.__reduce__ behavior depending on the object's subtype.
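A rough sketch of that reordering, replayed by hand on a reduce value. The helper rebuild_state_first and the BoundedDict class are hypothetical names used only for illustration, and updating __dict__ directly is an approximation of what BUILD does when the class defines no __setstate__ and the state is a plain dict.

```python
class BoundedDict(dict):
    """Hypothetical dict subclass whose __setitem__ relies on instance state."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit

    def __setitem__(self, key, value):
        if key not in self and len(self) >= self.limit:
            raise ValueError("BoundedDict is full")
        super().__setitem__(key, value)


def rebuild_state_first(reduced):
    """Replay a 5-tuple reduce value, restoring state before adding items."""
    func, args, state, listitems, dictitems = reduced
    obj = func(*args)  # __new__ only, as with NEWOBJ
    if state is not None:
        # Approximation of BUILD for a plain dict state and no __setstate__.
        obj.__dict__.update(state)
    if listitems is not None:
        for item in listitems:
            obj.append(item)
    if dictitems is not None:
        for key, value in dictitems:
            obj[key] = value  # __setitem__ now sees the restored attributes
    return obj


original = BoundedDict(2)
original["a"] = 1
original["b"] = 2

rebuilt = rebuild_state_first(original.__reduce_ex__(2))
print(rebuilt, rebuilt.limit)
```

With the state applied first, the overridden __setitem__ finds its limit attribute and the items are added in place, without an intermediary dict.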

I think it could be worthwhile to investigate a generic API for pickling collections in place. For example, such an API would be helpful for pickling set subclasses in place.

__items__() or       Return an iterator of the items in the collection. Would be
__getitems__()       equivalent to iter(dict.items()) on dicts and iter(list) on
                     lists.

__additems__(items)  Add a batch of items to the collection. By default, it would
                     be defined as:

                         for item in items:
                             self.__additem__(item)

                     However, subclasses would be free to provide a more efficient
                     implementation of the method. Would be equivalent to
                     dict.update on dicts and list.extend on lists.

__additem__(item)    Add a single item to the collection. Would be equivalent to
                     dict[item[0]] = item[1] on dicts and list.append on lists.

The collections module's ABCs could then provide default implementations of this API, which would give their users efficient in-place pickling automatically.
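As a sketch of how the API above could look in practice, assuming the __getitems__ spelling: none of these methods exist in Python today, and the mixin, the two subclasses, and the fill_in_place helper are hypothetical names standing in for what an ABC and a pickler might do.

```python
class InPlaceItemsMixin:
    """Hypothetical mixin supplying the default __additems__ from the proposal."""

    def __additems__(self, items):
        # Default: fall back to adding one item at a time.
        for item in items:
            self.__additem__(item)


class ItemSet(InPlaceItemsMixin, set):
    """Set subclass using the default batch implementation."""

    def __getitems__(self):
        return iter(self)

    def __additem__(self, item):
        self.add(item)


class ItemDict(InPlaceItemsMixin, dict):
    """Dict subclass overriding __additems__ with a faster batch update."""

    def __getitems__(self):
        return iter(self.items())

    def __additem__(self, item):
        self[item[0]] = item[1]

    def __additems__(self, items):
        self.update(items)


def fill_in_place(source, target):
    """Stand-in for an unpickler: stream items from source straight into target."""
    target.__additems__(source.__getitems__())
    return target


set_copy = fill_in_place(ItemSet({1, 2, 3}), ItemSet())
dict_copy = fill_in_place(ItemDict(a=1, b=2), ItemDict())
print(set_copy, dict_copy)
```

The consumer only needs the two entry points and never builds an intermediary list, set, or dict, which is the memory benefit the special cases were meant to provide.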