This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Error copying an instance of a subclass of OrderedDict
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: erezinman, eric.snow, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2020-09-09 10:09 by erezinman, last changed 2022-04-11 14:59 by admin.

Messages (10)
msg376627 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 10:09
This bug occurs when copying/pickling an ordered-dict subtype that has default items. The initialization function that's returned is **not** `object.__new__` so the default items are set when the copied/pickled item is created. The problem I encountered is that when deleting an initial item, it appears in the copy. See the MWE below:

```
from collections import OrderedDict
import copy


class A(OrderedDict):
    def __init__(self):
        self['123'] = 123

a = A()
del a['123']
copy.copy(a)


# --> A([('123', 123)])

```


This can cause other problems as well, because you don't assume that the class is re-initialized on deserialization/copy.
msg376628 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 10:21
Proposed solution - change OrderedDict's `__reduce_ex__ ` function. The code below works like expected if I override `__reduce_ex__` like so:

```

    def __reduce_ex__(self, protocol):
        ret = list(super().__reduce_ex__(protocol))
        ret[0] = type(self).__new__
        ret[1] = type(self), 
        return tuple(ret, )


```
msg376630 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-09-09 11:21
It would break the code which expect that __init__() be called (I do not know if there are much such code, but still). It also would not work with the original pure Python implementation of OrderedDict, which initializes the OrderedDict object in __init__(), not in __new__().
msg376631 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-09-09 11:24
So we are restricted by backward compatibility with the original pure Python implementation.

Sometimes it is feasible to break backward compatibility, but there should be very good reasons for it.
msg376636 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 14:04
Looking at the implementation of `__init__` (https://github.com/python/cpython/blob/76553e5d2eae3e8a47406a6de4f354fe33ff370b/Lib/collections/__init__.py#L109), it seems that you can fix this bug while retaining backward compatibility if you would use `__new__` in the `__reduce__` function, followed by an `OrderedDict.__init__`. The difference is that you don't call `__class__.__init__`, but rather `OrderedDict.__init__`. Some thing like the following:


    def __reduce__(self):
        'Return state information for pickling'
        inst_dict = vars(self).copy()
        for k in vars(OrderedDict()):
            inst_dict.pop(k, None)
        
        def initializer():
            inst = __class__.__new__(__class__)
            OrderedDict.__init__(inst)

        return initializer, (), inst_dict or None, None, iter(self.items())


The items will "restored" later using the `odict_iter`.
msg376637 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 14:06
small correction: meant `self.__class__` in the above function.

Also, it is based on https://github.com/python/cpython/blob/76553e5d2eae3e8a47406a6de4f354fe33ff370b/Lib/collections/__init__.py#L302
msg376644 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-09-09 15:29
First, closures cannot be pickled. You have to pass self.__class__ as an argument.

Second, it will break compatibility with older Python versions. Pickles created in new Python could not be unpickled in older Pythons.

Third, it is the behavior which does not have precedents. Unpickling usually calls either just __new__() or both __new__() and __init__() of the same class.
msg376647 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 16:46
The first argument can be remedied by creating a new method that deals with that (something like `def _new_init(cls)`, and passing the `cls` as argument).

The third argument is not really an argument - there is a bug that needs a precedent to be solved.

Concerning the second argument, I think it is only problematic when pickling in a new version of python and unpickling in an old one, and not vice versa (because the reduction from previous versions calls __init__). In the old->new direction, there wouldn't be much of a problem because the worst thing that happens, is that this bug could occur.
If that doesn't satisfy you, I hope that this change can be made in the future, when a version that allows such a lack of backward-compatibility is released.
msg376648 - (view) Author: Erez Zinman (erezinman) Date: 2020-09-09 16:57
Just to be clear, what I did is (works with both "copy" and "pickle"):


```

def _new_and_init(cls):
    inst = cls.__new__(cls)
    OrderedDict.__init__(inst)
    return inst


class A(OrderedDict):

    def __reduce__(self):
        # This fixes a bug in Python 3.6
        ret = list(super().__reduce__())

        ret[0] = _new_and_init
        ret[1] = (self.__class__, )

        return tuple(ret)

```

And that works... I also verified that indeed old->new pickling will just call _init__ on unpickling, so this would just retain the previous behavior.
msg376650 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-09-09 17:41
I do not think there is a bug. If we ever decide to change the behavior, the changes can only go in future Python version.

And if we decide to make changes, I think that it would be better to make the pure Python implementation using __new__() instead of __init__() and make __reduce__() using copyreg.__newobj__.
History
Date User Action Args
2022-04-11 14:59:35adminsetgithub: 85917
2020-09-09 17:41:08serhiy.storchakasettype: behavior -> enhancement
messages: + msg376650
versions: + Python 3.10, - Python 3.6
2020-09-09 17:20:04rhettingersetnosy: + eric.snow
2020-09-09 16:57:08erezinmansetmessages: + msg376648
2020-09-09 16:46:21erezinmansetmessages: + msg376647
2020-09-09 15:29:07serhiy.storchakasetmessages: + msg376644
2020-09-09 14:06:40erezinmansetmessages: + msg376637
2020-09-09 14:04:59erezinmansetmessages: + msg376636
2020-09-09 11:24:39serhiy.storchakasetmessages: + msg376631
2020-09-09 11:21:47serhiy.storchakasetnosy: + rhettinger, serhiy.storchaka
messages: + msg376630
2020-09-09 10:21:44erezinmansetmessages: + msg376628
2020-09-09 10:09:46erezinmancreate