classification
Title: OrderedDict.__reduce__ not threadsafe
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: amaury.forgeotdarc, daniel.urban, mjuric, python-dev, rhettinger
Priority: normal Keywords:

Created on 2011-04-19 08:28 by mjuric, last changed 2011-04-20 06:39 by rhettinger. This issue is now closed.

Messages (10)
msg134023 - (view) Author: Mario Juric (mjuric) Date: 2011-04-19 08:28
The implementation of OrderedDict.__reduce__() in Python 2.7.1 is not thread safe because of the following four lines:

        tmp = self.__map, self.__root
        del self.__map, self.__root
        inst_dict = vars(self).copy()
        self.__map, self.__root = tmp

If one thread is pickling an OrderedDict, while another accesses it, a race condition occurs if the accessing thread accesses the dict after self.__map and self.__root have been delated, and before they've been set again (above).

This leads to an extremely difficult bug to diagnose when using multiprocessing.Queue to exchange OrderedDicts between multiple processes (because Queue uses a separate feeder thread to do the pickling).

The fix seems relatively easy -- use:

        inst_dict = vars(self).copy()
        del inst_dict['_OrderedDict__map'], inst_dict['_OrderedDict__root']

instead of the above four lines.

PS: This issue+fix may also apply to Python 3.x, although I haven't tested it there.
msg134027 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-19 09:34
To avoid hardcoding the mangled names:

@@ -155,10 +155,9 @@
     def __reduce__(self):
         'Return state information for pickling'
         items = [[k, self[k]] for k in self]
-        tmp = self.__map, self.__root, self.__hardroot
-        del self.__map, self.__root, self.__hardroot
         inst_dict = vars(self).copy()
-        self.__map, self.__root, self.__hardroot = tmp
+        for k in vars(OrderedDict()):
+            inst_dict.pop(k, None)
         if inst_dict:
             return (self.__class__, (items,), inst_dict)
         return self.__class__, (items,)
msg134079 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-19 16:54
New changeset 3150b6939731 by Raymond Hettinger in branch '2.7':
Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object.
http://hg.python.org/cpython/rev/3150b6939731
msg134085 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-19 18:15
New changeset 50a89739836f by Raymond Hettinger in branch '3.1':
Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object.
http://hg.python.org/cpython/rev/50a89739836f

New changeset a7ac7a7c8c78 by Raymond Hettinger in branch '3.2':
Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object.
http://hg.python.org/cpython/rev/a7ac7a7c8c78

New changeset 928f17923b0d by Raymond Hettinger in branch 'default':
Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object.
http://hg.python.org/cpython/rev/928f17923b0d

New changeset 937385024601 by Raymond Hettinger in branch '3.2':
Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object.
http://hg.python.org/cpython/rev/937385024601
msg134089 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-19 19:08
Mario, I removed the unnecessary mutation, but remember that OrderedDict's are not thread-safe in general.  Anything that updates an OrderedDict will temporarily leave it in an inconsistent state while the links and mappings are being updated.  

That being said, it isn't hard to create a subclass with a lock around each of the mutating methods: setitem, delitem, clear, popitem, pop, move_to_end, setdefault, init, and update.
msg134096 - (view) Author: Mario Juric (mjuric) Date: 2011-04-19 20:26
Hi Raymond,

Excellent! Many thanks for such a quick fix, this has been bugging us for months!
msg134099 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-04-19 20:58
The call to self.__class__() can break subclasses of OrderedDict for two reasons:
- The subclass constructor may have a different signature
- Attributes set by the subclass.__init__ are removed from the pickle::

import collections, pickle

class Mydict(collections.OrderedDict):
    def __init__(self, *args, name=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.name = name

a = Mydict(name='foo')
b = pickle.loads(pickle.dumps(a))
print(a.name, b.name)

Previously, the 'name' would be correctly copied, now it is reset to None.
msg134109 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-20 00:01
New changeset db66eaf353a6 by Raymond Hettinger in branch '2.7':
Issue #11875: Alter the previous fix to work better with subclasses
http://hg.python.org/cpython/rev/db66eaf353a6
msg134110 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-20 00:19
New changeset 408f23b6cec5 by Raymond Hettinger in branch '3.1':
Issue #11875: Alter the previous fix to work better with subclasses
http://hg.python.org/cpython/rev/408f23b6cec5

New changeset 4e6840477d96 by Raymond Hettinger in branch '3.2':
Issue #11875: Alter the previous fix to work better with subclasses
http://hg.python.org/cpython/rev/4e6840477d96

New changeset 64968d226b61 by Raymond Hettinger in branch 'default':
Issue #11875: Alter the previous fix to work better with subclasses
http://hg.python.org/cpython/rev/64968d226b61
msg134125 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-20 06:39
Mario, thanks for the bug report.
History
Date User Action Args
2011-04-20 06:39:51rhettingersetstatus: open -> closed

messages: + msg134125
2011-04-20 00:19:26python-devsetmessages: + msg134110
2011-04-20 00:01:12python-devsetmessages: + msg134109
2011-04-19 21:08:45daniel.urbansetnosy: + daniel.urban
2011-04-19 20:58:09amaury.forgeotdarcsetstatus: closed -> open
nosy: + amaury.forgeotdarc
messages: + msg134099

2011-04-19 20:26:37mjuricsetmessages: + msg134096
2011-04-19 19:08:53rhettingersetstatus: open -> closed
resolution: fixed
messages: + msg134089
2011-04-19 18:15:34python-devsetmessages: + msg134085
2011-04-19 16:55:00python-devsetnosy: + python-dev
messages: + msg134079
2011-04-19 09:34:04rhettingersetmessages: + msg134027
2011-04-19 08:46:24rhettingersetassignee: rhettinger

nosy: + rhettinger
2011-04-19 08:28:24mjuriccreate