classification
Title: Strange interaction between methods in subclass of C OrderedDict
Type: behavior Stage: resolved
Components: Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: duplicate
Dependencies: Superseder: KeyError thrown by optimised collections.OrderedDict.popitem()
View: 27275
Assigned To: Nosy List: eric.snow, josh.r, xiang.zhang, zach.ware
Priority: normal Keywords:

Created on 2016-09-08 03:28 by zach.ware, last changed 2016-10-26 00:40 by josh.r. This issue is now closed.

Files
File name Uploaded Description Edit
simple_lru.py zach.ware, 2016-09-08 03:28
Messages (3)
msg274959 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2016-09-08 03:28
I'm not certain that the implementation of this subclass of OrderedDict is actually sane, but it works in 3.4 and fails in 3.5+.

The buggy implementation:

class SimpleLRUCache(OrderedDict):

    def __init__(self, size):
        super().__init__()
        self.size = size

    def __getitem__(self, item):
        value = super().__getitem__(item)
        self.move_to_end(item)
        return value

    def __setitem__(self, key, value):
        while key not in self and len(self) >= self.size:
            self.popitem(last=False)
        super().__setitem__(key, value)
        self.move_to_end(key)


When trying to add a new item after `size` items are already in the cache, it will throw a KeyError with the key of the item in the oldest position.  Something like:

>>> s = SimpleLRUCache(2)
>>> s['t1'] = 1
>>> s['t2'] = 2
>>> s['t2'] # gives 2, properly moves 2 to the end
>>> s['t3'] = 3

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "simple_lru.py", line 14, in __setitem__
    self.popitem(last=False)
  File "simple_lru.py", line 9, in __getitem__
    self.move_to_end(item)
KeyError: 't3'


I can work around the failure by implementing __getitem__ as follows:

    def __getitem__(self, item):
        value = super().__getitem__(item)
        del self[item]
        self[item] = value
        return value

Attached is a script with a couple of tests that pass with 3.4 and fail with 3.5+.
msg274961 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-09-08 03:32
Please see issue27275. Someone has already complained about this. And I think the cause is the gap between Pure Python implementation and C implementation when it comes to subclasses.
msg279471 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-10-26 00:40
Note: This class doesn't actually work on 3.4 in other ways (because __getitem__ is not idempotent, while OrderedDict assumes it is):

>>> s = SimpleLRUCache(2)
>>> s['t1'] = 1
>>> s
SimpleLRUCache([('t1', 1)])
>>> s['t2'] = 2
>>> s
SimpleLRUCache([('t1', 1)])
>>> s
SimpleLRUCache([('t2', 2)])  # <-- No changes, repr different

If your __getitem__ isn't idempotent, you've broken a basic assumption built into the logic of the other methods you inherited, and you're going to need to override other methods to avoid misbehavior.
History
Date User Action Args
2016-10-26 00:40:20josh.rsetnosy: + josh.r
messages: + msg279471
2016-09-08 03:41:20zach.waresetstatus: open -> closed
superseder: KeyError thrown by optimised collections.OrderedDict.popitem()
resolution: duplicate
stage: resolved
2016-09-08 03:32:48xiang.zhangsetnosy: + xiang.zhang
messages: + msg274961
2016-09-08 03:28:02zach.warecreate