classification
Title: PEP 372: OrderedDict
Type: Stage: patch review
Components: Library (Lib) Versions: Python 3.1
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: aronacher, forest, georg.brandl, jimjjewett, matthieu.labbe, pitrou, rhettinger
Priority: normal Keywords: patch

Created on 2009-03-01 10:30 by rhettinger, last changed 2009-09-08 15:09 by matthieu.labbe. This issue is now closed.

Files
File name Uploaded Description Edit
od7.diff rhettinger, 2009-03-02 21:22 Updated __eq__()
popoldest.diff rhettinger, 2009-03-03 20:00 pop_oldest_item patch
Messages (25)
msg82955 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-01 10:30
Here is a working patch implementing PEP372 ordered dictionaries.
msg82959 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-01 12:48
Doc nits:

* "items are returned in the order they were first added": it should be
made clear that it matters when the *key* was first added
* "An *OrderedDict* remembers order that entries were inserted": misses
a word somewhere?
* "OrderDict" should be "OrderedDict"
* "compare their ordered :attr:`items` list": ... lists?

Implementation nits:

* "raise TypeError('expected at 1 argument, got %d', len(args))"
  should read "at most" and use "%" instead of ","
* "raise KeyError" in popitem(): should it get a message?
* eval()ing the repr() will not construct the dict in the same order
msg82962 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-01 14:57
In SubclassMappingTests, MyOrderedDict should subclass OrderedDict
rather than dict, shouldn't it?
msg82972 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2009-03-01 18:21
@Georg

> * eval()ing the repr() will not construct the dict in the same order
The alternative would be a list of dicts inside the constructor call,
but that feels ugly.  defaultdict from the same module is not evaluable
at all, so I guess it wouldn't be that much of a problem.
msg82973 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-01 18:30
Thanks for the review comments guys.  An updated patch is attached.
msg82974 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-01 18:32
I still see an "OrderDict" in the docs, and the TypeError in __init__
still needs to use "%" instead of ",".
msg82975 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-01 18:49
Fixed.
msg82976 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-01 19:08
Added a few more tests.
msg82996 - (view) Author: Jim Jewett (jimjjewett) Date: 2009-03-02 02:43
I would try to make it more explicit that updates do not reset the 
order, but deleting a item and re-inserting it *does*.  (So it isn't 
the first insertion of the key, it is the first that hasn't yet been 
followed by a deletion.)

Maybe change:

"""
When iterating over an ordered dictionary,
the items are returned in the order their keys were first added.
"""

to [removed "first", added longer explanation]

"""
When iterating over an ordered dictionary,
the items are returned in the order their keys were added.  (This 
implies that updating an item to a new value does not reset the order, 
but deleting the item and re-inserting it moves the item to the end.)
"""
msg82998 - (view) Author: Jim Jewett (jimjjewett) Date: 2009-03-02 03:45
I would also recommend strengthening some of the tests with regard to 
ordering stability across update vs delete-and-reinsert.

TestOrderedDict.test_update does verify that updated items are not 
moved to the end, but the comment suggests it is only checking that the 
previous contents had not been cleared entirely.

test_iterators, test_popitem, and test_pop check only single 
insertions; I think the tests would be stronger if you updated the dict 
with reversed(pairs) before iterating, so that it would be obvious the 
ordering was based on the original insertion, rather than the latest 
access.

test_setdefault should also verify that x is the final key, and that 
the a key hasn't changed position.

I didn't see any tests for insert a, insert b, delete a, re-insert a, 
verify that a is now later than b.
msg83005 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-02 06:44
Jim, I updated the docs to talk cover delete-reinsert.

Also, added a few more tests as suggested but am leaving test_iterators,
test_popitem, and test_pop unchanged.  It is enough to test
delete-reinsertion once or twice and know that we've separately tested
the component operations (setting new keys, setting existing keys, and
deleting existing keys).
msg83025 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-02 17:46
At Antoine's request, strengthened the tests in test_copying.
msg83038 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-02 21:22
Attaching update reflecting Guido's change to __eq__().
msg83039 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-02 21:30
Checked-in r70101 and r70102
msg83040 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2009-03-02 21:41
Maybe premature optimization but maybe it would make sense to implement
__eq__ like this:

def __eq__(self, other):
    if isinstance(other, OrderedDict):
        if not dict.__eq__(self, other):
            return False
        return all(p == q for p, q in _zip_longest(self.items(),
                                                   other.items()))
    return dict.__eq__(self, other)

For the most likely case (that dicts are different) this should give a
speedup.
msg83083 - (view) Author: Forest (forest) Date: 2009-03-03 19:23
I was just reading the PEP, and caught this bit:

"Does OrderedDict.popitem() return a particular key/value pair?
Yes. It pops-off the most recently inserted new key and its
corresponding value."

Okay, but I'd also like a convenient and fast way to find the oldest
key, which I think I'd need for an LRU cache.  I didn't notice one in
the patch.  Perhaps I simply missed something.  Shouldn't popitem()
allow the caller to choose which end from which to pop?
msg83084 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-03 19:26
I wonder if, instead of all kinds of new APIs, the _keys list could just
be made public (under a different name of course).

Of course, that makes further optimization or a rewrite in C harder.
msg83085 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-03 19:32
The internal data structure *must* remain private so that we can build a
C replacement or switch to one of the other possible algorithms.
msg83086 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-03 19:36
Forest, for your use case I recommend copying the code to a new class
and replacing the _keys list with a deque so that you can efficiently
pop from the other end.
msg83087 - (view) Author: Forest (forest) Date: 2009-03-03 19:39
> Shouldn't popitem() allow the caller to choose which end from
> which to pop?

Thinking it through a bit more, and LRU cache would actually need to
access the oldest item without necessarily removing it.  Besides,
popitem() should probably retain the signature of dict.popitem().  I
think I'll take this matter to python-dev.
msg83089 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-03 19:48
> The internal data structure *must* remain private so that we can build a
> C replacement or switch to one of the other possible algorithms.

Even then the keys list could be offered as a property.
msg83090 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-03-03 20:00
Forest, I've taken another look at what's involved and am inclined to
accept the idea.  It can be done without mucking-up the regular dict API
and without precluding any of the other possible underlying algorithms.
 I like that it supports an entire new categoy of use cases with only a
trivial, easily-understood API extension.  Patch is attached.
msg83091 - (view) Author: Armin Ronacher (aronacher) * (Python committer) Date: 2009-03-03 20:07
Please no.  We just decided to *not* extend the API.  The PEP originally
had a well designed list of dict API extensions that already provided
exactly that.  If we really want to provide access to that, we can roll
back to where we came from.

I don't think implementing an LRUCache on an ordered dict is a good
idea.  Especially because LRU caches are a shared resource and should be
thread safe because of that.  Which the ordered dict is not.
msg83095 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-03 20:13
OK, disregard my suggestions, it's better not to have a property that
does almost exactly the same as keys().
msg83096 - (view) Author: Forest (forest) Date: 2009-03-03 20:56
Agreed here.  Thanks, gents.
History
Date User Action Args
2009-09-08 15:09:33matthieu.labbesetnosy: + matthieu.labbe
2009-03-03 20:56:35forestsetmessages: + msg83096
2009-03-03 20:13:45georg.brandlsetmessages: + msg83095
2009-03-03 20:08:18aronachersetmessages: - msg83092
2009-03-03 20:08:06aronachersetmessages: + msg83092
2009-03-03 20:07:53aronachersetmessages: + msg83091
2009-03-03 20:06:57rhettingersetfiles: - od6.diff
2009-03-03 20:06:53rhettingersetfiles: - od5.diff
2009-03-03 20:06:49rhettingersetfiles: - od4.diff
2009-03-03 20:00:44rhettingersetfiles: + popoldest.diff
messages: + msg83090
2009-03-03 19:48:41georg.brandlsetmessages: + msg83089
2009-03-03 19:39:43forestsetmessages: + msg83087
2009-03-03 19:36:30rhettingersetmessages: + msg83086
2009-03-03 19:32:45rhettingersetmessages: + msg83085
2009-03-03 19:26:33georg.brandlsetmessages: + msg83084
2009-03-03 19:23:58forestsetnosy: + forest
messages: + msg83083
2009-03-02 21:41:10aronachersetmessages: + msg83040
2009-03-02 21:30:26rhettingersetstatus: open -> closed
resolution: accepted
messages: + msg83039
2009-03-02 21:22:54rhettingersetfiles: + od7.diff
messages: + msg83038
2009-03-02 17:46:22rhettingersetfiles: + od6.diff
assignee: rhettinger
messages: + msg83025
2009-03-02 06:44:17rhettingersetfiles: + od5.diff
messages: + msg83005
2009-03-02 03:45:12jimjjewettsetmessages: + msg82998
2009-03-02 02:43:39jimjjewettsetnosy: + jimjjewett
messages: + msg82996
2009-03-01 19:08:12rhettingersetfiles: - od3.diff
2009-03-01 19:08:04rhettingersetfiles: + od4.diff
messages: + msg82976
2009-03-01 18:49:17rhettingersetfiles: - od.diff
2009-03-01 18:49:10rhettingersetfiles: - od2.diff
2009-03-01 18:49:00rhettingersetfiles: + od3.diff
messages: + msg82975
2009-03-01 18:32:53georg.brandlsetmessages: + msg82974
2009-03-01 18:30:40rhettingersetfiles: + od2.diff
messages: + msg82973
2009-03-01 18:21:44aronachersetmessages: + msg82972
2009-03-01 14:57:01pitrousetnosy: + pitrou
messages: + msg82962
2009-03-01 12:48:26georg.brandlsetnosy: + aronacher, georg.brandl
messages: + msg82959
2009-03-01 10:30:24rhettingercreate