msg82955 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-01 10:30 |
Here is a working patch implementing PEP372 ordered dictionaries.
|
msg82959 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-03-01 12:48 |
Doc nits:
* "items are returned in the order they were first added": it should be
made clear that it matters when the *key* was first added
* "An *OrderedDict* remembers order that entries were inserted": misses
a word somewhere?
* "OrderDict" should be "OrderedDict"
* "compare their ordered :attr:`items` list": ... lists?
Implementation nits:
* "raise TypeError('expected at 1 argument, got %d', len(args))"
should read "at most" and use "%" instead of ","
* "raise KeyError" in popitem(): should it get a message?
* eval()ing the repr() will not construct the dict in the same order
|
msg82962 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-03-01 14:57 |
In SubclassMappingTests, MyOrderedDict should subclass OrderedDict
rather than dict, shouldn't it?
|
msg82972 - (view) |
Author: Armin Ronacher (aronacher) *  |
Date: 2009-03-01 18:21 |
@Georg
> * eval()ing the repr() will not construct the dict in the same order
The alternative would be a list of dicts inside the constructor call,
but that feels ugly. defaultdict from the same module is not evaluable
at all, so I guess it wouldn't be that much of a problem.
|
msg82973 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-01 18:30 |
Thanks for the review comments guys. An updated patch is attached.
|
msg82974 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-03-01 18:32 |
I still see an "OrderDict" in the docs, and the TypeError in __init__
still needs to use "%" instead of ",".
|
msg82975 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-01 18:49 |
Fixed.
|
msg82976 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-01 19:08 |
Added a few more tests.
|
msg82996 - (view) |
Author: Jim Jewett (jimjjewett) |
Date: 2009-03-02 02:43 |
I would try to make it more explicit that updates do not reset the
order, but deleting a item and re-inserting it *does*. (So it isn't
the first insertion of the key, it is the first that hasn't yet been
followed by a deletion.)
Maybe change:
"""
When iterating over an ordered dictionary,
the items are returned in the order their keys were first added.
"""
to [removed "first", added longer explanation]
"""
When iterating over an ordered dictionary,
the items are returned in the order their keys were added. (This
implies that updating an item to a new value does not reset the order,
but deleting the item and re-inserting it moves the item to the end.)
"""
|
msg82998 - (view) |
Author: Jim Jewett (jimjjewett) |
Date: 2009-03-02 03:45 |
I would also recommend strengthening some of the tests with regard to
ordering stability across update vs delete-and-reinsert.
TestOrderedDict.test_update does verify that updated items are not
moved to the end, but the comment suggests it is only checking that the
previous contents had not been cleared entirely.
test_iterators, test_popitem, and test_pop check only single
insertions; I think the tests would be stronger if you updated the dict
with reversed(pairs) before iterating, so that it would be obvious the
ordering was based on the original insertion, rather than the latest
access.
test_setdefault should also verify that x is the final key, and that
the a key hasn't changed position.
I didn't see any tests for insert a, insert b, delete a, re-insert a,
verify that a is now later than b.
|
msg83005 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-02 06:44 |
Jim, I updated the docs to talk cover delete-reinsert.
Also, added a few more tests as suggested but am leaving test_iterators,
test_popitem, and test_pop unchanged. It is enough to test
delete-reinsertion once or twice and know that we've separately tested
the component operations (setting new keys, setting existing keys, and
deleting existing keys).
|
msg83025 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-02 17:46 |
At Antoine's request, strengthened the tests in test_copying.
|
msg83038 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-02 21:22 |
Attaching update reflecting Guido's change to __eq__().
|
msg83039 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-02 21:30 |
Checked-in r70101 and r70102
|
msg83040 - (view) |
Author: Armin Ronacher (aronacher) *  |
Date: 2009-03-02 21:41 |
Maybe premature optimization but maybe it would make sense to implement
__eq__ like this:
def __eq__(self, other):
if isinstance(other, OrderedDict):
if not dict.__eq__(self, other):
return False
return all(p == q for p, q in _zip_longest(self.items(),
other.items()))
return dict.__eq__(self, other)
For the most likely case (that dicts are different) this should give a
speedup.
|
msg83083 - (view) |
Author: Forest (forest) |
Date: 2009-03-03 19:23 |
I was just reading the PEP, and caught this bit:
"Does OrderedDict.popitem() return a particular key/value pair?
Yes. It pops-off the most recently inserted new key and its
corresponding value."
Okay, but I'd also like a convenient and fast way to find the oldest
key, which I think I'd need for an LRU cache. I didn't notice one in
the patch. Perhaps I simply missed something. Shouldn't popitem()
allow the caller to choose which end from which to pop?
|
msg83084 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-03-03 19:26 |
I wonder if, instead of all kinds of new APIs, the _keys list could just
be made public (under a different name of course).
Of course, that makes further optimization or a rewrite in C harder.
|
msg83085 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-03 19:32 |
The internal data structure *must* remain private so that we can build a
C replacement or switch to one of the other possible algorithms.
|
msg83086 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-03 19:36 |
Forest, for your use case I recommend copying the code to a new class
and replacing the _keys list with a deque so that you can efficiently
pop from the other end.
|
msg83087 - (view) |
Author: Forest (forest) |
Date: 2009-03-03 19:39 |
> Shouldn't popitem() allow the caller to choose which end from
> which to pop?
Thinking it through a bit more, and LRU cache would actually need to
access the oldest item without necessarily removing it. Besides,
popitem() should probably retain the signature of dict.popitem(). I
think I'll take this matter to python-dev.
|
msg83089 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-03-03 19:48 |
> The internal data structure *must* remain private so that we can build a
> C replacement or switch to one of the other possible algorithms.
Even then the keys list could be offered as a property.
|
msg83090 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-03-03 20:00 |
Forest, I've taken another look at what's involved and am inclined to
accept the idea. It can be done without mucking-up the regular dict API
and without precluding any of the other possible underlying algorithms.
I like that it supports an entire new categoy of use cases with only a
trivial, easily-understood API extension. Patch is attached.
|
msg83091 - (view) |
Author: Armin Ronacher (aronacher) *  |
Date: 2009-03-03 20:07 |
Please no. We just decided to *not* extend the API. The PEP originally
had a well designed list of dict API extensions that already provided
exactly that. If we really want to provide access to that, we can roll
back to where we came from.
I don't think implementing an LRUCache on an ordered dict is a good
idea. Especially because LRU caches are a shared resource and should be
thread safe because of that. Which the ordered dict is not.
|
msg83095 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2009-03-03 20:13 |
OK, disregard my suggestions, it's better not to have a property that
does almost exactly the same as keys().
|
msg83096 - (view) |
Author: Forest (forest) |
Date: 2009-03-03 20:56 |
Agreed here. Thanks, gents.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:46 | admin | set | github: 49647 |
2009-09-08 15:09:33 | matthieu.labbe | set | nosy:
+ matthieu.labbe
|
2009-03-03 20:56:35 | forest | set | messages:
+ msg83096 |
2009-03-03 20:13:45 | georg.brandl | set | messages:
+ msg83095 |
2009-03-03 20:08:18 | aronacher | set | messages:
- msg83092 |
2009-03-03 20:08:06 | aronacher | set | messages:
+ msg83092 |
2009-03-03 20:07:53 | aronacher | set | messages:
+ msg83091 |
2009-03-03 20:06:57 | rhettinger | set | files:
- od6.diff |
2009-03-03 20:06:53 | rhettinger | set | files:
- od5.diff |
2009-03-03 20:06:49 | rhettinger | set | files:
- od4.diff |
2009-03-03 20:00:44 | rhettinger | set | files:
+ popoldest.diff messages:
+ msg83090 |
2009-03-03 19:48:41 | georg.brandl | set | messages:
+ msg83089 |
2009-03-03 19:39:43 | forest | set | messages:
+ msg83087 |
2009-03-03 19:36:30 | rhettinger | set | messages:
+ msg83086 |
2009-03-03 19:32:45 | rhettinger | set | messages:
+ msg83085 |
2009-03-03 19:26:33 | georg.brandl | set | messages:
+ msg83084 |
2009-03-03 19:23:58 | forest | set | nosy:
+ forest messages:
+ msg83083 |
2009-03-02 21:41:10 | aronacher | set | messages:
+ msg83040 |
2009-03-02 21:30:26 | rhettinger | set | status: open -> closed resolution: accepted messages:
+ msg83039 |
2009-03-02 21:22:54 | rhettinger | set | files:
+ od7.diff messages:
+ msg83038 |
2009-03-02 17:46:22 | rhettinger | set | files:
+ od6.diff assignee: rhettinger messages:
+ msg83025 |
2009-03-02 06:44:17 | rhettinger | set | files:
+ od5.diff messages:
+ msg83005 |
2009-03-02 03:45:12 | jimjjewett | set | messages:
+ msg82998 |
2009-03-02 02:43:39 | jimjjewett | set | nosy:
+ jimjjewett messages:
+ msg82996 |
2009-03-01 19:08:12 | rhettinger | set | files:
- od3.diff |
2009-03-01 19:08:04 | rhettinger | set | files:
+ od4.diff messages:
+ msg82976 |
2009-03-01 18:49:17 | rhettinger | set | files:
- od.diff |
2009-03-01 18:49:10 | rhettinger | set | files:
- od2.diff |
2009-03-01 18:49:00 | rhettinger | set | files:
+ od3.diff messages:
+ msg82975 |
2009-03-01 18:32:53 | georg.brandl | set | messages:
+ msg82974 |
2009-03-01 18:30:40 | rhettinger | set | files:
+ od2.diff messages:
+ msg82973 |
2009-03-01 18:21:44 | aronacher | set | messages:
+ msg82972 |
2009-03-01 14:57:01 | pitrou | set | nosy:
+ pitrou messages:
+ msg82962 |
2009-03-01 12:48:26 | georg.brandl | set | nosy:
+ aronacher, georg.brandl messages:
+ msg82959 |
2009-03-01 10:30:24 | rhettinger | create | |