Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 372: OrderedDict #49647

Closed
rhettinger opened this issue Mar 1, 2009 · 25 comments
Closed

PEP 372: OrderedDict #49647

rhettinger opened this issue Mar 1, 2009 · 25 comments
Assignees
Labels
stdlib Python modules in the Lib dir

Comments

@rhettinger
Copy link
Contributor

BPO 5397
Nosy @birkenfeld, @rhettinger, @pitrou, @mitsuhiko
Files
  • od7.diff: Updated eq()
  • popoldest.diff: pop_oldest_item patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2009-03-02.21:30:26.594>
    created_at = <Date 2009-03-01.10:30:24.125>
    labels = ['library']
    title = 'PEP 372:  OrderedDict'
    updated_at = <Date 2009-09-08.15:09:33.649>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2009-09-08.15:09:33.649>
    actor = 'matthieu.labbe'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2009-03-02.21:30:26.594>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2009-03-01.10:30:24.125>
    creator = 'rhettinger'
    dependencies = []
    files = ['13231', '13235']
    hgrepos = []
    issue_num = 5397
    keywords = ['patch']
    message_count = 25.0
    messages = ['82955', '82959', '82962', '82972', '82973', '82974', '82975', '82976', '82996', '82998', '83005', '83025', '83038', '83039', '83040', '83083', '83084', '83085', '83086', '83087', '83089', '83090', '83091', '83095', '83096']
    nosy_count = 7.0
    nosy_names = ['georg.brandl', 'rhettinger', 'jimjjewett', 'pitrou', 'forest', 'aronacher', 'matthieu.labbe']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue5397'
    versions = ['Python 3.1']

    @rhettinger
    Copy link
    Contributor Author

    Here is a working patch implementing PEP-372 ordered dictionaries.

    @rhettinger rhettinger added the stdlib Python modules in the Lib dir label Mar 1, 2009
    @birkenfeld
    Copy link
    Member

    Doc nits:

    • "items are returned in the order they were first added": it should be
      made clear that it matters when the *key* was first added
    • "An *OrderedDict* remembers order that entries were inserted": misses
      a word somewhere?
    • "OrderDict" should be "OrderedDict"
    • "compare their ordered :attr:`items` list": ... lists?

    Implementation nits:

    • "raise TypeError('expected at 1 argument, got %d', len(args))"
      should read "at most" and use "%" instead of ","
    • "raise KeyError" in popitem(): should it get a message?
    • eval()ing the repr() will not construct the dict in the same order

    @pitrou
    Copy link
    Member

    pitrou commented Mar 1, 2009

    In SubclassMappingTests, MyOrderedDict should subclass OrderedDict
    rather than dict, shouldn't it?

    @mitsuhiko
    Copy link
    Member

    @georg

    • eval()ing the repr() will not construct the dict in the same order
      The alternative would be a list of dicts inside the constructor call,
      but that feels ugly. defaultdict from the same module is not evaluable
      at all, so I guess it wouldn't be that much of a problem.

    @rhettinger
    Copy link
    Contributor Author

    Thanks for the review comments guys. An updated patch is attached.

    @birkenfeld
    Copy link
    Member

    I still see an "OrderDict" in the docs, and the TypeError in __init__
    still needs to use "%" instead of ",".

    @rhettinger
    Copy link
    Contributor Author

    Fixed.

    @rhettinger
    Copy link
    Contributor Author

    Added a few more tests.

    @jimjjewett
    Copy link
    Mannequin

    jimjjewett mannequin commented Mar 2, 2009

    I would try to make it more explicit that updates do not reset the
    order, but deleting a item and re-inserting it *does*. (So it isn't
    the first insertion of the key, it is the first that hasn't yet been
    followed by a deletion.)

    Maybe change:

    """
    When iterating over an ordered dictionary,
    the items are returned in the order their keys were first added.
    """

    to [removed "first", added longer explanation]

    """
    When iterating over an ordered dictionary,
    the items are returned in the order their keys were added. (This
    implies that updating an item to a new value does not reset the order,
    but deleting the item and re-inserting it moves the item to the end.)
    """

    @jimjjewett
    Copy link
    Mannequin

    jimjjewett mannequin commented Mar 2, 2009

    I would also recommend strengthening some of the tests with regard to
    ordering stability across update vs delete-and-reinsert.

    TestOrderedDict.test_update does verify that updated items are not
    moved to the end, but the comment suggests it is only checking that the
    previous contents had not been cleared entirely.

    test_iterators, test_popitem, and test_pop check only single
    insertions; I think the tests would be stronger if you updated the dict
    with reversed(pairs) before iterating, so that it would be obvious the
    ordering was based on the original insertion, rather than the latest
    access.

    test_setdefault should also verify that x is the final key, and that
    the a key hasn't changed position.

    I didn't see any tests for insert a, insert b, delete a, re-insert a,
    verify that a is now later than b.

    @rhettinger
    Copy link
    Contributor Author

    Jim, I updated the docs to talk cover delete-reinsert.

    Also, added a few more tests as suggested but am leaving test_iterators,
    test_popitem, and test_pop unchanged. It is enough to test
    delete-reinsertion once or twice and know that we've separately tested
    the component operations (setting new keys, setting existing keys, and
    deleting existing keys).

    @rhettinger
    Copy link
    Contributor Author

    At Antoine's request, strengthened the tests in test_copying.

    @rhettinger rhettinger self-assigned this Mar 2, 2009
    @rhettinger
    Copy link
    Contributor Author

    Attaching update reflecting Guido's change to __eq__().

    @rhettinger
    Copy link
    Contributor Author

    Checked-in r70101 and r70102

    @mitsuhiko
    Copy link
    Member

    Maybe premature optimization but maybe it would make sense to implement
    __eq__ like this:

    def __eq__(self, other):
        if isinstance(other, OrderedDict):
            if not dict.__eq__(self, other):
                return False
            return all(p == q for p, q in _zip_longest(self.items(),
                                                       other.items()))
        return dict.__eq__(self, other)

    For the most likely case (that dicts are different) this should give a
    speedup.

    @forest
    Copy link
    Mannequin

    forest mannequin commented Mar 3, 2009

    I was just reading the PEP, and caught this bit:

    "Does OrderedDict.popitem() return a particular key/value pair?
    Yes. It pops-off the most recently inserted new key and its
    corresponding value."

    Okay, but I'd also like a convenient and fast way to find the oldest
    key, which I think I'd need for an LRU cache. I didn't notice one in
    the patch. Perhaps I simply missed something. Shouldn't popitem()
    allow the caller to choose which end from which to pop?

    @birkenfeld
    Copy link
    Member

    I wonder if, instead of all kinds of new APIs, the _keys list could just
    be made public (under a different name of course).

    Of course, that makes further optimization or a rewrite in C harder.

    @rhettinger
    Copy link
    Contributor Author

    The internal data structure *must* remain private so that we can build a
    C replacement or switch to one of the other possible algorithms.

    @rhettinger
    Copy link
    Contributor Author

    Forest, for your use case I recommend copying the code to a new class
    and replacing the _keys list with a deque so that you can efficiently
    pop from the other end.

    @forest
    Copy link
    Mannequin

    forest mannequin commented Mar 3, 2009

    Shouldn't popitem() allow the caller to choose which end from
    which to pop?

    Thinking it through a bit more, and LRU cache would actually need to
    access the oldest item without necessarily removing it. Besides,
    popitem() should probably retain the signature of dict.popitem(). I
    think I'll take this matter to python-dev.

    @birkenfeld
    Copy link
    Member

    The internal data structure *must* remain private so that we can build a
    C replacement or switch to one of the other possible algorithms.

    Even then the keys list could be offered as a property.

    @rhettinger
    Copy link
    Contributor Author

    Forest, I've taken another look at what's involved and am inclined to
    accept the idea. It can be done without mucking-up the regular dict API
    and without precluding any of the other possible underlying algorithms.
    I like that it supports an entire new categoy of use cases with only a
    trivial, easily-understood API extension. Patch is attached.

    @mitsuhiko
    Copy link
    Member

    Please no. We just decided to *not* extend the API. The PEP originally
    had a well designed list of dict API extensions that already provided
    exactly that. If we really want to provide access to that, we can roll
    back to where we came from.

    I don't think implementing an LRUCache on an ordered dict is a good
    idea. Especially because LRU caches are a shared resource and should be
    thread safe because of that. Which the ordered dict is not.

    @birkenfeld
    Copy link
    Member

    OK, disregard my suggestions, it's better not to have a property that
    does almost exactly the same as keys().

    @forest
    Copy link
    Mannequin

    forest mannequin commented Mar 3, 2009

    Agreed here. Thanks, gents.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants