classification
Title: Remove doubly-linked list from C OrderedDict
Type: Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution: later
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: arigo, aronacher, eric.snow, inada.naoki, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2017-08-23 10:24 by inada.naoki, last changed 2017-11-09 12:38 by vstinner.

Pull Requests
URL Status Linked Edit
PR 3193 open inada.naoki, 2017-08-23 10:29
Messages (25)
msg300748 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 10:24
Since dict preserves insertion order, doubly linked list in OrderedDict
can be removed.

There is small performance improvement for odict creation:

$ curl https://api.github.com/orgs/python/repos > repos.json
$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as od; import json; data=open("repos.json").read()' -- 'json.loads(data, object_pairs_hook=od)'
py-default: ..................... 1.53 ms +- 0.01 ms
py-patched: ..................... 1.30 ms +- 0.01 ms

Mean +- std dev: [py-default] 1.53 ms +- 0.01 ms -> [py-patched] 1.30 ms +- 0.01 ms: 1.18x faster (-15%)

And more memory efficient:

$ ./py-default -c 'from collections import OrderedDict; import sys; print(sys.getsizeof(OrderedDict.fromkeys(range(1000))))'
85416

$ ./py-patched -c 'from collections import OrderedDict; import sys; print(sys.getsizeof(OrderedDict.fromkeys(range(1000))))'
36992

But most important benefit is smaller code.  It make easy to maintain.
msg300749 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 10:45
There are some failing tests remaining, and I want to discuss about
some of them here.

Traceback (most recent call last):
  File "/home/inada-n/work/python/nodebug/Lib/test/test_ordered_dict.py", line 261, in test_pop
    self.assertEqual(m.pop('a', default=6), 6)
TypeError: pop() takes no keyword arguments

dict.pop doesn't take keyword argument.
Since OrderedDict is pure Python at first, C implementation of
OrderedDict.pop() takes keyword too.
May I change `dict.pop()` to take keyword too. It reduce odict
specific method.

Some test expect KeyError in some edge cases.
But new implementation behaves differently.
For example,

    def test_dict_delitem(self):
        OrderedDict = self.OrderedDict
        od = OrderedDict()
        od['spam'] = 1
        od['ham'] = 2
        dict.__delitem__(od, 'spam')
        with self.assertRaises(KeyError):
            repr(od)

Since current implementation uses linked list, it raises KeyError.
But this is totally OK for new C implementation.

---

Personally speaking, I want to stop keeping compatibility with pure Python
implementation.
Is it possible to make "builtin _collections.OrderedDict" as requirement for
all Python 3.7 implementations and remove pure Python implementation stdlib?
msg300752 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-23 14:22
I would be very careful going down this path.  Regular dict ordering is not yet guaranteed and is subject to change.  Its design was primarily about compaction and the ordering was a side-effect.  In contrast, the data structure for collections.OrderedDict() was designed specifically for maintaining order.  It is likely that there are some workloads where the regular dict would degrade significantly compared to ordered dicts.
msg300753 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-23 14:37
I like the idea. Actually I wanted to write such patch myself, but this is very delicate thing and needs to be very careful. The largest benefit is not just memory saving and performance, but robustness. Currently it is easy to went OrderedDict in incorrect state by using pure dict API. This can cause crashes, hangs or invalid bahavior (see issue24726 and issue25410). The new implementation should pass all existing tests and also fix the above issues.

See also issue28239. lru_cache uses simplified version of ordered dict.
msg300756 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 15:11
I think typical usage is: get, set (incl, creating), and iterate.

* get: there is no difference
* set: inserting new key is little faster since no updating linked list
* iterate: in typical case, new odict is faster because current odict iterator do lookup for each key.

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default \
  -s 'from collections import OrderedDict as odict; od = odict.fromkeys(range(10000))' -- 'list(od)'

py-default: ..................... 223 us +- 10 us
py-patched: ..................... 93.7 us +- 3.3 us
Mean +- std dev: [py-default] 223 us +- 10 us -> [py-patched] 93.7 us +- 3.3 us: 2.38x faster (-58%)


On the other hand, there are some cases new odict is slower:

* iterating sparse dict. (but same speed as normal dict)
* comparing two odict, because new odict do `list(od1) == list(od2)` to compare keys.

For now, new odict uses dict's iterator (with adding `reversed` order support) and it's weak
against modify while iterating.  That's why I used temporal list while comparing.
And there is one failing test for modify-while-iterate case.

So I'm thinking about implementing robust odict iterator which detect modify for now.
msg300757 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 15:27
od.move_to_end() is slower too:

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as odict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("a"); od.move_to_end("b")'
py-default: ..................... 196 ns +- 4 ns
py-patched: ..................... 227 ns +- 3 ns

Mean +- std dev: [py-default] 196 ns +- 4 ns -> [py-patched] 227 ns +- 3 ns: 1.16x slower (+16%)

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as odict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("z")'
py-default: ..................... 74.7 ns +- 0.4 ns
py-patched: ..................... 91.1 ns +- 3.5 ns

Mean +- std dev: [py-default] 74.7 ns +- 0.4 ns -> [py-patched] 91.1 ns +- 3.5 ns: 1.22x slower (+22%)

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as odict; od1 = odict.fromkeys(range(10000)); od2=odict.fromkeys(range(10000))' -- 'od1==od2'
py-default: ..................... 451 us +- 3 us
py-patched: ..................... 632 us +- 6 us

Mean +- std dev: [py-default] 451 us +- 3 us -> [py-patched] 632 us +- 6 us: 1.40x slower (+40%)


# pros

* 1000 less lines of code
* 50% less memory usage
* 15% faster creation
* 100% (2x) faster iteration

# cons

* Some inconsistency against pure Python implementation
* 20% slower move_to_end
* 40% slower comparison
msg300758 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 15:34
Another idea is making "dict preserves insertion order" as language spec.
There are still some difference between dict and odict: .move_to_end() and __eq__.

But most use cases of odict is just keep insertion order.
For example, parsing config file, json, or csv.
In such cases, people can get all benefits (less memory and faster creation, iteration and deallocation) of dict if it's guaranteed.
msg300759 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-23 16:53
What about move_to_end(last=False)?
msg300760 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-08-23 17:30
When no change happens:

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import Ordedict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("a", 0)'                                                                          
py-default: ..................... 90.6 ns +- 1.0 ns
py-patched: ..................... 107 ns +- 1 ns

Mean +- std dev: [py-default] 90.6 ns +- 1.0 ns -> [py-patched] 107 ns +- 1 ns: 1.18x slower (+18%)

When change happens:

$ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as odict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("y", 0); od.move_to_end("z", 0)'
py-default: ..................... 233 ns +- 4 ns
py-patched: ..................... 304 ns +- 2 ns

Mean +- std dev: [py-default] 233 ns +- 4 ns -> [py-patched] 304 ns +- 2 ns: 1.30x slower (+30%)
msg300902 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-26 19:50
I'll bring this up at the language sprints.  Last year, there was an explicit decision to not go down this path.

Here are a few thoughts for now:  It is a big deal to change the API.  We try hard not to do that (breaking code for no reason and making it more challenging for people to upgrade their python -- the standard library is intended to be standard rather than highly mutable). It is a big deal to rip-out the pure python version and impose obligations on other implementations.  Over time, we've tried to move in the direction of more pure python code rather than less.  The pure python code is easier to understand, easier to maintain, generally less buggy, and is more portable.  People currently using OrderedDict are choosing it specifically because they want the ordering behavior.  That is currently implemented in an algorithmically correct way that preserves the big-oh behavior in the face of deletions and updates.  In contrast, the regular dict only happens to be ordered and internally is achieving order by maintaining a sequence of consecutive pointers.  This doesn't have the same performance characteristics and would be a step backwards for some use cases.  Lastly, if the ordering of regular dicts becomes guaranteed, then collections.OrderedDict() will mostly become irrelevant so there is no need to change it all.

For now, this proposal is on hold because 1) it isn't clear that it should be done, 2) it needs a lot of serious discussion before proceeding, 3) it may be premature while the status of the regular dict is still in flux.
msg301238 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2017-09-04 18:20
> For now, this proposal is on hold because
> 1) it isn't clear that it should be done,
> 2) it needs a lot of serious discussion before proceeding,
> 3) it may be premature while the status of the regular dict
> is still in flux.

+1

When writing the C implementation my biggest constraint was compatibility with the Python implementation, including algorithmic complexity.  I'm in favor of making use of dict's new ordering, but it's not urgent and Raymond's concerns should be addressed first (particularly since the collections module is his).
msg301291 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-05 01:40
FYI, my approach is not original, but copied from PyPy.

https://bitbucket.org/pypy/pypy/src/3e52029e9a5a677f7de62ef49f36090465cfbf4c/rpython/rtyper/lltypesystem/rordereddict.py?at=default&fileviewer=file-view-default#rordereddict.py-1437:1551
msg301836 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-10 22:28
Based on the python-dev discussion, can we close this now?
msg301837 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-10 22:40
Just for the record, here is the draft of the post I was going to make on python-dev but didn't prove to be necessary.

---------------------------------------------

I think would be a mistake to replace the implementation of collections.OrderedDict() with the builtin compact-and-ordered dict.  They serve different use cases and deserve different implementations that best serve those use cases.

When PEP 372 for the OrderedDict was accepted a decade ago, Armin and I looked at various implementation strategies.  It came down to a contest between keeping the order in an arraylike structure (like a Python list) or using a doubly-linked list.  The latter was chosen because it had superior algorithmic performance for workloads that involved a mix of insertions and deletions.  In particular, it offered a constant time guarantee for popping from any location or doing a move_to_front or move_to_last operation (which was helpful in implementing an LRU cache for example).  It supported arbitrary deletions and insertions without leaving holes that require periodic compaction.  When Eric Snow implemented OrderedDict in C for Python 3.5, he kept the doubly-linked list design to maintain those benefits.

In contrast, when I proposed the design for the current Python built-in dictionary, it had a different goal, namely compact storage.  It is very good at meeting that goal. However, the ordering of keys was a side-effect and somewhat of an afterthought rather than being something it was designed to handle well (which may be one of the reasons that Guido did not guarantee the ordering behavior).

At last year’s sprints, there was a deliberate decision to not replace the OrderedDict with the built-in dict.  I believe that decision was a good one and should be maintained.   It will allow us to have a separation of concerns and continue to improve the OrderedDict in a way that best supports applications that do interesting things with ordering (such as Antoine’s idea to use an OrderedDict as a circular queue).

I fully support efforts to improve the code for collections.OrderedDict() but think it is essential for it to evolve separately from the built-in dict and for its implementation strategy to be primarily focused efficiently maintaining order (its raison d'etre).
msg301839 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-10 23:12
The discussion on [1] was for removing pure Python implementation,
not about changing C implementation.

[1]: https://mail.python.org/pipermail/python-dev/2017-September/149147.html

While I withdrawed my suggestion about removing pure Python implementation,
I still think this new implementation is valuable.

Dict ordering is not language spec and many libraries including stdlib
uses OrderedDict to keep insertion order.
Faster creation, iteration and 1/2 memory usage is good enhancement
for most use cases.
msg301841 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-10 23:54
> Just for the record, here is the draft of the post I was going to make on python-dev but didn't prove to be necessary.

Thank you for write down your thought.

For move_to_end(), I admit new behavior is *amortized* O(1) and
current behavior is *worst-case* O(1).

When I implemented compact ordered dict in last year, my motivation
was porting PyPy's efficiency to CPython.
And this issue is based on same motivation.

So I want to hear Armin's opinion before closing this issue.
msg301843 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-11 00:52
ISTM that what is being proposed is an algorithmically flawed re-implementation of the ordered dictionary.  I'm unclear about whether you understand and acknowledge why the doubly-linked list was chosen and what kind of workloads it supports (we didn't choose it because it was either convenient or fun, we chose it because it was an algorithmically correct way of supporting arbitrary deletion, move-to-front, move-to-back, pop-first, and pop-last operations all of which have legitimate use cases).

Side note: part of the goal of the collections module is to provide builtin datatypes alternatives which different performance characteristics.  For example, deque() has a list-like API but is there to support efficient appends and pops from both ends while giving up efficient random access (another goal was obtaining more predictable performance by avoiding realloc()).

On a procedural note, it isn't a good practice to go "opinion shopping" as a way of trying to override the recommendations of the current maintainers of the code.  That seems like uncomfortable political gamesmanship.  Instead of throwing-out all the code, it would be a better to submit implementation improvements that preserve the core design (for example there are faster and more compact ways to store and update the links -- I can help get you started with this if you're interested).
msg301844 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-11 01:05
>  I'm unclear about whether you understand and acknowledge why the doubly-linked list was chosen and what kind of workloads it supports (we didn't choose it because it was either convenient or fun, we chose it because it was an algorithmically correct way of supporting arbitrary deletion, move-to-front, move-to-back, pop-first, and pop-last operations all of which have legitimate use cases).

Arbitrary deletion: New and current implementation has same complexity, because current implementation relying on dict deletion.  Only difference is current implementation need to remove item from link list, not only dict.

move-to-front, move-to-back: Current implementation is worst case O(1) and new implementation is amortized O(1), like insertion.

pop-first, pop-last: Current implementation is worst case O(1).  New implementation is typical case (when dict is dense) O(1).  When mixed with arbitrary deletion operation (dict is sparse), it's become amortized O(1).
msg301857 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-11 05:27
Note that mixed insertion and deletion is worst-case O(n) in current implementation.

Original Raymond's design didn't preserve ordering during deletion. It had worst-case O(1) for mixed insertion and deletion. I didn't follow the numerous discussions about compact dict implementation, but at some point it became keeping holes and preserving ordering during deletion. This leads to the need of periodical O(n) reallocation for mixed insertion and deletion. Perhaps the technical reason of this was the couple relation between C implementation of OrderedDict and dict.
msg301859 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-11 05:59
> Original Raymond's design didn't preserve ordering during deletion.

Original Raymond's pure Python implementation rebuilds index table.
Without it, proving can be very long or infinite loop.
See L89-92 in http://code.activestate.com/recipes/578375/
Strictly speaking, it's worst case is O(n) too.

Raymond's compact dict allocates entry table and index table separately.
I want to allocate them at once, as one memory block.
It makes better cache hit ratio on small dict.

That's why I choose "preserve insertion order" over "compaction".
Since index table and entry table grow at same time, rebuilding
index table and entry table at once makes sense to me.

Anyway, "namespace is ordered" is Python's language spec now.

class C:
    a = 1
    b = 2
    c = 3
    del a

In this case, C's namespace should {"b": 2, "c": 3}, not {"c": 3, "b": 2}.
msg301863 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2017-09-11 06:39
I would side with Inada in thinking they both give the same amortized complexity, but beyond that, benchmarks are the real answer.  There is little value in keeping the current implementation of OrderedDict *if* benchmarks show that it is rarely faster.
msg301933 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2017-09-12 01:23
On Sun, Sep 10, 2017 at 10:27 PM, Serhiy Storchaka
<report@bugs.python.org> wrote:
> Note that mixed insertion and deletion is worst-case O(n) in current implementation.

Could you elaborate?  Note that every operation of the current
implementation matches the complexity of the Python implementation.
msg301941 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-12 07:27
> Eric Snow added the comment:
>
> On Sun, Sep 10, 2017 at 10:27 PM, Serhiy Storchaka
> <report@bugs.python.org> wrote:
>> Note that mixed insertion and deletion is worst-case O(n) in current implementation.
>
> Could you elaborate?  Note that every operation of the current
> implementation matches the complexity of the Python implementation.
>

It means rebuilding hash table to clean up dummy entries.
So, even when dict size is not increasing, remove + insert loop has
worst case O(n), amortized O(1) complexity.
msg301942 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-09-12 07:40
> I would side with Inada in thinking they both give the same amortized complexity, but beyond that, benchmarks are the real answer.  There is little value in keeping the current implementation of OrderedDict *if* benchmarks show that it is rarely faster.

As I wrote in here:
https://bugs.python.org/issue31265#msg300757

PyPy-like ODict implementation is:

* 1000 less lines of code
* 50% less memory usage
* 15% faster creation
* 100% (2x) faster iteration
* 20% slower move_to_end
* 40% slower comparison

Since current implementation is still draft, there are some
potential optimization.

* Optimize over allocation ratio for move_to_end.  I have not tried to adjust it for nice balance between speed and space.
* I used temporary list to comparing keys safely.  But it can be avoided like normal iteration.

I'll try them in this or next week.
msg301958 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2017-09-12 13:33
> It means rebuilding hash table to clean up dummy entries.
> So, even when dict size is not increasing, remove + insert loop has
> worst case O(n), amortized O(1) complexity.

Ah, so it matches the pure Python implementation then.
History
Date User Action Args
2017-11-09 12:38:25vstinnersetnosy: + vstinner
2017-09-12 13:33:14eric.snowsetmessages: + msg301958
2017-09-12 07:40:07inada.naokisetmessages: + msg301942
2017-09-12 07:27:13inada.naokisetmessages: + msg301941
2017-09-12 01:23:47eric.snowsetmessages: + msg301933
2017-09-11 06:39:22arigosetmessages: + msg301863
2017-09-11 05:59:27inada.naokisetmessages: + msg301859
2017-09-11 05:27:11serhiy.storchakasetmessages: + msg301857
2017-09-11 01:05:03inada.naokisetmessages: + msg301844
2017-09-11 00:52:21rhettingersetnosy: + aronacher
messages: + msg301843
2017-09-10 23:54:22inada.naokisetnosy: + arigo
messages: + msg301841
2017-09-10 23:12:18inada.naokisetmessages: + msg301839
2017-09-10 22:40:38rhettingersetmessages: + msg301837
2017-09-10 22:28:34rhettingersetmessages: + msg301836
2017-09-05 01:40:11inada.naokisetmessages: + msg301291
2017-09-04 18:20:51eric.snowsetmessages: + msg301238
2017-08-26 19:50:32rhettingersetassignee: rhettinger
resolution: later
messages: + msg300902
2017-08-24 12:51:11inada.naokisetnosy: + eric.snow
2017-08-23 17:30:30inada.naokisetmessages: + msg300760
2017-08-23 16:53:57serhiy.storchakasetmessages: + msg300759
2017-08-23 15:34:52inada.naokisetmessages: + msg300758
2017-08-23 15:27:01inada.naokisetmessages: + msg300757
2017-08-23 15:11:54inada.naokisetmessages: + msg300756
2017-08-23 14:37:46serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg300753
2017-08-23 14:22:13rhettingersetmessages: + msg300752
2017-08-23 13:42:05rhettingersetnosy: + rhettinger
2017-08-23 10:45:31inada.naokisetmessages: + msg300749
2017-08-23 10:29:32inada.naokisetpull_requests: + pull_request3232
2017-08-23 10:24:19inada.naokicreate