classification
Title: Linked list API for ordereddict
Type: enhancement Stage: resolved
Components: Versions: Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: ezio.melotti, loewis, pitrou, rhettinger, serhiy.storchaka, vstinner, yaubi
Priority: normal Keywords: patch

Created on 2014-07-28 21:30 by pitrou, last changed 2014-08-04 20:30 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
issue22097.patch yaubi, 2014-07-29 00:49 review
Messages (17)
msg224191 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-28 21:30
Right now Python doesn't have an ordered container which allows O(1) insertion at any place. It would be nice to have such an API on ordered dicts (e.g. insert_after() / insert_before()).
msg224207 - (view) Author: Yoann Aubineau (yaubi) * Date: 2014-07-29 00:49
Here is an attempt to implement such an API. The patch adds 4 methods to OrderedDict: move_before, move_after, insert_before and insert_after with corresponding tests.

The method signatures do not feel right though. I am not sure where the reference key should be placed in the argument list. It is especially ambiguous with move_* methods which simply take two keys with no hint on the one that will actually be moved.
msg224210 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-07-29 01:22
I would like any proposed API extensions to be use case driven.

The omission of these methods wasn't accidental.  I studied pre-existing OD implementations for Python and looked at how they were used in the real-world (back when we still had Google Code Search available as a research tool).

For now, I *really* like the learning curve for OD is near zero and would like to keep it that way unless there is a compelling advantage.
msg224211 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-29 01:45
My current use case is manipulation of Numba IR code. I'm working on a transformation pass that adds instructions at arbitrary places in existing IR. The primitive I need is "add an instruction after another one" (which is exactly the insert_after() primitive proposed here :-)). Since Python lacks a linked list-like structure, my current solution is to use a regular list, meaning O(n) inserts.

(a tedious solution with a regular list would be lazy inserts and batching)

The code has nothing special but you can see it here:
https://github.com/pitrou/numba/blob/7538d4c96b64a19691c2f9b6ec894f777db1a996/numba/ir.py#L538

It may be argued that the general problem more has to do with the lack of a linked-list structure rather than ordereddict itself. However, I can't think of a nice plain linked-list API in Python, while the ordereddict abstraction is adequate.
msg224217 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-07-29 09:40
ISTM that a dictionary is not the proper data structure for an instruction list. The support for the Mapping interface is not needed at all (AFAICT).

It's possible to make a list implementation with O(1) insert_after, using the same strategy that OrderedDict uses (i.e. maintain a mapping from value to link element).
msg224220 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-07-29 11:09
About your use case. Note that all indices after inserted element should be shifted. If you inserted more than one statement in the same ir_block, all but first insertions can be at wrong position.
msg224224 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-29 13:27
Le 29/07/2014 07:09, Serhiy Storchaka a écrit :
>
> About your use case. Note that all indices after inserted element
should be shifted. If you inserted more than one statement in the same
ir_block, all but first insertions can be at wrong position.

Thanks for noticing! This is precisely why I'm not using indices, and 
why a regular sequence isn't fit for the use case.
msg224226 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-29 13:34
Le 29/07/2014 05:40, Martin v. Löwis a écrit :
>
> ISTM that a dictionary is not the proper data structure for an
instruction list. The support for the Mapping interface is not needed at
all (AFAICT).

This is true.

> It's possible to make a list implementation with O(1) insert_after,
using the same strategy that OrderedDict uses (i.e. maintain a mapping
from value to link element).

Are you suggesting the collections module is ready for a linked list 
implementation to go into it?
msg224231 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-07-29 15:39
> Are you suggesting the collections module is ready for a linked list 
> implementation to go into it?

I don't know about the collections module. All I'm saying is that
a linked list with an efficient insert_after(x) could be implemented.
I'm not seeing one on PyPI, so if this was a desirable thing to have
in the standard library, it should probably have a life on PyPI first.
msg224232 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-29 15:41
> I don't know about the collections module. All I'm saying is that
> a linked list with an efficient insert_after(x) could be implemented.
> I'm not seeing one on PyPI

See https://pypi.python.org/pypi/llist/
(and probably others)
msg224233 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-07-29 16:09
Am 29.07.14 17:41, schrieb Antoine Pitrou:
> 
> Antoine Pitrou added the comment:
> 
>> I don't know about the collections module. All I'm saying is that
>> a linked list with an efficient insert_after(x) could be implemented.
>> I'm not seeing one on PyPI
> 
> See https://pypi.python.org/pypi/llist/
> (and probably others)

This does not meet your requirement of supporting insert_after(x),
its insert(x, before) requires to give a dllistnode. To support that,
it would need a mapping from list item value to list link element,
which it doesn't have.
msg224234 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-29 16:17
> This does not meet your requirement of supporting insert_after(x),
> its insert(x, before) requires to give a dllistnode.

Indeed it lacks that piece of prettiness in the API. And adding a mapping to allow lookups from item to node really means using a orderedset or ordereddict-like structure. Which is why I was proposing to reuse the existing infrastructure.

(note that I *cannot* reuse it from my own third-party class, since the necessary bits are all internal)

Still you may make the point that ordereddict is not appropriate, andorderedset would be better for this API. Raymond, would you support adding an orderedset to the collections module?
msg224282 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-07-30 01:15
Just for the record, when I originally looked at insert_before() and insert_after(), here's some of things that bugged me a little:

* insert_before(currkey, newkey, newvalue) ncan raise TypeErrors for hashability from either key.   It can raise a KeyError when currkey doesn't exist and a KeyError when newkey already exists.  It can raise a ValueError with currkey is equal to newkey.   That is a mess of exceptions that a user might need to untangle with exception chaining or custom exceptions.

* One use case for the insert methods is trying to maintain a sort order (such as an alphabetical order) but we don't have any efficient search methods such as a binary search to find an insertion point.  For example, if I read an alphabetically sorted config file of key / value pairs, how would I insert a new key/value pair in the proper position?

I also looked at adding an OrderedSet at one point and held-off because:

* there were no feature requests from users
* at the time, I wasn't finding third-party implementations on pypi,
  in the wild, or in the rich toolsets like Enthought's distro.
* it was so easy to build one by inheriting from a MutableSet
  and using an underlying OrderedDict.
* I also put a direct impplementation in a recipe on ASPN
  http://code.activestate.com/recipes/576694/
  to see if there was any uptake
* it was unclear what the ordering semantics should be on the
  various set-to-set operations (it would entail turning off
  some of the current optimizations such as intersection()
  looping over the smaller input set rather than the first
  listed set).
* In Python 3 and 2.7, fromkeys() readily constructs a set
  and the views on keys() and items() already support set
  operations.  AFAICT, nobody uses these even though all the
  desired set functionality is already there.

Since that time, there has been very little interest shown in an ordered set.  There have been a few upvotes on the ASPN recipe and a little activity on PyPi with a C version using Cython (see https://pypi.python.org/pypi?%3Aaction=search&term=ordered+set&submit=search ).

Another thing that caused me to hold-off was the possibility that regular sets could be made to be ordered with very little overhead (completely eliminating the need for a separate type).  I have two different approaches ready if we wanted to go down that path.

Now that PyPI has an ordered set offering, there is even less of a need for us to put one in the standard library unless it becomes more of an everyday need.
msg224285 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-30 01:42
Le 29/07/2014 21:15, Raymond Hettinger a écrit :
>
> * One use case for the insert methods is trying to maintain a sort
order (such as an alphabetical order) but we don't have any efficient
search methods such as a binary search to find an insertion point. For
example, if I read an alphabetically sorted config file of key / value
pairs, how would I insert a new key/value pair in the proper position?

You'd certainly prefer a tree for that use case (O(log n) search and 
insertion rather than O(n) search and O(1) insertion).

I hadn't thought about the set operations. The use case here is really 
linked-list-alike, not set-alike.

I'm mildly relieved that, even though O(n), middle-of-list insertions 
are still plenty fast for reasonable sizes, which means Numba shouldn't 
suffer here (even though we do seem to have users generating Python code 
and then JIT-compiling it...).
msg224682 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-08-04 05:20
Antoine, can I close this one?
msg224714 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-04 13:09
Yes, I think it's clear it's not going anywhere. In a way the standard list is good enough for those uses, except when your lists start growing really big (I don't know where the threshold is, but tens of thousands I'd say). Perhaps someone should make a blog post comparing theoretical and actual performance of various data types (hint hint ;-)).
msg224768 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-08-04 20:30
>  Perhaps someone should make a blog post comparing
> theoretical and actual performance of various data 
> types (hint hint ;-)).

Great idea.

Thanks Antoine.
History
Date User Action Args
2014-08-04 20:30:48rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg224768

stage: resolved
2014-08-04 13:09:17pitrousetmessages: + msg224714
2014-08-04 05:20:15rhettingersetmessages: + msg224682
2014-07-30 01:42:29pitrousetmessages: + msg224285
2014-07-30 01:15:20rhettingersetmessages: + msg224282
2014-07-29 16:17:42pitrousetmessages: + msg224234
2014-07-29 16:09:11loewissetmessages: + msg224233
2014-07-29 15:41:18pitrousetmessages: + msg224232
2014-07-29 15:39:37loewissetmessages: + msg224231
2014-07-29 13:34:53pitrousetmessages: + msg224226
2014-07-29 13:27:21pitrousetmessages: + msg224224
2014-07-29 11:09:22serhiy.storchakasetmessages: + msg224220
2014-07-29 09:40:04loewissetnosy: + loewis
messages: + msg224217
2014-07-29 01:45:43pitrousetmessages: + msg224211
2014-07-29 01:22:18rhettingersetassignee: rhettinger
messages: + msg224210
2014-07-29 00:49:53yaubisetfiles: + issue22097.patch
keywords: + patch
messages: + msg224207
2014-07-28 22:18:24yaubisetnosy: + yaubi
2014-07-28 21:39:00ezio.melottisetnosy: + ezio.melotti
2014-07-28 21:33:28vstinnersetnosy: + vstinner
2014-07-28 21:30:29pitroucreate