msg258893 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-24 17:47 |
The behavior of deque.insert() in a bounded deque is a bit odd:
>>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(3, 'New')
>>> d
deque([19, 10, 11, 'New', 13, 14, 15, 16, 17, 18], maxlen=10)
# ^ ^ ^--- The 12 got replaced
# | \--- The elements before the insertion point got rotated
# \--- The final element got rotated to first position
With append/appendleft/extend/extendleft, the intended and useful behavior for maxlen is to pop from the opposite end. But for deque.insert(), the best and most useful behavior invariants are less obvious.
Ideally, the effect of d.insert(0, item) would be the same as d.appendleft(item), and the effect of d.insert(len(d), item) would be the same as d.append(item).
Also, d.insert(i, newitem) should have the post-condition:
assert d[i] == newitem if i < len(d) else d[-1] == newitem
Having decided where to put the newitem, the question turns to deciding which element should be popped-off to maintain the size invariant.
I'm thinking that it should always be the rightmost element that gets popped, so that items before the inserted element never get moved. This matches what insert() does for unbounded deques, making it the least surprising choice.
|
msg258943 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-26 02:45 |
Mark, Serhiy and Timmy, do you have any thoughts on what is the right behavior?
One option is to always pop the rightmost element to make room, but that results in a weird asymmetry between d.insert(len(d), item) and what d.append(item) would do. I can't seem to come with a coherent set of invariants that doesn't have a surprising discontinuity.
Another option is to "refuse the temptation to guess" at what the user intends for the popped-off element to be and to raise an exception. But that isn't very user friendly either.
|
msg258947 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-26 06:38 |
Another take: Do a regular insertion with no special cases, followed by a pop from the right:
>>> for i in range(len(d) + 1):
s = list(d)
s.insert(i, None)
_ = s.pop()
print(i, s)
0 [None, 'a', 'b']
1 ['a', None, 'b']
2 ['a', 'b', None]
3 ['a', 'b', 'c']
Nice parts:
* Doesn't change pop direction depending on the inputs
* Guarantee that entries to the left of i will keep their position.
* Post condition of d[i]==newelem applies whenever i exists.
Not nice part:
* Initially surprising that d.insert(len(d), newelem) does not actually insert newelem.
* d.insert(len(d), newelem) not the same as d.append(newelem).
Another alternative is to raise an exception for the one case where index==maxlen, with the theory being that d[index] won't be a valid index so you can't insert anything there.
|
msg258948 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-26 08:41 |
The plain-insert-followed-by-a-pop-on-the-right is giving reasonable results with nice properties:
d.insert(0, i)
-----------------------------------
0 --> deque([0], maxlen=4)
1 --> deque([1, 0], maxlen=4)
2 --> deque([2, 1, 0], maxlen=4)
3 --> deque([3, 2, 1, 0], maxlen=4)
4 --> deque([4, 3, 2, 1], maxlen=4)
5 --> deque([5, 4, 3, 2], maxlen=4)
6 --> deque([6, 5, 4, 3], maxlen=4)
7 --> deque([7, 6, 5, 4], maxlen=4)
8 --> deque([8, 7, 6, 5], maxlen=4)
9 --> deque([9, 8, 7, 6], maxlen=4)
d.insert(len(d), i)
-----------------------------------
0 --> deque([0], maxlen=4)
1 --> deque([0, 1], maxlen=4)
2 --> deque([0, 1, 2], maxlen=4)
3 --> deque([0, 1, 2, 3], maxlen=4)
4 --> deque([0, 1, 2, 3], maxlen=4)
5 --> deque([0, 1, 2, 3], maxlen=4)
6 --> deque([0, 1, 2, 3], maxlen=4)
7 --> deque([0, 1, 2, 3], maxlen=4)
8 --> deque([0, 1, 2, 3], maxlen=4)
9 --> deque([0, 1, 2, 3], maxlen=4)
|
msg258995 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-01-27 05:46 |
New changeset ce3d47eaeb21 by Raymond Hettinger in branch '3.5':
Issue #26194: Fix undefined behavior for deque.insert() when len(d) == maxlen
https://hg.python.org/cpython/rev/ce3d47eaeb21
|
msg258996 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-27 06:59 |
New behavior looks less surprising to me.
Old behavior is even more weird for negative indices:
>>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(-3, 'New')
>>> d
deque([11, 12, 13, 14, 15, 16, 'New', 18, 19, 10], maxlen=10)
Note that new element not just replaced the old one in the middle of the deque, but all context was rotated one position left.
Patched code behave less surprising.
>>> d.insert(-3, 'New')
>>> d
deque([10, 11, 12, 13, 14, 15, 'New', 16, 17, 18], maxlen=10)
|
msg259004 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-27 09:11 |
But the postcondition d[i] == newitem is broken if i is negative.
>>> d = deque('ABC', maxlen=3)
>>> d.insert(-1, None)
>>> d
deque(['A', None, 'B'], maxlen=3)
I would expected ['A', 'B', None].
|
msg259005 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-01-27 09:14 |
> I would expected ['A', 'B', None].
Hum, it seems consistent with list behaviour, no?
>>> x=list("abc")
>>> x.insert(-1, "X")
>>> x
['a', 'b', 'X', 'c']
-1 is the same than len(x)-1 (2 in this example).
deque(['A', None, 'B'], maxlen=3) seems correct to me.
|
msg259007 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-27 09:24 |
-1 is the position before the last element ('C'). After popping-off extra element the result can be ['A', 'B', None] or ['B', None, 'C'].
|
msg259010 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-01-27 09:27 |
> -1 is the position before the last element ('C'). After popping-off extra element the result can be ['A', 'B', None] or ['B', None, 'C'].
Oh ok :-) Now I'm confused, I don't know what is the expected behaviour :-)
|
msg259043 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-27 18:31 |
The expected behavior should be "insert normally as if the deque were unbounded and then pop-off the rightmost element to restore the maxlen invariant".
The applied fix does that for non-negative indices but gives the wrong result for negative indicies:
>>> from collections import deque
>>> d = deque('abcde', maxlen=5)
>>> d.insert(-1, 'f')
>>> list(d)
['a', 'b', 'c', 'f', 'd']
>>> s = list('abcde')
>>> s.insert(-1, 'f'); del s[-1]
>>> s
['a', 'b', 'c', 'd', 'f']
I think the behavior can be made explainable and also be useful for common cases, but there is likely no getting around odd looking results with negative index insertions into bounded deques already at their maximum size.
|
msg259046 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2016-01-27 18:56 |
I'd raise an exception when trying to insert into a bounded deque that's already full. There's simply no way to guess what was _intended_; it's dead easy for the user to implement what they _do_ intend (first make room by deleting the specific item they no longer want); and I can't think of a use case compelling enough to justify whatever arbitrary non-exceptional behavior may be implemented instead.
WRT the behavior you settled on, sure, it's explainable. That doesn't imply it's useful, though ;-) I'd rather have an exception. It's plain bizarre that after
d.insert(i, x)
one can't even rely on
assert any(y is x for y in d)
succeeding. Implementing behavior that allows that invariant to fail is _really_ user-unfriendly ;-)
In contrast, what .append() and .appendleft() do for a full bounded deque are compelling (and don't violate the weak invariant above).
|
msg259050 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-27 19:53 |
What if implement the bahavior described by Raymond for index -len(d) <= i < len(d), but raise an exception if the index is out of this range?
The result of deque('abc', maxlen=3).insert(i, 'X') would be:
-4: error
-3: ['X', 'a', 'b']
-2: ['a', 'X', 'b']
-1: ['a', 'b', 'X']
0: ['X', 'a', 'b']
1: ['a', 'X', 'b']
2: ['a', 'b', 'X']
3: error
|
msg259055 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2016-01-27 20:18 |
My opinion doesn't change: I'd rather see an exception. I see no use case for inserting "into the middle" of a full bounded queue. If I had one, it would remain trivial to force the specific behavior I intended.
|
msg259063 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-27 21:24 |
My only aversion to raising an exception is that it goes against the original intent of maxlen as giving an automatic-pop-to-make-room behavior.
Rather that introduce a discontinuity, I like the "smoothness" of letting d.insert(len(d), obj) follow the insert-normally-then-pop-from-the-right rule as opposed to having a special case.
|
msg259072 - (view) |
Author: Josh Rosenberg (josh.r) *  |
Date: 2016-01-27 23:36 |
I agree with Tim that arbitrarily deciding that insert should behave more like appendleft is surprising. This really feels like guessing at what the user wants, silently doing something that really isn't predictable.
Any of the following seem nicer (not exactly arguing for any but #1):
1. Raising an exception for full deque
2. Making insert pop left when full (and possibly make insertleft that will pop right; the name isn't perfect, but it would make the symmetry between "no suffix pops left and 'left' suffix pops right" line up with other deque methods)
3. Adding an optional argument to say which end should be popped (defaulting to "raise an exception", probably no good though, since it would break Sequence rules.
|
msg259115 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-28 09:52 |
full_deque2.diff LGTM. But I have added minor suggestions on Rietveld.
|
msg259156 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-01-28 19:59 |
Since slicing is going to be added to deques, it is worth thinking about what the behavior should be there as well:
d = deque('abcde', maxlen=7)
d[3:4] = 'efghijklm'
Side note: repetition behaves like extend() using maxlen to decide how much to truncate from the left:
>>> from collections import deque
>>> d = deque('abcde', maxlen=8)
>>> d *= 2
>>> d
deque(['c', 'd', 'e', 'a', 'b', 'c', 'd', 'e'], maxlen=8)
Ideally, the deque API should be coherent and simple taken as a whole so that the behaviors are consistent and predictable as possible even if the general rules lead to some oddities in some uncommon cases.
One such general rule could be: "When maxlen is defined, operations proceed as if the deque were unbounded and then truncation is applied to the left" (this is like the rule in decimal where inputs to operations are treated as precise and context rounding is applied after the operation). The general rule would fit a mental models of "inserted new data is prioritized over old data", that "maxlen is all about automatically evicting data to make room for the newer data", and that "methods without 'left' in their name correspond to newest data on the right and next to be evicted on the left".
|
msg259216 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-01-29 17:21 |
*All* reasons look reasonable. May be discuss this on Python-Dev or ask BDFL?
From my point it looks that correct implementation of the insertion is too complex and this feature should go only in development release (if any).
|
msg259359 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-02-02 05:21 |
New changeset 7dabb94bdf63 by Raymond Hettinger in branch '3.5':
Issue #26194: Inserting into a full deque to raise an IndexError
https://hg.python.org/cpython/rev/7dabb94bdf63
|
msg259369 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2016-02-02 08:24 |
FYI Raymond, your revision 0731f097157b didn’t actually merge the 3.5 branch (It only had one parent). Also, the default branch never got any NEWS entry in 58266f5101cc, so maybe it needs to be added?
|
msg259372 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-02-02 09:42 |
> FYI Raymond, your revision 0731f097157b didn’t actually merge the 3.5 branch (It only had one parent). Also, the default branch never got any NEWS entry in 58266f5101cc, so maybe it needs to be added?
Maybe it's also worth to document the change in Python 3.5 and 3.6 documentation with a ".. versionchanged::"?
https://docs.python.org/dev/library/collections.html#deque-objects
It can be suprising to get a different behaviour between two bugfix versions (Python 3.5.1 and 3.5.2). I'm not saying that we must not fix bugs, just that it helps users to document them :-)
It will also avoid bug reports about this behaviour change. Example of user report: issue #26260, "it works on Python 2 but no more on Python 3" ;-)
|
msg259405 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-02-02 17:09 |
Reopened. Otherwise Martin's comments can be lost in spam.
> Maybe it's also worth to document the change in Python 3.5 and 3.6 documentation with a ".. versionchanged::"?
I don't think this is needed. This isn't new feature, this is just a bugfix, old behavior obviously was incorrect. We don't add the versionchanged directive for every bugfix.
|
msg259418 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-02-02 19:03 |
Ok no problem, it was just a suggestion.
In asyncio, we try to document behaviour change but asyncio is different.
Major functions are still modified, asyncio API is still provisional, but
people rely on tehe exact API.
|
msg259501 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-02-03 19:01 |
[Martin]
> the default branch never got any NEWS entry in
> 58266f5101cc, so maybe it needs to be added?
No need for a news entry in default. We haven't released 3.6 yet, so the bugfix from 3.5 is just a carry forward of the corrected behavior. That isn't news ;-)
[Martin]
> revision 0731f097157b didn’t actually merge the 3.5
> branch (It only had one parent)
I'm not sure I see what was missed here (I am not a mercurial jedi). My 3.5 checkout shows the change and so does the 3.6. What needs to be done at this point ?
[Serhiy]
> This isn't new feature, this is just a bugfix, old behavior
> obviously was incorrect. We don't add the versionchanged
> directive for every bugfix.
I concur with Serhiy.
|
msg259507 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-02-03 19:30 |
> No need for a news entry in default. We haven't released 3.6 yet, so the bugfix from 3.5 is just a carry forward of the corrected behavior. That isn't news ;-)
NEWS entries for 3.5.2 are not included in the NEWS file for 3.6. Usually we add bugfix NEWS entries in all branches that take the fix.
> What needs to be done at this point ?
It is too late to do anything except add a NEWS entry. Branches were merged by Martin when his applied next patch.
|
msg259519 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2016-02-03 21:20 |
I thought the 3.6.0 NEWS file generally listed bugs fixed since (at least) the 3.5.0 release. But the NEWS in default has no mention of Issue 26194.
Serhiy is right that there is nothing more to do for the merging problem. I was just pointing out that something in your “merge” commit must have not worked properly. Normally a merge looks like revision 58266f5101cc (there are two parents listed). But in this case I pulled from default expecting to get all the 3.5 changes as well, but never got your latest 3.5 change until I pulled it explicitly.
|
msg259527 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2016-02-04 03:26 |
I can't say that I agree with putting an additional entry the 3.6 NEWS. In the past, I've not made news entries in unreleased versions regarding bug fixes in prior versions. That seems pointless to me and it would tend to clutter a document whose size already gets in the way of its usability.
|