classification
Title: Undefined behavior for deque.insert() when len(d) == maxlen
Type: behavior Stage:
Components: Extension Modules Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: josh.r, mark.dickinson, martin.panter, python-dev, rhettinger, serhiy.storchaka, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2016-01-24 17:47 by rhettinger, last changed 2016-02-07 09:06 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
full_deque.diff rhettinger, 2016-01-26 03:59 Option to raise an IndexError review
full_deque_alt.diff rhettinger, 2016-01-26 07:24 Plain insert followed by pop from right review
full_deque2.diff rhettinger, 2016-01-27 23:48 Index error for insert into a full deque review
Messages (28)
msg258893 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-24 17:47
The behavior of deque.insert() in a bounded deque is a bit odd:

>>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(3, 'New')
>>> d
deque([19, 10, 11, 'New', 13, 14, 15, 16, 17, 18], maxlen=10)
#       ^   ^        ^--- The 12 got replaced
#       |   \--- The elements before the insertion point got rotated
#       \--- The final element got rotated to first position

With append/appendleft/extend/extendleft, the intended and useful behavior for maxlen is to pop from the opposite end.  But for deque.insert(), the best and most useful behavior invariants are less obvious.

Ideally, the effect of d.insert(0, item) would be the same as d.appendleft(item), and the effect of d.insert(len(d), item) would be the same as d.append(item).  

Also, d.insert(i, newitem) should have the post-condition:

  assert d[i] == newitem if i < len(d) else d[-1] == newitem

Having decided where to put the newitem, the question turns to deciding which element should be popped-off to maintain the size invariant.

I'm thinking that it should always be the rightmost element that gets popped, so that items before the inserted element never get moved.  This matches what insert() does for unbounded deques, making it the least surprising choice.
msg258943 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-26 02:45
Mark, Serhiy and Timmy, do you have any thoughts on what is the right behavior?

One option is to always pop the rightmost element to make room, but that results in a weird asymmetry between d.insert(len(d), item) and what d.append(item) would do.  I can't seem to come with a coherent set of invariants that doesn't have a surprising discontinuity.

Another option is to "refuse the temptation to guess" at what the user intends for the popped-off element to be and to raise an exception.  But that isn't very user friendly either.
msg258947 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-26 06:38
Another take: Do a regular insertion with no special cases, followed by a pop from the right:

>>> for i in range(len(d) + 1):
	s = list(d)
	s.insert(i, None)      
	_ = s.pop()
	print(i, s)
	
0 [None, 'a', 'b']
1 ['a', None, 'b']
2 ['a', 'b', None]
3 ['a', 'b', 'c']

Nice parts:
* Doesn't change pop direction depending on the inputs
* Guarantee that entries to the left of i will keep their position.
* Post condition of d[i]==newelem applies whenever i exists.

Not nice part:
* Initially surprising that d.insert(len(d), newelem) does not actually insert newelem.
* d.insert(len(d), newelem) not the same as d.append(newelem).

Another alternative is to raise an exception for the one case where index==maxlen, with the theory being that d[index] won't be a valid index so you can't insert anything there.
msg258948 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-26 08:41
The plain-insert-followed-by-a-pop-on-the-right is giving reasonable results with nice properties:

d.insert(0, i)
-----------------------------------
0 --> deque([0], maxlen=4)
1 --> deque([1, 0], maxlen=4)
2 --> deque([2, 1, 0], maxlen=4)
3 --> deque([3, 2, 1, 0], maxlen=4)
4 --> deque([4, 3, 2, 1], maxlen=4)
5 --> deque([5, 4, 3, 2], maxlen=4)
6 --> deque([6, 5, 4, 3], maxlen=4)
7 --> deque([7, 6, 5, 4], maxlen=4)
8 --> deque([8, 7, 6, 5], maxlen=4)
9 --> deque([9, 8, 7, 6], maxlen=4)

d.insert(len(d), i)
-----------------------------------
0 --> deque([0], maxlen=4)
1 --> deque([0, 1], maxlen=4)
2 --> deque([0, 1, 2], maxlen=4)
3 --> deque([0, 1, 2, 3], maxlen=4)
4 --> deque([0, 1, 2, 3], maxlen=4)
5 --> deque([0, 1, 2, 3], maxlen=4)
6 --> deque([0, 1, 2, 3], maxlen=4)
7 --> deque([0, 1, 2, 3], maxlen=4)
8 --> deque([0, 1, 2, 3], maxlen=4)
9 --> deque([0, 1, 2, 3], maxlen=4)
msg258995 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-01-27 05:46
New changeset ce3d47eaeb21 by Raymond Hettinger in branch '3.5':
Issue #26194: Fix undefined behavior for deque.insert() when len(d) == maxlen
https://hg.python.org/cpython/rev/ce3d47eaeb21
msg258996 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-27 06:59
New behavior looks less surprising to me.

Old behavior is even more weird for negative indices:

>>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(-3, 'New')
>>> d
deque([11, 12, 13, 14, 15, 16, 'New', 18, 19, 10], maxlen=10)

Note that new element not just replaced the old one in the middle of the deque, but all context was rotated one position left.

Patched code behave less surprising.

>>> d.insert(-3, 'New')
>>> d                                                                   
deque([10, 11, 12, 13, 14, 15, 'New', 16, 17, 18], maxlen=10)
msg259004 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-27 09:11
But the postcondition d[i] == newitem is broken if i is negative.

>>> d = deque('ABC', maxlen=3)
>>> d.insert(-1, None)
>>> d
deque(['A', None, 'B'], maxlen=3)

I would expected ['A', 'B', None].
msg259005 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-27 09:14
> I would expected ['A', 'B', None].

Hum, it seems consistent with list behaviour, no?

>>> x=list("abc")
>>> x.insert(-1, "X")
>>> x
['a', 'b', 'X', 'c']

-1 is the same than len(x)-1 (2 in this example).

deque(['A', None, 'B'], maxlen=3) seems correct to me.
msg259007 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-27 09:24
-1 is the position before the last element ('C'). After popping-off extra element the result can be ['A', 'B', None] or ['B', None, 'C'].
msg259010 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-27 09:27
> -1 is the position before the last element ('C'). After popping-off extra element the result can be ['A', 'B', None] or ['B', None, 'C'].

Oh ok :-) Now I'm confused, I don't know what is the expected behaviour :-)
msg259043 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-27 18:31
The expected behavior should be "insert normally as if the deque were unbounded and then pop-off the rightmost element to restore the maxlen invariant".

The applied fix does that for non-negative indices but gives the wrong result for negative indicies:

  >>> from collections import deque
  >>> d = deque('abcde', maxlen=5)
  >>> d.insert(-1, 'f')
  >>> list(d)
  ['a', 'b', 'c', 'f', 'd']
  >>> s = list('abcde')
  >>> s.insert(-1, 'f'); del s[-1]
  >>> s
  ['a', 'b', 'c', 'd', 'f']

I think the behavior can be made explainable and also be useful for common cases, but there is likely no getting around odd looking results with negative index insertions into bounded deques already at their maximum size.
msg259046 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2016-01-27 18:56
I'd raise an exception when trying to insert into a bounded deque that's already full.  There's simply no way to guess what was _intended_; it's dead easy for the user to implement what they _do_ intend (first make room by deleting the specific item they no longer want); and I can't think of a use case compelling enough to justify whatever arbitrary non-exceptional behavior may be implemented instead.

WRT the behavior you settled on, sure, it's explainable.  That doesn't imply it's useful, though ;-)  I'd rather have an exception.  It's plain bizarre that after

    d.insert(i, x)

one can't even rely on

    assert any(y is x for y in d)

succeeding.  Implementing behavior that allows that invariant to fail is _really_ user-unfriendly ;-)

In contrast, what .append() and .appendleft() do for a full bounded deque are compelling (and don't violate the weak invariant above).
msg259050 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-27 19:53
What if implement the bahavior described by Raymond for index -len(d) <= i < len(d), but raise an exception if the index is out of this range?

The result of deque('abc', maxlen=3).insert(i, 'X') would be:

-4: error
-3: ['X', 'a', 'b']
-2: ['a', 'X', 'b']
-1: ['a', 'b', 'X']
 0: ['X', 'a', 'b']
 1: ['a', 'X', 'b']
 2: ['a', 'b', 'X']
 3: error
msg259055 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2016-01-27 20:18
My opinion doesn't change:  I'd rather see an exception.  I see no use case for inserting "into the middle" of a full bounded queue.  If I had one, it would remain trivial to force the specific behavior I intended.
msg259063 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-27 21:24
My only aversion to raising an exception is that it goes against the original intent of maxlen as giving an automatic-pop-to-make-room behavior.

Rather that introduce a discontinuity, I like the "smoothness" of letting d.insert(len(d), obj) follow the insert-normally-then-pop-from-the-right rule as opposed to having a special case.
msg259072 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-01-27 23:36
I agree with Tim that arbitrarily deciding that insert should behave more like appendleft is surprising. This really feels like guessing at what the user wants, silently doing something that really isn't predictable.

Any of the following seem nicer (not exactly arguing for any but #1):

1. Raising an exception for full deque
2. Making insert pop left when full (and possibly make insertleft that will pop right; the name isn't perfect, but it would make the symmetry between "no suffix pops left and 'left' suffix pops right" line up with other deque methods)
3. Adding an optional argument to say which end should be popped (defaulting to "raise an exception", probably no good though, since it would break Sequence rules.
msg259115 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-28 09:52
full_deque2.diff LGTM. But I have added minor suggestions on Rietveld.
msg259156 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-01-28 19:59
Since slicing is going to be added to deques, it is worth thinking about what the behavior should be there as well:

   d = deque('abcde', maxlen=7)
   d[3:4] = 'efghijklm'

Side note: repetition behaves like extend() using maxlen to decide how much to truncate from the left:

    >>> from collections import deque
    >>> d = deque('abcde', maxlen=8)
    >>> d *= 2
    >>> d
    deque(['c', 'd', 'e', 'a', 'b', 'c', 'd', 'e'], maxlen=8)

Ideally, the deque API should be coherent and simple taken as a whole so that the behaviors are consistent and predictable as possible even if the general rules lead to some oddities in some uncommon cases.  

One such general rule could be:  "When maxlen is defined, operations proceed as if the deque were unbounded and then truncation is applied to the left" (this is like the rule in decimal where inputs to operations are treated as precise and context rounding is applied after the operation).  The general rule would fit a mental models of "inserted new data is prioritized over old data", that "maxlen is all about automatically evicting data to make room for the newer data", and that "methods without 'left' in their name correspond to newest data on the right and next to be evicted on the left".
msg259216 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-29 17:21
*All* reasons look reasonable.  May be discuss this on Python-Dev or ask BDFL?

From my point it looks that correct implementation of the insertion is too complex and this feature should go only in development release (if any).
msg259359 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-02-02 05:21
New changeset 7dabb94bdf63 by Raymond Hettinger in branch '3.5':
Issue #26194:  Inserting into a full deque to raise an IndexError
https://hg.python.org/cpython/rev/7dabb94bdf63
msg259369 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-02 08:24
FYI Raymond, your revision 0731f097157b didn’t actually merge the 3.5 branch (It only had one parent). Also, the default branch never got any NEWS entry in 58266f5101cc, so maybe it needs to be added?
msg259372 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-02 09:42
> FYI Raymond, your revision 0731f097157b didn’t actually merge the 3.5 branch (It only had one parent). Also, the default branch never got any NEWS entry in 58266f5101cc, so maybe it needs to be added?

Maybe it's also worth to document the change in Python 3.5 and 3.6 documentation with a ".. versionchanged::"?
https://docs.python.org/dev/library/collections.html#deque-objects

It can be suprising to get a different behaviour between two bugfix versions (Python 3.5.1 and 3.5.2). I'm not saying that we must not fix bugs, just that it helps users to document them :-)

It will also avoid bug reports about this behaviour change. Example of user report: issue #26260, "it works on Python 2 but no more on Python 3" ;-)
msg259405 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-02 17:09
Reopened. Otherwise Martin's comments can be lost in spam.

> Maybe it's also worth to document the change in Python 3.5 and 3.6 documentation with a ".. versionchanged::"?

I don't think this is needed. This isn't new feature, this is just a bugfix, old behavior obviously was incorrect. We don't add the versionchanged directive for every bugfix.
msg259418 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-02 19:03
Ok no problem, it was just a suggestion.

In asyncio, we try to document behaviour change but asyncio is different.
Major functions are still modified, asyncio API is still provisional, but
people rely on tehe exact API.
msg259501 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-02-03 19:01
[Martin]
>  the default branch never got any NEWS entry in 
> 58266f5101cc, so maybe it needs to be added?

No need for a news entry in default.  We haven't released 3.6 yet, so the bugfix from 3.5 is just a carry forward of the corrected behavior.  That isn't news ;-)

[Martin]
> revision 0731f097157b didn’t actually merge the 3.5 
> branch (It only had one parent)

I'm not sure I see what was missed here (I am not a mercurial jedi).  My 3.5 checkout shows the change and so does the 3.6.   What needs to be done at this point ?

[Serhiy]
> This isn't new feature, this is just a bugfix, old behavior
> obviously was incorrect. We don't add the versionchanged 
> directive for every bugfix.

I concur with Serhiy.
msg259507 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-03 19:30
> No need for a news entry in default.  We haven't released 3.6 yet, so the bugfix from 3.5 is just a carry forward of the corrected behavior.  That isn't news ;-)

NEWS entries for 3.5.2 are not included in the NEWS file for 3.6. Usually we add bugfix NEWS entries in all branches that take the fix.

> What needs to be done at this point ?

It is too late to do anything except add a NEWS entry. Branches were merged by Martin when his applied next patch.
msg259519 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-03 21:20
I thought the 3.6.0 NEWS file generally listed bugs fixed since (at least) the 3.5.0 release. But the NEWS in default has no mention of Issue 26194.

Serhiy is right that there is nothing more to do for the merging problem. I was just pointing out that something in your “merge” commit must have not worked properly. Normally a merge looks like revision 58266f5101cc (there are two parents listed). But in this case I pulled from default expecting to get all the 3.5 changes as well, but never got your latest 3.5 change until I pulled it explicitly.
msg259527 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-02-04 03:26
I can't say that I agree with putting an additional entry the 3.6 NEWS.   In the past, I've not made news entries in unreleased versions regarding bug fixes in prior versions.  That seems pointless to me and it would tend to clutter a document whose size already gets in the way of its usability.
History
Date User Action Args
2016-02-07 09:06:11rhettingersetstatus: open -> closed
2016-02-04 03:26:13rhettingersetmessages: + msg259527
2016-02-03 21:20:11martin.pantersetmessages: + msg259519
2016-02-03 19:30:00serhiy.storchakasetmessages: + msg259507
2016-02-03 19:01:50rhettingersetmessages: + msg259501
2016-02-02 19:03:51vstinnersetmessages: + msg259418
2016-02-02 17:09:58serhiy.storchakasetstatus: closed -> open

messages: + msg259405
2016-02-02 09:42:03vstinnersetmessages: + msg259372
2016-02-02 08:24:48martin.pantersetnosy: + martin.panter
messages: + msg259369
2016-02-02 05:22:14rhettingersetstatus: open -> closed
resolution: fixed
2016-02-02 05:21:53python-devsetmessages: + msg259359
2016-01-29 17:21:20serhiy.storchakasetmessages: + msg259216
2016-01-28 19:59:29rhettingersetmessages: + msg259156
2016-01-28 09:52:35serhiy.storchakasetmessages: + msg259115
2016-01-27 23:48:21rhettingersetfiles: + full_deque2.diff
2016-01-27 23:36:37josh.rsetnosy: + josh.r
messages: + msg259072
2016-01-27 21:24:58rhettingersetmessages: + msg259063
2016-01-27 20:18:53tim.peterssetmessages: + msg259055
2016-01-27 19:53:22serhiy.storchakasetmessages: + msg259050
2016-01-27 18:56:05tim.peterssetmessages: + msg259046
2016-01-27 18:31:20rhettingersetmessages: + msg259043
2016-01-27 09:27:16vstinnersetmessages: + msg259010
2016-01-27 09:24:01serhiy.storchakasetmessages: + msg259007
2016-01-27 09:14:12vstinnersetnosy: + vstinner
messages: + msg259005
2016-01-27 09:11:16serhiy.storchakasetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg259004
2016-01-27 06:59:14serhiy.storchakasetmessages: + msg258996
2016-01-27 05:46:42rhettingersetstatus: open -> closed
resolution: fixed
2016-01-27 05:46:13python-devsetnosy: + python-dev
messages: + msg258995
2016-01-26 08:41:34rhettingersetmessages: + msg258948
2016-01-26 07:24:36rhettingersetfiles: + full_deque_alt.diff
2016-01-26 06:38:23rhettingersetmessages: + msg258947
2016-01-26 03:59:11rhettingersetfiles: + full_deque.diff
keywords: + patch
2016-01-26 02:45:10rhettingersetnosy: + tim.peters, mark.dickinson, serhiy.storchaka
messages: + msg258943
2016-01-24 17:47:04rhettingercreate