classification
Title: extended slice behavior inconsistent with docs
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: bessho, bethard, docs@python, georg.brandl, logistix, mark.dickinson, martin.panter, python-dev
Priority: normal Keywords: patch

Created on 2006-03-09 17:37 by bethard, last changed 2016-12-24 09:51 by martin.panter. This issue is now closed.

Files
File name Uploaded Description Edit
extended_slicing_docs.diff bethard, 2011-03-27 12:47 review
extended_slicing_docs.v2.diff martin.panter, 2016-12-10 04:45 review
Messages (11)
msg60879 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2006-03-09 17:37
I don't know whether the behavior or the docs is wrong,
but the extended slice behavior does not match the
documentation.  I found three places that talk about
extended slicing:

http://docs.python.org/ref/slicings.html
The semantics for a simple slicing are as follows... It
is not an error if i or j lie outside the range of
valid indexes (such items don't exist so they aren't
selected).
The semantics for an extended slicing are as follows...

http://docs.python.org/ref/types.html
Some sequences also support ``extended slicing'' with a
third ``step'' parameter: a[i:j:k] selects all items of
a with index x where x = i + n*k, n >= 0 and i <= x < j.

http://docs.python.org/lib/typesseq.html
s[i:j:k]   slice of s from i to j with step k   (3),(5)
...
(3) If i or j is negative, the index is relative to the
end of the string: len(s) + i or len(s) + j is
substituted. But note that -0 is still 0.
(5) The slice of s from i to j with step k is defined
as the sequence of items with index x = i + n*k such
that $0 \leq n < \frac{j-i}{k}$. In other words, the
indices are i, i+k, i+2*k, i+3*k and so on, stopping
when j is reached (but never including j). If i or j is
greater than len(s), use len(s). If i or j are omitted
then they become ``end'' values (which end depends on
the sign of k). Note, k cannot be zero.



Given those docs, consider this behavior:

>>> range(10)[10:0:-2]
[9, 7, 5, 3, 1]

By the Sequence Type Language Reference, [10:0:-2]
selects all items with index x where x = 10 + n*(-2), n
>= 0 and 10 <= x < 0.  Since no such values of x exist,
I conclude the result should have been an empty list.

By the Sequence Type Library Reference, [10:0:-2]
selects all items with index x = 10 + n*(-2) such that
0 <= n < (0 - 10)/(-2) = 5.  Thus I conclude that I
should get indices [10, 8, 6, 4, 2].  But this is also
wrong -- that would either have given me an IndexError,
if I stick to what's written for extended slicings, or
the list [8, 6, 4, 2] if I assume that the phrase "It
is not an error if i or j lie outside the range of
valid indexes" from the simple slicings section in the
Slicings Language Reference also applies to extended
slicings.


All the references I can find document this behavior
incorrectly.  I suggest rewording all the sections to
say something like:

"""
To get the slice of s from i to j with step k, first
determine the bounds.  If k is positive, and i or j is
greater than len(s), use len(s).  If k is negative, and
i or j is greater than len(s)-1, use len(s)-1. Note, k
cannot be zero.  If i or j are omitted then they become
``end'' values (which end depends on the sign of k). 

The slice of s from i to j with step k is then defined
as the sequence of items with index x = i + n*k such
that 0 <= n < (j - i)/k.  In other words, the indices
are i, i+k, i+2*k, i+3*k and so on, stopping when j is
reached (but never including j).
"""

The most important sentence above is "If k is negative,
and i or j is greater than len(s)-1, use len(s)-1." 
Without this sentence, the documentation doesn't match
the behavior.  The remainder of the rewording is to
make it clear that the bounds are adjusted *before* the
indices are calculated.
msg60880 - (view) Author: Grant Olson (logistix) Date: 2006-03-10 02:42
Logged In: YES 
user_id=699438

One problem in your analysis that range(10) returns the array

[0,1,2,3,4,5,6,7,8,9]

not:

[1,2,3,4,5,6,7,8,9,10]

The 10th element of the array holds the value 9.  so when x
= 10 + (0*-2), you get 10.  Then:

[0,1,2,3,4,5,6,7,8,9][10]

will return 9
msg60881 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2006-03-10 03:42
Logged In: YES 
user_id=945502

logistix wrote:
> [0,1,2,3,4,5,6,7,8,9][10]
> will return 9

Huh?  Python indexing starts from 0.  Hence:

>>> range(10)[10]
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
IndexError: list index out of range

>>> [0,1,2,3,4,5,6,7,8,9][10]
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
IndexError: list index out of range

If I didn't understand your complaint, please explain it again.
msg60882 - (view) Author: Grant Olson (logistix) Date: 2006-03-10 22:40
Logged In: YES 
user_id=699438

You're right.  I misunderstood what you were saying.  I was
trying to point out that you wouldn't have an element '10'
but I now see that you said "indicies[10...]", not
values[10...].

Sorry.
msg85534 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-04-05 17:32
See also #1265100.
msg132319 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2011-03-27 12:47
The problem still exists in current trunk:

The slicing semantics have been removed from the expressions reference:
http://docs.python.org/py3k/reference/expressions.html#slicings

The datamodel and types sections still have the same equations:
http://docs.python.org/py3k/reference/datamodel.html#types
http://docs.python.org/py3k/library/stdtypes.html#typesseq

Here's a synopsis of the problem, in code:

>>> # positive step, works as described
>>> i, j, k = 2, 8, 2
>>> range(10)[i:j:k]
[2, 4, 6]
>>> [i + n * k for n in range(0, (j - i) / k)]
[2, 4, 6]
>>> [range(10)[i] for i in _]
[2, 4, 6]

>>> # negative step, does not work as described
>>> i, j, k = 10, 0, -2
>>> range(10)[i:j:k]
[9, 7, 5, 3, 1]
>>> [i + n * k for n in range(0, (j - i) / k)]
[10, 8, 6, 4, 2]
>>> [range(10)[i] for i in _]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

>>> # actual behavior trims 10 to 9 (is the same as 9:0:-2)
>>> # that is, it trims to len(s) - 1, not len(s)
>>> range(10)[9:0:-2]
[9, 7, 5, 3, 1]

I propose to ignore the definition in the datamodel section, since it's talking about extended slicing in general, and doesn't mention anything about negative indices.

I propose the attached patch to fix the definition in the types section, which simply changes the statement:

  If i or j is greater than len(s), use len(s)

to say:

  If *i* or *j* is greater than len(s), use len(s) if *k* is positive, 
  or len(s) - 1 if *k* is negative.
msg220778 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-16 22:38
The attached patch is simple enough, surely it's a simple decision as to whether or not to commit it?
msg227360 - (view) Author: Fumihiro Bessho (bessho) Date: 2014-09-23 14:26
I also wondered the same thing today and found this issue.

I've added my comment on the patch to make the change more precise. Can anyone move it ahead and update the document?
msg282813 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-10 00:22
Fumihiro’s suggestion seems reasonable to me (assuming it still applies to the current text)
msg282826 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-10 04:45
Patch ruling out the len(s) corner case. I use a modified version of Fumihiro’s suggestion.
msg283927 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-24 08:49
New changeset b0b17b41edfc by Martin Panter in branch '3.5':
Issue #1446619: Account for negative slice direction in description
https://hg.python.org/cpython/rev/b0b17b41edfc

New changeset 031af2160094 by Martin Panter in branch '3.6':
Issue #1446619: Merge slicing description from 3.5
https://hg.python.org/cpython/rev/031af2160094

New changeset 4383d403aaef by Martin Panter in branch 'default':
Issue #1446619: Merge slicing description from 3.6
https://hg.python.org/cpython/rev/4383d403aaef

New changeset d63a11bd98ce by Martin Panter in branch '2.7':
Issue #1446619: Account for negative slice direction in description
https://hg.python.org/cpython/rev/d63a11bd98ce
History
Date User Action Args
2016-12-24 09:51:30martin.pantersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-12-24 08:49:31python-devsetnosy: + python-dev
messages: + msg283927
2016-12-11 00:18:58BreamoreBoysetnosy: - BreamoreBoy
2016-12-10 04:45:54martin.pantersetfiles: + extended_slicing_docs.v2.diff

messages: + msg282826
versions: + Python 3.6, Python 3.7, - Python 3.4
2016-12-10 00:22:00martin.pantersetnosy: + martin.panter
messages: + msg282813
2014-09-23 14:26:15besshosetnosy: + bessho
messages: + msg227360
2014-06-16 22:38:54BreamoreBoysetnosy: + BreamoreBoy

messages: + msg220778
versions: + Python 3.4, Python 3.5, - Python 3.2, Python 3.3
2011-03-27 12:56:30mark.dickinsonsetnosy: + mark.dickinson
2011-03-27 12:48:59bethardsetstage: test needed -> patch review
2011-03-27 12:47:52bethardsetfiles: + extended_slicing_docs.diff
keywords: + patch
messages: + msg132319

versions: + Python 2.7, Python 3.2, Python 3.3, - Python 2.6, Python 3.0
2010-08-22 02:01:51BreamoreBoysetassignee: georg.brandl -> docs@python

nosy: + docs@python
2009-04-05 18:06:57bethardsetnosy: + bethard, - bediviere.historic
2009-04-05 17:32:29georg.brandllinkissue1265100 superseder
2009-04-05 17:32:26georg.brandlsetmessages: + msg85534
2009-03-21 01:36:04ajaksu2setnosy: + georg.brandl
versions: + Python 2.6, Python 3.0, - Python 2.4
assignee: georg.brandl
components: + Documentation, - None
type: behavior
stage: test needed
2006-03-09 17:37:32bediviere.historiccreate