classification
Title: __getitem__ and __setitem__ try to be smart when invoked with negative slice indices
Type: behavior Stage:
Components: Interpreter Core Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, eryksun, kt, rhettinger
Priority: low Keywords:

Created on 2014-06-17 05:16 by kt, last changed 2014-06-17 08:05 by kt.

Messages (5)
msg220792 - Author: Konstantin Tretyakov (kt) Date: 2014-06-17 05:16
Consider the following example:

class A:
    def __getitem__(self, index):
        return True

If you invoke A()[-1], everything is fine. However, if you invoke A()[-1:2], you get an "AttributeError: A instance has no attribute '__len__'".

Moreover, if you define __len__ for your class, you will discover that __getitem__ acts "smart" and receives a modified version of the slice you passed in. Check this out:

class A:
    def __getitem__(self, index):
        return index.start
    def __len__(self):
        return 10

Now A()[-1:10] outputs "9". The same kind of argument-mangling happens within __setitem__.

This is completely unintuitive and contrary to what I read in the docs (https://docs.python.org/2/reference/datamodel.html#object.__getitem__):
"Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method."

Especially unintuitive is the fact that if you do A()[slice(-1,10)] or A().__getitem__(slice(-1,10)), no special treatment is applied to the -1: everything works fine and the __len__ method is not invoked.
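A minimal sketch of that comparison follows. The class name Probe is hypothetical, and the snippet is written so that it runs on both Python 2.7 and Python 3. The two explicit forms hand the slice object to __getitem__ untouched on either version; only the a:b syntax is affected by the behaviour reported here, and only for a classic class on Python 2.7.

```python
class Probe:
    """Hypothetical class that returns whatever index it receives."""

    def __getitem__(self, index):
        return index

    def __len__(self):
        return 10


p = Probe()

# Explicit slice objects reach __getitem__ unchanged.
explicit = p[slice(-1, 10)]
direct = p.__getitem__(slice(-1, 10))
print(explicit.start, direct.start)

# Slice syntax: per this report, a classic class on Python 2.7 sees
# start == 9 here (len() is added to -1); Python 3 passes -1 through.
sugared = p[-1:10]
print(sugared.start)
```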

As far as I understand, the root cause is the behaviour of the STORE_SLICE+3 opcode, which tries to be too smart.

I discovered this in code where slice indexing was used to insert arbitrary intervals into a data structure (hence using negative numbers would be totally fine). Obviously, this behaviour broke the whole idea, and it was nontrivial to debug and track down.

This does not seem to be a problem in Python 3.3; however, I believe fixing this in Python 2.7 is important as well.
msg220795 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-06-17 05:38
It's too late for a behavior change in 2.7. That risks breaking code that relies on the current behavior.

However, the docs could be amended to indicate that slices with negative indices are treated differently from scalar negative indices: the former get length-adjusted automatically and the latter don't.
msg220806 - Author: Konstantin Tretyakov (kt) Date: 2014-06-17 07:54
Do note that things are not as simple as "slices with negative indices are treated differently from scalar negative indices".

Namely, the behaviour differs depending on whether you use [] or .__getitem__, and on whether you use [a:b] or [slice(a,b)]. This does not make sense from a specification perspective, but if it stays, it has to be made clear in the docs.

Besides, Jython does not have this problem, and I presume other Python implementations might also be fine (e.g. PyPy or others; I couldn't test them now).

Hence, although fixing the docs does seem like a simple solution, if you want to regard the docs as a "specification of the Python language" rather than a list of particular CPython features, this won't be reasonable.
msg220808 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2014-06-17 07:59
Refer to the documentation for deprecated __getslice__ when slicing an instance of a classic class:

https://docs.python.org/2/reference/datamodel.html#object.__getslice__

The SLICE+3 implementation (apply_slice) calls PySequence_GetSlice if both index values can be converted to Py_ssize_t integers and if the type defines sq_slice (instance_slice for the "instance" type). The "instance" type is used for an instance of a classic class. This predates unification of Python classes and types.

apply_slice
http://hg.python.org/cpython/file/f89216059edf/Python/ceval.c#l4383

PySequence_GetSlice
http://hg.python.org/cpython/file/f89216059edf/Objects/abstract.c#l1995

instance_slice
http://hg.python.org/cpython/file/f89216059edf/Objects/classobject.c#l1177
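The index adjustment on this path can be modelled roughly in pure Python. This is only a sketch under assumptions: the function name adjust_like_sequence_getslice is hypothetical, and it mirrors the rule documented for __getslice__ (the length is added to each negative index, with no guarantee that the result is non-negative) rather than the C code line by line.

```python
def adjust_like_sequence_getslice(start, stop, length):
    """Hypothetical model: add the sequence length to each negative index.

    The result is not clamped, so an index can still be negative afterwards.
    """
    if start < 0:
        start += length
    if stop < 0:
        stop += length
    return slice(start, stop)


# With __len__ returning 10, the bounds (-1, 10) become (9, 10),
# which matches the "9" seen for A()[-1:10] in the original report.
adjusted = adjust_like_sequence_getslice(-1, 10, 10)
print(adjusted.start, adjusted.stop)

# An index that is "too negative" stays negative after the adjustment.
still_negative = adjust_like_sequence_getslice(-15, 10, 10)
print(still_negative.start)
```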

A new-style class, i.e. a class that subclasses object, would have to define or inherit __getslice__ in order for the C sq_slice slot to be defined. But __getslice__ is deprecated and shouldn't be implemented unless you have to override it in a subclass of a built-in type.

When sq_slice doesn't exist, apply_slice instead calls PyObject_GetItem with a slice object:

    class A(object):
        def __getitem__(self, index):
            return index.start
        def __len__(self):
            return 10

    >>> A()[-1:10]
    -1

By the way, you don't observe the behavior in Python 3 because it doesn't have classic classes, and the __getslice__, __setslice__, and __delslice__ methods are not in its data model.
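The same reasoning should carry over to slice assignment, which the original report says is affected as well. As a sketch, assuming a hypothetical IntervalStore container that defines neither __len__ nor __setslice__, a class that subclasses object receives the slice bounds unmodified in __setitem__:

```python
class IntervalStore(object):
    """Hypothetical container that records (start, stop, value) triples."""

    def __init__(self):
        self.intervals = []

    def __setitem__(self, index, value):
        if not isinstance(index, slice):
            raise TypeError("expected a slice such as store[a:b]")
        # The bounds are stored exactly as written, negative or not.
        self.intervals.append((index.start, index.stop, value))


store = IntervalStore()
store[-5:3] = "x"
print(store.intervals)  # [(-5, 3, 'x')]
```

Leaving out __len__ also makes an accidental length adjustment fail loudly instead of silently changing the bounds.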
msg220809 - Author: Konstantin Tretyakov (kt) Date: 2014-06-17 08:05
Aha, I see. I knew I'd get bitten by not explicitly subclassing (object) one day.

In any case, adding a reference to this issue into the docs of __getitem__ and __setitem__ would probably save someone some hours of utter confusion in the future.
History
Date                 User        Action  Args
2014-06-17 08:05:14  kt          set     messages: + msg220809
2014-06-17 07:59:30  eryksun     set     nosy: + eryksun; messages: + msg220808; components: + Interpreter Core, - Documentation
2014-06-17 07:54:58  kt          set     messages: + msg220806
2014-06-17 05:38:49  rhettinger  set     priority: normal -> low; nosy: + rhettinger, docs@python; messages: + msg220795; assignee: docs@python; components: + Documentation, - Interpreter Core
2014-06-17 05:16:03  kt          create