Author terry.reedy
Recipients mark.dickinson, serhiy.storchaka, tehybel, terry.reedy
Date 2016-08-27.21:51:03
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1472334664.78.0.0470495049059.issue27867@psf.upfronthosting.co.za>
In-reply-to
Content
There is really one immediate issue: PySlice_GetIndicesEx in Objects/sliceobject.c takes as inputs a slice object and a sequence length, but not the sequence itself.  It starts with "/* this is harder to get right than you might think */"  Yep.  It assumes both inputs are constants.

However, it thrice calls _PyEval_SliceIndex(intlike) (Python/ceval.c). This call in turn may call PyNunber_AsSsize_t(intlike), which calls intlike.__index__.  At least in toy examples, this last can change an underlying mutable sequence and hence its length.  Since the calculation of the returned start, stop, step, and slicelength depend on the length, the result is that PySlice_GetIndeces can indirectly invalidate its own results.
---

Side effects in __index__ also affect indexing

class X:
    def __index__(self):
        b[:] = []
        return 1

b = [1,2,3,4]
b[X()]

Traceback (most recent call last):
  File "F:\Python\mypy\tem.py", line 7, in <module>
    print(b[X()])
IndexError: list index out of range

Similarly, "b[X()] = 4" results in "list assignment index ...".

Crashes seem not possible as the conversion to a real int seems to always be done locally in the sequence_subscript method.  See for instance list_subscript and list_ass_subscript in Objects/listojbect.c.  If the subscript is an index, it is *first* converted to int with PyNumber_AsSsize_t *before* being possibly incremented by PySize(self) and compared to the same.

Side-effects also affect index methods.

class X:
    def __index__(self):
        b[:] = []
        return 1

b = [1]
b.index(1, X())

Traceback (most recent call last):
  File "F:\Python\mypy\tem.py", line 7, in <module>
    print(b.index(1, X()))
ValueError: 1 is not in list

For tuple/list/deque.index(val, start, stop), start and stop are converted to int with _PyEval_SliceIndex, as with slices.  However, they make this call as part of an initial PyArg_ParseTuple call, before adjusting them to the length of the sequence.  So again, no crash is possible.

I suppose there are other places where PyNumber_AsSsize_t is called, and where a crash *might* be possible, and should be checked,  But back to slicing.
---

Action on this issue requires a policy decision among three options.

0.  In general, special methods should be proper functions, without side effects, that return what the doc suggests.  Doing otherwise is 'at one own risk'.  Close that as "won't fix" because anyone writing an 'intlike' with such an evil __index__ method deserves the result, even it it is a crash.  Also, this seems like a toy problem since it require __index__ can know of and access the collection it is being called for.  It would otherwise only be written my a malicious attacker, who could do things much worse in the __index__ methods.

Counter arguments: A. Our usual policy is that pure Python code only using the stdlib (minus ctypes) should not crash.  B. The writer and user of 'intlike' might be different people.

Question: does 'no crash' apply to arbitrary side-effects in special methods?

1. Declare and enforce that a length-changing side-effect in __index__ used with slicing is a bug and raise ValueError.  A. Change every mutable_collection(_ass)_subscript method to record is length before calling PySlice_GetIndicesEx and check itself afterwards. B. Create PySlice_GetIndicesEx2 to receive *collection instead of length, so *it* can do the before and check in just one place, and change the calls.

2. As I suggested earlier: 'slicing should always work'.  Ensure that the 'length' used to adjust int slice components is the correct value, after all calls to __index__.  A. Change all calling sites, as above, to pass a slice object with int components and a length calculated after int conversions.  Bl. Create PySlice_GetIndicesEx2 to receive *collection, so *it* can retrieve the length after __index__ conversions.

I will ask on pydev which, if any, of the 4 possible patches would be best.
History
Date User Action Args
2016-08-27 21:51:04terry.reedysetrecipients: + terry.reedy, mark.dickinson, serhiy.storchaka, tehybel
2016-08-27 21:51:04terry.reedysetmessageid: <1472334664.78.0.0470495049059.issue27867@psf.upfronthosting.co.za>
2016-08-27 21:51:04terry.reedylinkissue27867 messages
2016-08-27 21:51:03terry.reedycreate