Add operator.subscript as a convenience for creating slices #68567

llllllllll · 2015-06-04T06:07:04Z

BPO	24379
Nosy	@gvanrossum, @rhettinger, @mdickinson, @vstinner, @taleinat, @ned-deily, @stevendaprano, @bitdancer, @vadmium, @serhiy-storchaka, @1st1, @MojoVampire, @llllllllll, @ilevkivskyi, @Vgr255, @shoyer
Files	slice.patch slicerepr.patch operator_subscript.patch operator_subscript_pyonly.patch operator_subscript_pyonly.patch operator_subscript_pyonly.patch operator_subscript_pyonly.patch operator_subscript_pyonly.patch operator_subscript_pyonly.patch operator_subscript_norefleak.patch operator_subscript_norefleak_v2.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2018-06-16.02:02:00.835>
created_at = <Date 2015-06-04.06:07:04.227>
labels = ['extension-modules', 'type-feature', '3.7']
title = 'Add operator.subscript as a convenience for creating slices'
updated_at = <Date 2018-07-20.00:08:42.343>
user = 'https://github.com/llllllllll'

bugs.python.org fields:

activity = <Date 2018-07-20.00:08:42.343>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2018-06-16.02:02:00.835>
closer = 'rhettinger'
components = ['Extension Modules']
creation = <Date 2015-06-04.06:07:04.227>
creator = 'llllllllll'
dependencies = []
files = ['39614', '39615', '39775', '39792', '39902', '39904', '39906', '39907', '39913', '45471', '45473']
hgrepos = []
issue_num = 24379
keywords = ['patch']
message_count = 60.0
messages = ['244801', '244802', '244804', '244805', '244811', '244819', '244820', '244821', '245070', '245072', '245082', '245131', '245133', '245134', '245657', '245665', '245666', '245702', '246618', '246624', '246625', '246631', '246634', '246635', '246637', '246638', '246752', '247286', '247528', '247530', '248703', '248705', '248706', '248708', '248710', '253291', '253894', '253899', '253902', '254031', '256895', '272773', '272774', '272863', '280456', '280639', '280648', '280661', '280692', '280699', '280700', '280703', '280706', '280707', '280716', '280721', '319507', '319695', '321966', '321972']
nosy_count = 18.0
nosy_names = ['gvanrossum', 'rhettinger', 'mark.dickinson', 'vstinner', 'taleinat', 'ned.deily', 'Arfrever', 'steven.daprano', 'r.david.murray', 'python-dev', 'martin.panter', 'serhiy.storchaka', 'yselivanov', 'josh.r', 'llllllllll', 'levkivskyi', 'abarry', 'Stephan Hoyer']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24379'
versions = ['Python 3.7']

llllllllll · 2015-06-04T06:07:01Z

I often find that when working with pandas and numpy I want to store slice objects in variables to pass around and re-use; however, the syntax for constructing a slice literal outside of an indexer is very different from the syntax used inside of a subscript. This patch proposes the following change:

slice.literal

This would be a singleton instance of a class that looks like:

class sliceliteral(object):
    def __getitem__(self, key):
        return key

The basic idea is to provide an alternative constructor to 'slice' that uses the subscript syntax. This allows people to write more understandable code.

Consider the following examples:

reverse = slice(None, None, -1)
reverse = slice.literal[::-1]

all_rows_first_col = slice(None), slice(0)
all_rows_first_col = slice.literal[:, 0]

first_row_all_cols_but_last = slice(0), slice(None, -1)
first_row_all_cols_but_last = slice.literal[0, :-1]

Again, this is not intended to make the code shorter, instead, it is designed to make it more clear what the slice object your are constructing looks like.

Another feature of the new literal object is that it is not limited to just the creation of slice instances; instead, it is designed to mix slices and other types together. For example:

>>> slice.literal[0]
0
>>> slice.literal[0, 1]
(0, 1)
>>> slice.literal[0, 1:]
(0, slice(1, None, None)
>>> slice.literal[:, ..., ::-1]
(slice(None, None, None), Ellipsis, slice(None, None, -1)

These examples show that sometimes the subscript notation is much more clear that the non-subscript notation.
I believe that while this is trivial, it is very convinient to have on the slice type itself so that it is quickly available. This also prevents everyone from rolling their own version that is accesible in different ways (think Py_RETURN_NONE).
Another reason that chose this aproach is that it requires no change to the syntax to support.

There is a second change proposed here and that is to 'slice.__repr__'. This change makes the repr of a slice object match the new literal syntax to make it easier to read.

>>> slice.literal[:]
slice.literal[:]
>>> slice.literal[1:]
slice.literal[1:]
>>> slice.literal[1:-1]
slice.literal[1:-1]
>>> slice.literal[:-1]
slice.literal[:-1]
>>> slice.literal[::-1]
slice.literal[::-1]

This change actually affects old behaviour so I am going to upload it as a seperate patch. I understand that the change to repr much be less desirable than the addition of 'slice.literal'

llllllllll · 2015-06-04T06:07:35Z

Here is the patch that includes the updates to 'slice.__repr__'

rhettinger · 2015-06-04T06:21:01Z

FWIW, I like this idea.

serhiy-storchaka · 2015-06-04T06:39:07Z

Why not index the slice type itself? slice[1:2]

Another feature of the new literal object is that it is not limited to just the creation of slice instances; instead, it is designed to mix slices and other types together.

This looks as disadvantage.

stevendaprano · 2015-06-04T11:14:19Z

I'm with Serhiy, I don't think we need a "literal", just make slice itself indexable:

reverse = slice(None, None, -1)
reverse = slice[::-1]

The only question in my mind is what slice should do when given just a single index:

slice[0]

I suppose that should be a ValueError?

mdickinson · 2015-06-04T13:23:04Z

For prior art, it's worth taking a look at NumPy, and in particular its s_ and index_exp functions:

>>> import numpy as np
>>> np.s_[1:2]
slice(1, 2, None)
>>> np.s_[0]
0
>>> np.s_[1:2, 3]
(slice(1, 2, None), 3)

mdickinson · 2015-06-04T13:23:56Z

(Correction: they're not functions, or even callables.)

llllllllll · 2015-06-04T13:32:33Z

Why not index the slice type itself? slice[1:2]

I originally considered this and I personally really like this syntax, but I was concerned with ambiguity with the typing module

The only question in my mind is what slice should do when given just a single index

I think some of the power of this concept comes from the fact that I can express a complicated indexer without worrying about how it desugars. I would personally prefer being able to have this return tuples and scalars so that the syntax is easier to explain.

taleinat · 2015-06-09T16:00:02Z

(This should probably be discussed on the Python Ideas mailing list...)

I definitely like the idea of being able to construct slices etc. using "[]" syntax. I think this should be considered even if it is decided not to change the repr() of slices.

An important note here is that this is about more than slices:

$ python3
Python 3.4.2 (default, Feb 23 2015, 21:16:28)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def __getitem__(self, *args):
...         print(repr(args))
...
>>> a = A()
>>> a[0]
(0,)
>>> a[0:1]
(slice(0, 1, None),)
>>> a[0:1, ..., 1:2]
((slice(0, 1, None), Ellipsis, slice(1, 2, None)),)
>>> a[0:1, 2]
((slice(0, 1, None), 2),)

Indeed, Joe's suggested slice.literal supports all of this, but we can't just modify slice to handle all of these cases.

What I'm missing is a way to use such an object to actually index/slice something. The only way I can currently think of is using a.__getitem__(), but that's quite ugly IMO.

llllllllll · 2015-06-09T16:06:00Z

What I'm missing is a way to use such an object to actually index/slice something

Sorry, I am not sure I understand what you mean by this? You can pass a slice object, or a tuple of slices in subscript notation.

>>> [1, 2, 3, 4][slice(2)]
[1, 2]

Also, I am not currently subscribed to python-ideas; what is the common practice for posting new threads?

taleinat · 2015-06-09T19:17:04Z

(see https://mail.python.org/mailman/listinfo/python-ideas)

But for x = [1,2,3,4], how will x[y] work for all of the following values of y?

y = slice.literal[0]
y = slice.literal[1:2]
y = slice.literal[0:1, ..., 3]

NumPy's s_ "magic object" is a factory, returning objects of different types depending on the given input. If you want an actual slice.literal class, then you'll have to supply a way to convert it into an object usable for indexing/slicing, e.g.:

y = slice.literal[0:1]
[1,2,3,4][y.as_indexer]

llllllllll · 2015-06-10T15:24:22Z

>>> slice.literal[0]
0
>>> y = slice.literal[1:2]
slice(1, 2, None)
>>> slice.literal[0:1, ..., 3]
(slice(0, 1, None), Ellipsis, 3)

The way this object works right now does not create instances of some inner class of slice, instead, indexing it returns the key without modification.

taleinat · 2015-06-10T15:52:54Z

So, is this in any ways different than NumPy's s_?

llllllllll · 2015-06-10T15:57:31Z

It is a singleton, does not accept the maketuple flag, and is written in C. I did not know about the s_ attribute of numpy before writing this; however, I still think that moving this object to slice improves code clarity (s_ is not a super clear name). I also think that this behaviour belongs on the slice object.

llllllllll · 2015-06-22T22:18:45Z

Based on some discussion on python-ideas, this is being renamed to operator.subscript. Here is the patch that makes the correct the changes to move this object into the operator module.

serhiy-storchaka · 2015-06-23T04:15:13Z

Is C implementation needed?

llllllllll · 2015-06-23T04:20:14Z

I just moved it over since I implemented it for slice originally, I can drop the C implementation.

llllllllll · 2015-06-23T22:09:08Z

I removed the C implementation.

llllllllll · 2015-07-11T23:51:01Z

ping: is there anything blocking this?

vadmium · 2015-07-12T01:31:34Z

Joe: Have you seen the comments on the code review? See the Review link at the top of the bug thread, or maybe check your spam.

llllllllll · 2015-07-12T01:55:39Z

Ah, I hadn't seen that, I will address those now, sorry about that.

llllllllll · 2015-07-12T04:04:30Z

updating with the slots change

This also adds a ton of test cases

serhiy-storchaka · 2015-07-12T05:30:43Z

The implementation itself LGTM (nice use of decorator). New feature should be documented. Needed changes in Doc/library/operator.rst and Doc/whatsnew/3.6.rst and the docstring for the subscript class. 'subscript' should be added to __all__.

llllllllll · 2015-07-12T06:24:14Z

updating to address the docs and order of assertions

serhiy-storchaka · 2015-07-12T07:09:24Z

I'm with Martin about tests. Only one assertion per special case is needed, no need to write exponential number of combinations.

vadmium · 2015-11-02T09:58:49Z

On the mailing list, Antoine also suggested removing __slots__ completely as another workaround. I agree that it doesn’t seem important for a singleton, though in the review comments a couple people seemed to want __slots__.

MojoVampire · 2015-11-03T23:52:29Z

Martin: I think the primary reason I suggested an empty __slots__ was to avoid people doing dumb stuff like attaching attributes to the singleton, and then getting annoyed if support for them was removed in some future release (e.g. if this ever moved to C for whatever reason), or expecting them to survive pickling, what have you. The memory savings (if any) would be negligible, it was just to avoid future compatibility concerns.

Serhiy: If we add a stub slot, presumably it should be __stub, not __stub__, the former getting the private member name mangling protection while the latter would not (and the latter might give the impression it was some sort of special method/attribute when it was not).

llllllllll · 2015-12-23T02:44:57Z

and the latter might give the impression it was some sort of special method/attribute when it was not.

Wouldn't adding this be special because it is specifically reserved by CPython as an implementation detail?

Also, would adding this name (stub or __stub) be sufficient to get this patch merged again. I would not mind investigating the gc_head issue after but I view that as a separate issue which is only exercised by this patch.

serhiy-storchaka · 2016-08-15T15:56:20Z

Raymond, adding a stub element to __slots__ fixes a leak. Is this enough to push the patch again?

We should open a separate issue for leaks in objects with empty __slots__.

ilevkivskyi · 2016-08-15T16:04:31Z

It looks like namedtuple suffers the same issue with empty __slots__:

test_collections leaked [0, 0, 2, 0] references, sum=2

rhettinger · 2016-08-16T16:41:44Z

We should open a separate issue for leaks in objects with empty __slots__

The leak should be fixed first rather than introducing stub fields hack.

MojoVampire · 2016-11-09T21:30:51Z

bpo-28651 opened for the general problem with empty __slots__ and gc failures.

rhettinger · 2016-11-12T08:23:45Z

When I first looked at this a year ago, I thought it seemed like a reasonable idea and might be useful. Since that time, I've haven't encountered any situations whether this would have improved my or someone else's code. And now, I can't even dream up any scenarios where operator.subscript[3:7:2] would be better than slice(3, 7, 2). Accordingly, I think we should re-evaluate whether there is any real benefit to be had by adding this to the standard library. Perhaps, we're better off without it.

ilevkivskyi · 2016-11-12T09:46:43Z

I think it would be more useful for multidimensional slicing:

>>> subscript[::-1, ..., ::-1]
(slice(None, None, -1), Ellipsis, slice(None, None, -1))

stevendaprano · 2016-11-12T13:48:13Z

On Sat, Nov 12, 2016 at 08:23:45AM +0000, Raymond Hettinger wrote:

I can't even dream up any scenarios where
operator.subscript[3:7:2] would be better than
slice(3, 7, 2).

For that specific example, I completely agree. But I think that in
general slice[::-1] is better than either the current alternative
slice(None, None, -1) or the proposed operator.subscript[::-1].

And there's no comparison once you get to multi-dimensional slices:

    s = slice[::-1, 1:, 1::2]
    spam[s] + eggs[s]

versus:

    s = slice(slice(None, None, -1), slice(1, None), slice(1, None, 2))
    spam[s] + eggs[s]

Even simple examples look nice in slice syntax:

slice[3:7:2]

I had completely forgotten about this feature request, but
coincidentally re-suggested it on the Python-Ideas mailing
list earlier today:

https://mail.python.org/pipermail/python-ideas/2016-November/043630.html

The downside is that beginners who are still learning Python's
syntax may try writing:

    slice(::-1)

by mistake. Nevertheless, it seems to me that the advantage to more
experienced developers to be able to read and write slice objects using
slice format outweighs the temporary confusion to beginners. (I would
expect that outside of the Numpy community, few beginners use the
slice() object directly, so this would not often affect them.)

ilevkivskyi · 2016-11-13T11:23:25Z

Here is a new patch that does not cause refleaks, I just replaced empty __slots__ with a __setattr__: it looks like __slots__ are not needed here to save memory (there is only a singleton instance), they were used just to create an immutable object.

gvanrossum · 2016-11-13T15:53:54Z

So we now have the refleak fixed. Can we apply the patch? Or is it too late for 3.6?

ilevkivskyi · 2016-11-13T15:59:09Z

The patch is small and it was initially applied before 3.6b1, I think we should ask Ned (added to nosy) if this is OK to apply the fixed version?

ilevkivskyi · 2016-11-13T17:02:58Z

Serhiy, thank you for review! Here is a corrected patch.

serhiy-storchaka · 2016-11-13T17:25:06Z

operator_subscript_norefleak_v2.patch LGTM. Waiting for Ned's decision about 3.6.

ned-deily · 2016-11-13T18:40:28Z

I don't think this is appropriate for 3.6 now. We're a little more than 4 weeks from the final release - way past the feature code cut-off - and, from the comments here, there is no longer a total consensus that this feature should go back in at all after being pulled and largely forgotten about for a year.

ilevkivskyi · 2016-11-13T20:58:19Z

Maybe then it makes sense to apply this to 3.7? It looks like most people are in favour of this (also on python-ideas), only Raymond is not sure/mildly against.

gvanrossum · 2016-11-13T22:39:33Z

Actually I'm with Raymond -- I'm also not sure or mildly against (-0).
Given how easy it is to do without and how cumbersome using the operator
module is I don't see a great benefit. (IMO the operator module should be
the measure of *last* resort -- it is not to become part of anybody's
toolkit for common operators. But I seem to be a minority opinion here.)

taleinat · 2018-06-14T10:36:57Z

So 3.8 then, or should this be closed?

FWIW I'm -1. IMO this should be a code recipe somewhere, no need for it to be in the stdlib.

gvanrossum · 2018-06-16T00:26:40Z

Let's close it. Just because someone spent a lot of effort on a patch we don't have to accept it.

shoyer · 2018-07-19T20:54:54Z

Raymond, Tal and Guido -- do any of you work routinely with multi-dimensional arrays?

In my experience as someone who uses Python everyday for numerical computing (and has been doing so for many years), the need for an operator like this comes up with some regularity. With multi-dimensional indexing, this allows lets you cut down quite a bit on boilerplate, e.g., compare

subscript[:, 0, ::-1] vs (slice(None), 0, slice(None, None, -1))

It's absolutely true that subscript is an easy four line recipe, and indeed a form of this recipe is already included in both NumPy and pandas. But I think this is a case where the lack of a common idiom is actively harmful. I do multi-dimensional indexing in using at least four different libraries (pandas, xarray, numpy and tensorflow), and it feels wrong to use a utility from numpy/pandas for other projects. I could write my own version of operator.subscript in each project (and yes, I've done so before), but for any individual use case it's less hassle to write things out the long way.

In practice, the preferred way to write such long expressions seems to be to redundantly repeat indexing operations, e.g.,

x[:, 0, ::-1]
y[:, 0, ::-1]

rather than

  index = subscript[:, 0, ::-1]
  x[index]
  y[index]

This is definitely non-ideal from a readability perspective. It's no longer immediately clear that these arrays are being indexed in the same way, and any changes would need to be applied twice.

vstinner · 2018-07-20T00:08:41Z

Raymond, Tal and Guido -- do any of you work routinely with multi-dimensional arrays?

Hi Stephan Hoyer,

IMHO a *closed* issue is not the most appropriate place to request Guido to change his mind. You may get more traction on python-ideas where you may find more supporters who need the operator.

The question is not only if the operator makes sense (obviously, it does, for you), but if it "belongs" to the standard library. Is it popular enough? Is it consistent with other functions of the operator module? etc.

llllllllll mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jun 4, 2015

llllllllll mannequin added extension-modules C modules in the Modules dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 22, 2015

llllllllll mannequin changed the title ~~slice.literal notation~~ operator.subscript Jun 22, 2015

Vgr255 mannequin changed the title ~~operator.subscript~~ Add operator.subscript as a convenience for creating slices Dec 23, 2015

serhiy-storchaka added the 3.7 (EOL) end of life label Nov 12, 2016

serhiy-storchaka added the release-blocker label Nov 13, 2016

serhiy-storchaka assigned ned-deily Nov 13, 2016

ned-deily removed the release-blocker label Nov 13, 2016

ned-deily removed their assignment Nov 13, 2016

rhettinger closed this as completed Jun 16, 2018

ezio-melotti transferred this issue from another repository Apr 10, 2022

Add operator.subscript as a convenience for creating slices #68567

Add operator.subscript as a convenience for creating slices #68567

Comments

llllllllll mannequin commented Jun 4, 2015

llllllllll mannequin commented Jun 4, 2015

llllllllll mannequin commented Jun 4, 2015

rhettinger commented Jun 4, 2015

serhiy-storchaka commented Jun 4, 2015

stevendaprano commented Jun 4, 2015

mdickinson commented Jun 4, 2015

mdickinson commented Jun 4, 2015

llllllllll mannequin commented Jun 4, 2015

taleinat commented Jun 9, 2015

llllllllll mannequin commented Jun 9, 2015

taleinat commented Jun 9, 2015

llllllllll mannequin commented Jun 10, 2015

taleinat commented Jun 10, 2015

llllllllll mannequin commented Jun 10, 2015

llllllllll mannequin commented Jun 22, 2015

serhiy-storchaka commented Jun 23, 2015

llllllllll mannequin commented Jun 23, 2015

llllllllll mannequin commented Jun 23, 2015

llllllllll mannequin commented Jul 11, 2015

vadmium commented Jul 12, 2015

llllllllll mannequin commented Jul 12, 2015

llllllllll mannequin commented Jul 12, 2015

serhiy-storchaka commented Jul 12, 2015

llllllllll mannequin commented Jul 12, 2015

serhiy-storchaka commented Jul 12, 2015

vadmium commented Nov 2, 2015

MojoVampire mannequin commented Nov 3, 2015

llllllllll mannequin commented Dec 23, 2015

serhiy-storchaka commented Aug 15, 2016

ilevkivskyi commented Aug 15, 2016

rhettinger commented Aug 16, 2016

MojoVampire mannequin commented Nov 9, 2016

rhettinger commented Nov 12, 2016

ilevkivskyi commented Nov 12, 2016 • edited by AlexWaygood

stevendaprano commented Nov 12, 2016

ilevkivskyi commented Nov 13, 2016

gvanrossum commented Nov 13, 2016

ilevkivskyi commented Nov 13, 2016

ilevkivskyi commented Nov 13, 2016

serhiy-storchaka commented Nov 13, 2016

ned-deily commented Nov 13, 2016

ilevkivskyi commented Nov 13, 2016

gvanrossum commented Nov 13, 2016

taleinat commented Jun 14, 2018

gvanrossum commented Jun 16, 2018

shoyer mannequin commented Jul 19, 2018

vstinner commented Jul 20, 2018

ilevkivskyi commented Nov 12, 2016 •

edited by AlexWaygood