
classification
Title: memoryview: add multi-dimensional indexing and slicing
Type: enhancement Stage: needs patch
Components: Interpreter Core Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: DLowell, belopolsky, ncoghlan, pitrou, pv, skrah, teoliphant, undercoveridiot
Priority: normal Keywords:

Created on 2012-02-26 12:20 by skrah, last changed 2022-04-11 14:57 by admin.

Messages (9)
msg154336 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-02-26 12:23
The PEP-3118 authors originally planned to have support for multi-dimensional indexing and slicing in memoryview.

Since memoryview now already has the capabilities of multi-dimensional
list representations and comparisons, this would be a nice addition
to the feature set.
msg194072 - (view) Author: (DLowell) Date: 2013-08-01 13:30
Is this issue still being worked on?
msg196523 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-08-30 12:00
I would probably work on it (it's basically implemented in _testbuffer.c),
but I'm not sure if the NumPy community will actually use the feature.
msg210583 - (view) Author: Ian Beaver (undercoveridiot) Date: 2014-02-08 01:45
If there is any way to get this implemented, it is needed. For one, the memoryview docs make no mention that indexing and slicing don't work with multi-dimensional data, which led me to believe it was supported until I tried using it. A second reason is that this currently represents a loss of functionality compared with the buffer type in Python 2. When porting code that uses the buffer type from Python 2 to Python 3, you get a very unhelpful "NotImplementedError" with no description when trying to slice a memoryview. There is no workaround other than calling tobytes() and copying the data in memory to an object that supports slicing, but for very large objects this defeats the primary purpose of using buffers in the first place, which is to avoid memory copies.
msg210597 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-02-08 09:28
memoryview supports slicing - it just doesn't support NumPy style *multi-dimensional* slicing (and buffer doesn't support that either).
msg210598 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-02-08 09:29
(However, if you're on Python 3.2, then you'll likely need to upgrade to Python 3.3 - memoryview *does* have a lot of additional limitations in Python 3.2)
msg210660 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-02-08 15:23
Ian, could you please provide an example where multi-dimensional
indexing and slicing works in 2.x but not in 3.3?
msg210883 - (view) Author: Ian Beaver (undercoveridiot) Date: 2014-02-10 22:21
It's not multi-dimensional slicing to get a subset of objects as in NumPy, but rather the ability to slice a buffer containing a multi-dimensional array as raw bytes. Buffer objects in Python 2.7 are dimensionality-naive, so it works fine. You were correct that I was testing against Python 3.2; in Python 3.3, slicing with ndim > 1 works, but only for reading from the buffer. I still can't write back into a memoryview object with ndim > 1 in Python 3.3.

Python 2.7.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> type(arr.data)
<type 'buffer'>
>>> arr.data[0:10]
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> 

Python 3.2.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> type(arr.data)
<class 'memoryview'>
>>> arr.data[0:10]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError
>>> 

Python 3.3.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> type(arr.data)
<class 'memoryview'>
>>> arr.data[0:10]
<memory at 0x7faaf1d03a48>
>>> 


However, to write data back into a buffer:

Python 2.7.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> arr.data[0:10] = '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> 

Python 3.2.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> arr.data[0:10] = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError
>>> 

Python 3.3.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> arr.data[0:10] = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview assignments are currently restricted to ndim = 1
>>> 


Also, the slice in Python 3.3 is not the same as returning a chunk of raw bytes from the memory buffer. Instead of yielding a bytes object, the indexing behaves like NumPy array indexing, and you get the (sub)array items back as Python objects.

Python 2.7.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> arr.data[0:10]
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> len(bytes(arr.data[0:10]))
10

Python 3.3.3:
>>> import numpy as np
>>> arr = np.zeros(shape=(100,100))
>>> arr.data[0:10]
<memory at 0x7f109a71ea48>
>>> len(bytes(arr.data[0:10]))
8000
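
The 10-versus-8000 difference can be reproduced without NumPy. The following is a minimal sketch, assuming a bytearray stands in for the array's storage (the names `storage`, `view2d`, `row_slice` and `byte_slice` are hypothetical): slicing a 2-D memoryview selects rows along the first dimension, while slicing a cast('B') view selects raw bytes.

```python
# Hypothetical stand-in for np.zeros(shape=(100, 100)): 100 x 100 doubles.
rows, cols = 100, 100
storage = bytearray(rows * cols * 8)
view2d = memoryview(storage).cast('d', shape=[rows, cols])

# Slicing the 2-D view takes the first 10 *rows* (10 * 100 * 8 bytes).
row_slice = view2d[0:10]

# Slicing a 1-D unsigned-byte view takes the first 10 *bytes*.
byte_slice = view2d.cast('B')[0:10]

print(row_slice.shape, len(bytes(row_slice)))
print(byte_slice.shape, len(bytes(byte_slice)))
```

Neither slice copies the underlying data; only the bytes() calls do.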

This is not a big deal in my case: since I already have NumPy arrays, I can just use bytes(arr.flat[start:end]) to scan through the array contents as byte chunks. With only a memoryview object, though, that would not be possible the way it was with the Python 2 buffer object, short of converting it to something else or dropping to ctypes, iterating over the memory addresses, and dereferencing the contents.

So Python 3.3 is halfway to the functionality in Python 2.7: I can send chunks of the data through a compressed or encrypted stream, but I can't rebuild the data on the other side without first creating a bytearray and eating the cost of a copy into a memoryview. All I really need is a way to reconstruct the original memoryview buffer in memory from a stream of bytes, without having to make a temporary object first and then copy its contents into the final memoryview object when it is complete.
msg210918 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2014-02-11 10:50
Thanks, Ian. It seems to me that these issues should be sorted out
on the NumPy lists:

memoryview is not a drop-in replacement for buffer, so it has
different semantics.

What might help you is that you can cast any memoryview to
simple bytes without making a copy:

memoryview.cast('B')
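
As an illustration of the cast suggestion, here is a minimal sketch of filling a multi-dimensional buffer in place from a byte stream. It assumes a bytearray as a stand-in for the array's storage and an io.BytesIO as a stand-in for the incoming stream; the names `storage`, `view2d`, `flat`, `payload` and `chunk_size` are hypothetical.

```python
import io
import struct

# Hypothetical stand-in for a 2-D array's storage: 4 x 3 doubles, zeroed.
rows, cols = 4, 3
storage = bytearray(rows * cols * 8)
view2d = memoryview(storage).cast('d', shape=[rows, cols])

# A 1-D unsigned-byte view over the same memory; no data is copied.
flat = view2d.cast('B')

# Hypothetical incoming byte stream carrying the array contents.
payload = struct.pack('12d', *range(12))
stream = io.BytesIO(payload)

# Fill the buffer in place, chunk by chunk, via readinto().
chunk_size = 32
offset = 0
while offset < flat.nbytes:
    n = stream.readinto(flat[offset:offset + chunk_size])
    if not n:
        break  # stream ended before the buffer was full
    offset += n

# The 2-D view observes the written values without any intermediate copy.
restored = view2d.tolist()
print(restored[1])
```

Because `flat` is one-dimensional, slice assignment such as `flat[0:8] = struct.pack('d', 1.5)` is also permitted, which sidesteps the "restricted to ndim = 1" error on the multi-dimensional view.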
History
Date                 User             Action  Args
2022-04-11 14:57:27  admin            set     github: 58338
2014-10-14 17:31:14  skrah            set     assignee: skrah ->
2014-02-11 10:50:19  skrah            set     messages: + msg210918
2014-02-10 22:21:11  undercoveridiot  set     messages: + msg210883
2014-02-08 15:23:34  skrah            set     messages: + msg210660
2014-02-08 09:29:53  ncoghlan         set     messages: + msg210598
2014-02-08 09:28:10  ncoghlan         set     messages: + msg210597
2014-02-08 01:45:10  undercoveridiot  set     nosy: + undercoveridiot; messages: + msg210583
2013-08-30 12:00:28  skrah            set     messages: + msg196523
2013-08-01 13:30:02  DLowell          set     nosy: + DLowell; messages: + msg194072
2012-08-21 03:11:42  belopolsky       set     nosy: + belopolsky
2012-02-26 12:23:58  skrah            set     components: + Interpreter Core; versions: + Python 3.3
2012-02-26 12:23:26  skrah            set     messages: + msg154336
2012-02-26 12:20:55  skrah            create