classification
Title: Can't read a F-contiguous memoryview in physical order
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: pitrou, skrah
Priority: normal Keywords: patch

Created on 2019-01-28 20:51 by pitrou, last changed 2019-02-02 17:57 by skrah.

Pull Requests
URL Status Linked Edit
PR 11730 merged skrah, 2019-02-02 00:26
Messages (8)
msg334491 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-28 20:51
This request is motivated in detail here:
https://github.com/python/peps/pull/883#issuecomment-458290745

In short: in C, when you have a Py_buffer, you can directly read the memory in whatever order you want (including physical order).  This is not possible in pure Python, though.  Somewhat unintuitively, memoryview.tobytes() as well as bytes(memoryview) read bytes in *logical* order, even though they flatten the dimensions and don't keep the original type.  Logical order is different from physical order for Fortran-contiguous arrays.

One possible way of alleviating this would be to offer a memoryview.transpose() method, similar to the Numpy transpose() method (see https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html).

One could also imagine a memoryview.to_c_contiguous() method.

Or even: a memoryview.raw_memory() method, that would 1) flatten dimensions 2) cast to 'B' format 3) keep physical order.
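
To make the logical/physical distinction concrete, here is a minimal pure-Python sketch. It does not use memoryview at all; `logical_bytes` is a hypothetical helper written for illustration, and the offset formula assumes a 2-D, one-byte-per-item, column-major (Fortran-contiguous) layout.

```python
def logical_bytes(physical: bytes, rows: int, cols: int) -> bytes:
    """Read a column-major (Fortran-ordered) byte buffer in logical order.

    Hypothetical helper for illustration only: element [i][j] is assumed
    to live at offset i + j * rows in the flat buffer.
    """
    return bytes(
        physical[i + j * rows]
        for i in range(rows)
        for j in range(cols)
    )


# The 2x3 array [[0, 1, 2], [3, 4, 5]], stored column by column.
physical = bytes([0, 3, 1, 4, 2, 5])

# Logical order walks the rows, so it differs from the stored order.
logical = logical_bytes(physical, rows=2, cols=3)
```

Here `physical` is what a raw dump of the memory would contain, while `logical` is what a row-by-row read of the same array produces.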
msg334495 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-28 22:43
Yes, it's modeled after NumPy's tobytes():

>>> x = np.array(list(range(6)), dtype="int8").reshape(2,3)
>>> x.tobytes()
b'\x00\x01\x02\x03\x04\x05'
>>> x.T.tobytes()
b'\x00\x03\x01\x04\x02\x05'
>>> memoryview(x).tobytes()
b'\x00\x01\x02\x03\x04\x05'
>>> memoryview(x.T).tobytes()
b'\x00\x03\x01\x04\x02\x05'


I guess the reason is that without a type it's easier to serialize the logical array by default, so you can always assume C when you read back.



NumPy also has an 'F' parameter though that flips the order:

>>> x.tobytes('F')
b'\x00\x03\x01\x04\x02\x05'

It would be possible to add this to memoryview as well.
msg334496 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-28 22:46
raw_bytes() is also possible of course. I assume it would do nothing and just dump the memory.

Or tobytes('F') AND tobytes('raw').
msg334497 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-28 22:52
Well, raw_memory() would avoid a copy, which is useful.

As for tobytes(), if we want to follow NumPy, we can have 'F' mean if F-contiguous, 'C' otherwise:

>>> a = np.arange(12, dtype='int8').reshape((3,4))
>>> a.tobytes('A')
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b'
>>> a.tobytes('A') == a.T.tobytes('A')
True
msg334498 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-28 22:53
Sorry, my fingers slipped.  Let me try again:

As for tobytes(), if we want to follow NumPy, we can have 'A' mean 'F' if F-contiguous, 'C' otherwise: [...]
msg334739 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-02-02 09:59
Yes, following NumPy looks like the sanest option for tobytes(), so I
went ahead and implemented that signature.

memory.raw() is of course complicated by the fact that things like
m[::-1] move buf.ptr to the end of the buffer.

So we'd need to restrict to contiguous views anyway, which makes
the method less appealing (IOW, it doesn't offer more than an
augmented memoryview.cast()).
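
The contiguity restriction mentioned above can be observed with existing memoryview behaviour. The following is a small sketch using only the standard library; the variable names are illustrative, and it shows today's cast() restriction rather than the proposed raw-memory method.

```python
m = memoryview(b'abcdef')
r = m[::-1]  # negative stride: the view starts at the end of the buffer

# A reversed view is neither C- nor F-contiguous.
is_contiguous = r.contiguous

# tobytes() still works and yields the elements in logical order.
logical = r.tobytes()

# cast() is restricted to C-contiguous views, so it rejects `r`.
try:
    r.cast('B')
    cast_error = None
except TypeError as exc:
    cast_error = exc
```

A raw-memory accessor would presumably have to reject such views in the same way, which is why it would add little over cast().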
msg334743 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-02 10:31
> So we'd need to restrict to contiguous views anyway, which makes
> the method less appealing (IOW, it doesn't offer more than an
> augmented memoryview.cast()).

Yes, it would probably be a simpler way of writing `.cast('B', shape=(...), order='A')`.
msg334759 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-02-02 17:57
New changeset d08ea70464cb8a1f86134dcb4a5c2eac1a02bf1a by Stefan Krah in branch 'master':
bpo-35845: Add order={'C', 'F', 'A'} parameter to memoryview.tobytes(). (#11730)
https://github.com/python/cpython/commit/d08ea70464cb8a1f86134dcb4a5c2eac1a02bf1a
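
As a brief illustration of the merged parameter, the sketch below assumes Python 3.8 or later and uses only the standard library. The 2x3 view is built with memoryview.cast(), which always yields a C-contiguous view, so this example cannot show an F-contiguous input; it only shows how the three `order` values treat a C-contiguous one.

```python
# Requires Python 3.8+, where memoryview.tobytes() accepts `order`.
data = bytes(range(6))

# View the six bytes as the 2x3 array [[0, 1, 2], [3, 4, 5]].
m = memoryview(data).cast('B', shape=[2, 3])

c_order = m.tobytes(order='C')  # row-major copy
f_order = m.tobytes(order='F')  # converted to column-major
a_order = m.tobytes(order='A')  # physical order for contiguous views
```

For this C-contiguous view, 'A' and 'C' give the same bytes; for an F-contiguous exporter such as a transposed NumPy array, 'A' would instead preserve the in-memory Fortran order.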
History
Date User Action Args
2019-02-02 17:57:43 skrah set messages: + msg334759
2019-02-02 10:31:35 pitrou set keywords: patch, patch, patch; messages: + msg334743
2019-02-02 09:59:25 skrah set keywords: patch, patch, patch; messages: + msg334739
2019-02-02 00:26:35 skrah set keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request11622
2019-02-02 00:26:30 skrah set keywords: + patch; stage: needs patch -> needs patch; pull_requests: + pull_request11621
2019-02-02 00:26:24 skrah set keywords: + patch; stage: needs patch -> needs patch; pull_requests: + pull_request11620
2019-01-28 22:53:37 pitrou set messages: + msg334498
2019-01-28 22:52:32 pitrou set messages: + msg334497
2019-01-28 22:46:52 skrah set messages: + msg334496
2019-01-28 22:43:30 skrah set messages: + msg334495
2019-01-28 20:51:37 pitrou create