Title: Memoryview for column-major (f_contiguous) arrays from bytes impossible to achieve
Type: enhancement Stage:
Components: Extension Modules, Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.4, Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: skrah Nosy List: lgautier, mattip, ncoghlan, pitrou, skrah
Priority: normal Keywords:

Created on 2018-09-23 17:51 by lgautier, last changed 2019-08-31 23:44 by lgautier.

Messages (15)
msg326167 - (view) Author: Laurent Gautier (lgautier) Date: 2018-09-23 17:51
The buffer protocol accounts for both row-major and column-major arrays, and a memoryview object exposes that information through its `c_contiguous` and `f_contiguous` attributes, respectively.

The `cast` method allows one to specify a shape, but not whether the
layout is row-major or column-major:

# column-major 3x2 array of bytes that was serialized
b = bytearray([1,2,3,4,5,6])

mv = memoryview(b)
mv_b = mv.cast('b', shape=(3,2))

The resulting object is assumed to be row-major, and little can be done
to correct it:

>>> mv_b.c_contiguous
True
>>> mv_b.c_contiguous = False
AttributeError: attribute 'c_contiguous' of 'memoryview' objects is not writable
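
As a minimal sketch of the mismatch described in this report (the names `row_major` and `intended` are illustrative, not from the report): the cast view reads the six bytes row by row, whereas a column-major 3x2 array stores element (i, j) at offset i + j * 3.

```python
# Column-major 3x2 array of bytes that was serialized.
b = bytearray([1, 2, 3, 4, 5, 6])

mv_b = memoryview(b).cast('b', shape=(3, 2))

# cast() yields a C-contiguous (row-major) view of the buffer.
row_major = mv_b.tolist()

# What a column-major reading of the same bytes would give:
# element (i, j) of a 3x2 array lives at offset i + j * 3.
intended = [[b[i + j * 3] for j in range(2)] for i in range(3)]

print(mv_b.c_contiguous, mv_b.f_contiguous)
print(row_major)
print(intended)
```

The list comprehension copies the data; it only shows which values the requested feature would be expected to expose without a copy.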
msg326276 - (view) Author: mattip (mattip) * Date: 2018-09-24 17:53
This could be done via a `shape` kwarg to `cast`
msg326660 - (view) Author: Laurent Gautier (lgautier) Date: 2018-09-28 23:23
@mattip: do you mean that this can currently be achieved by calling `cast` with a specific shape parameter? If that is the case, how so?
msg326684 - (view) Author: mattip (mattip) * Date: 2018-09-29 16:30
Sorry, I meant a "strides" keyword. "shape" is already a valid keyword.
msg326689 - (view) Author: Laurent Gautier (lgautier) Date: 2018-09-29 19:13
Wouldn't a contiguity argument ('C' or 'F') be simpler?
(Independently, a "strides" argument is likely also missing from "cast".)

Do you know what the next possible steps are here? Bring this to the python-dev list? Submit a patch?
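
To make the two proposals concrete, here is a hedged sketch of how a contiguity flag maps onto strides for a given shape. The helper `contiguous_strides` is hypothetical and not part of `memoryview`; it only illustrates why a 'C'/'F' argument carries the same information as explicit strides in the contiguous case.

```python
def contiguous_strides(shape, itemsize, order='C'):
    """Return byte strides for a contiguous array (hypothetical helper)."""
    if order not in ('C', 'F'):
        raise ValueError("order must be 'C' or 'F'")
    # In 'F' order the first dimension varies fastest, in 'C' order the last.
    dims = list(shape) if order == 'F' else list(reversed(shape))
    strides = []
    step = itemsize
    for dim in dims:
        strides.append(step)
        step *= dim
    if order == 'C':
        strides.reverse()
    return tuple(strides)


mv = memoryview(bytearray(6)).cast('b', shape=(3, 2))
c_strides = contiguous_strides((3, 2), 1, 'C')
f_strides = contiguous_strides((3, 2), 1, 'F')

print(mv.strides, c_strides, f_strides)
```

The strides reported by the cast view match the 'C' result, which is the layout `cast` currently always produces.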
msg332743 - (view) Author: Laurent Gautier (lgautier) Date: 2018-12-30 04:25

What are the next steps here?
msg332751 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2018-12-30 10:39
memoryview.cast() was originally meant to be a faster version of tobytes(), which always converts to C-contiguous.

The 'shape' keyword was added because it is odd if you can cast from ND-C to 1D-Bytes but not back.

I'm not sure if we should introduce that feature, just pointing out that the original decision to exclude non 'C' views was deliberate.
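
A small sketch of the behaviour described in this message, using only a strided 1D slice as the non-contiguous view (variable names are illustrative): `tobytes()` copies the logical elements into C-contiguous bytes, `cast()` rejects the non-contiguous view, and the 'shape' keyword allows the ND-to-1D cast to be reversed.

```python
data = bytearray(b'abcdef')
mv = memoryview(data)

# A strided slice is a non-contiguous view of the same memory.
every_other = mv[::2]

# tobytes() copies the logical elements into a new C-contiguous bytes object.
packed = every_other.tobytes()

# cast() refuses a non-contiguous view instead of copying.
try:
    every_other.cast('b')
    cast_error = None
except TypeError as exc:
    cast_error = str(exc)

# Round trip enabled by the 'shape' keyword: 1D bytes -> ND -> 1D bytes.
nd = mv.cast('B', shape=[2, 3])
flat = nd.cast('B')

print(packed, cast_error)
```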
msg332766 - (view) Author: Laurent Gautier (lgautier) Date: 2018-12-30 21:57
Wait. Isn't a `memoryview` merely a Python object for a buffer interface, whatever its valid attributes or flags might be?

The perceived oddness that led to the addition of the keyword 'shape' was a good initial instinct that something was off, but this is an incomplete workaround.

If the rationale was to follow what `tobytes` is doing, this delegates the justification for excluding non-'C' views to it. Then I do not understand the rationale behind `memoryview.tobytes`'s exclusive relationship to C-contiguous arrays. A memoryview is a window on a memory region (a Python buffer), and one would expect `tobytes` to just return the bytes for it (in whatever byte layout and strides the memoryview originally has).
msg332805 - (view) Author: mattip (mattip) * Date: 2018-12-31 07:51
> the original decision to exclude non 'C' views was deliberate

Seems this is reflected in the code:

import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])
mv = memoryview(a.T)
mv.f_contiguous
# True
mv.cast('i', (3, 2))
# TypeError: memoryview: casts are restricted to C-contiguous views

Is there any interest in revisiting that discussion? It seems the buffer protocol could allow more flexibility wrt strides and contiguous flags. Do you have a link to the discussion where this was rejected?
msg334519 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-29 13:57
CC Antoine and Nick.

I think we can do it, but we'd need cast(shape=[2,3], order='F')
to allow casting back.

The only practical objection is feature creep. To preserve
symmetry with tobytes(), we'd need to add tobytes('F') (and
msg334520 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-29 14:04
I think feature creep is ok if it stems from user needs.

Slightly related, but simpler, is
msg334622 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-01-31 11:27
+1 for Antoine's comment - while our approach with memoryview has generally been "If you're doing serious work with n-dimensional data, you still need NumPy (or an equivalent)", we've also been open to borrowing more NumPy behaviours for memoryview as particular pain points arise, and I think that applies here.
msg334742 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-02-02 10:23
It seems reasonable to support f-contiguous for cast() and tobytes().
For tobytes() it's implemented in the issue that Antoine linked to.

General support for strides in cast(), i.e. a zero-copy view for
non-contiguous arrays does not seem possible because buf.ptr is
moved around. Even NumPy does not support that:

>>> x = np.array([1,2,3])
>>> x.view('B')
array([1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
       0, 0], dtype=uint8)
>>> x[::-1].view('B')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: To change to a dtype of a different size, the array must be C-contiguous
>>> y = x.astype('B')
>>> y.flags['OWNDATA'] # It's a copy.
True
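
A hedged sketch of what f-contiguous support in `tobytes()` makes possible, assuming Python 3.8 or later, where `memoryview.tobytes` accepts an `order` argument. It relies on the observation that the column-major bytes of a 3x2 array are the row-major bytes of its 2x3 transpose. This produces a copy and is not the zero-copy `cast()` discussed in this issue.

```python
# Column-major 3x2 array of bytes that was serialized.
b = bytearray([1, 2, 3, 4, 5, 6])

# Read row-major, the same bytes form the 2x3 transpose of that array.
transposed = memoryview(b).cast('B', shape=[2, 3])

# Assumption: Python 3.8+, where tobytes() accepts order='F' and copies the
# data out in Fortran order. For the transpose, that is the row-major
# serialization of the original 3x2 array.
row_major_bytes = transposed.tobytes(order='F')

# A C-contiguous view over the copied bytes, now with the intended shape.
array_3x2 = memoryview(row_major_bytes).cast('B', shape=[3, 2])
print(array_3x2.tolist())
```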
msg334783 - (view) Author: Laurent Gautier (lgautier) Date: 2019-02-03 03:28
> General support for strides in cast(), i.e. a zero-copy view for
> non-contiguous arrays does not seem possible because buf.ptr is
> moved around. Even NumPy does not support that:

I'd be happy enough with a zero-copy `cast()` of f-contiguous arrays
along with the parameter `shape`, as this makes interfacing with
originally-in-FORTRAN C libraries through ctypes or cffi possible
(without having to write a C extension).
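
Until such a `cast()` exists, one pure-Python stopgap is to index the flat buffer with column-major arithmetic. The class below is a hypothetical sketch, not an existing API: it avoids copying the data, but it does not export the buffer protocol, so it cannot stand in for an f-contiguous memoryview.

```python
class ColumnMajorView:
    """Hypothetical helper: index a flat byte buffer as a column-major 2D array."""

    def __init__(self, buffer, shape):
        self._flat = memoryview(buffer).cast('B')
        self.nrows, self.ncols = shape
        if self.nrows * self.ncols != len(self._flat):
            raise ValueError("shape does not match buffer size")

    def _offset(self, index):
        i, j = index
        if not (0 <= i < self.nrows and 0 <= j < self.ncols):
            raise IndexError("index out of range")
        # Column-major layout: rows vary fastest.
        return i + j * self.nrows

    def __getitem__(self, index):
        return self._flat[self._offset(index)]

    def __setitem__(self, index, value):
        self._flat[self._offset(index)] = value

    def tolist(self):
        return [[self[i, j] for j in range(self.ncols)]
                for i in range(self.nrows)]


b = bytearray([1, 2, 3, 4, 5, 6])
view = ColumnMajorView(b, (3, 2))
view[2, 1] = 60  # writes through to the underlying bytearray
print(view.tolist())
```

Because the wrapper shares memory with `b`, the bytearray itself can still be handed to a ctypes or cffi call that expects column-major data.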
msg350933 - (view) Author: Laurent Gautier (lgautier) Date: 2019-08-31 23:44

Is there anything I could do to move this forward (as in write and submit a patch for review)?
Date                 User      Action  Args
2019-08-31 23:44:06  lgautier  set     messages: + msg350933
2019-02-05 11:26:45  skrah     set     assignee: skrah
2019-02-03 03:28:45  lgautier  set     messages: + msg334783
2019-02-02 10:23:55  skrah     set     messages: + msg334742
2019-01-31 11:27:39  ncoghlan  set     messages: + msg334622
2019-01-29 14:04:09  pitrou    set     messages: + msg334520
2019-01-29 13:57:32  skrah     set     nosy: + ncoghlan, pitrou
                                       messages: + msg334519
2018-12-31 07:51:00  mattip    set     messages: + msg332805
2018-12-30 21:57:41  lgautier  set     messages: + msg332766
2018-12-30 10:39:27  skrah     set     messages: + msg332751
2018-12-30 04:25:55  lgautier  set     messages: + msg332743
2018-09-29 19:13:16  lgautier  set     messages: + msg326689
                                       components: + Extension Modules, Library (Lib), - Interpreter Core
2018-09-29 16:30:29  mattip    set     messages: + msg326684
2018-09-28 23:23:00  lgautier  set     messages: + msg326660
2018-09-24 17:53:17  mattip    set     nosy: + mattip
                                       messages: + msg326276
2018-09-24 07:41:12  skrah     set     nosy: + skrah
2018-09-23 17:51:38  lgautier  create