Can't read a F-contiguous memoryview in physical order #80026

Open
pitrou opened this issue Jan 28, 2019 · 9 comments
Labels
3.8 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)

Comments

@pitrou
Member

pitrou commented Jan 28, 2019

BPO 35845
Nosy @pitrou, @skrah, @jakirkham
PRs
  • bpo-35845: Add order={'C', 'F', 'A'} parameter to memoryview.tobytes(). #11730
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-01-28.20:51:37.338>
    labels = ['interpreter-core', 'type-feature', '3.8']
    title = "Can't read a F-contiguous memoryview in physical order"
    updated_at = <Date 2020-06-05.16:33:50.845>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2020-06-05.16:33:50.845>
    actor = 'jakirkham'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2019-01-28.20:51:37.338>
    creator = 'pitrou'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35845
    keywords = ['patch', 'patch', 'patch']
    message_count = 9.0
    messages = ['334491', '334495', '334496', '334497', '334498', '334739', '334743', '334759', '370767']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'skrah', 'jakirkham']
    pr_nums = ['11730', '11730', '11730']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue35845'
    versions = ['Python 3.8']

    @pitrou
    Member Author

    pitrou commented Jan 28, 2019

    This request is motivated in detail here:
    python/peps#883 (comment)

    In short: in C, when you have a Py_buffer, you can directly read the memory in whatever order you want (including physical order). This is not possible in pure Python, though. Somewhat unintuitively, memoryview.tobytes() and bytes(memoryview) both read bytes in *logical* order, even though they flatten the dimensions and don't keep the original type. Logical order differs from physical order for Fortran-contiguous arrays.
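
To make the logical/physical distinction concrete, here is a minimal pure-Python sketch. It does not use memoryview at all; `logical_bytes` is a hypothetical helper that walks a flat buffer of 1-byte items by index, using the shape and strides an exporter would report. The shape and stride values are assumptions chosen for illustration.

```python
from itertools import product


def logical_bytes(physical, shape, strides):
    """Read a flat buffer of 1-byte items in logical (row-major index) order.

    Hypothetical helper for illustration only; it mimics what a logical-order
    read does, given the shape and strides of the exporting array.
    """
    out = bytearray()
    # Visit indices (0, 0), (0, 1), ... with the last index varying fastest.
    for index in product(*(range(n) for n in shape)):
        offset = sum(i * s for i, s in zip(index, strides))
        out.append(physical[offset])
    return bytes(out)


physical = bytes(range(6))   # the memory exactly as it is laid out
shape = (2, 3)
c_strides = (3, 1)           # C-contiguous: last index varies fastest in memory
f_strides = (1, 2)           # Fortran-contiguous: first index varies fastest

logical_c = logical_bytes(physical, shape, c_strides)
logical_f = logical_bytes(physical, shape, f_strides)
```

For the C-contiguous layout the logical read returns the memory unchanged; for the Fortran-contiguous layout it returns a permutation of it, which is the behaviour this issue is about.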

    One possible way of alleviating this would be to offer a memoryview.transpose() method, similar to the Numpy transpose() method (see https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html).

    One could also imagine a memoryview.to_c_contiguous() method.

    Or even: a memoryview.raw_memory() method, that would 1) flatten dimensions 2) cast to 'B' format 3) keep physical order.

    @pitrou pitrou added the 3.8, interpreter-core, and type-feature labels Jan 28, 2019
    @skrah
    Mannequin

    skrah mannequin commented Jan 28, 2019

    Yes, it's modeled after NumPy's tobytes():

    >>> x = np.array(list(range(6)), dtype="int8").reshape(2,3)
    >>> x.tobytes()
    b'\x00\x01\x02\x03\x04\x05'
    >>> x.T.tobytes()
    b'\x00\x03\x01\x04\x02\x05'
    >>> 
    >>> 
    >>> memoryview(x).tobytes()
    b'\x00\x01\x02\x03\x04\x05'
    >>> memoryview(x.T).tobytes()
    b'\x00\x03\x01\x04\x02\x05'

    I guess the reason is that without a type it's easier to serialize the logical array by default, so you can always assume C when you read back.

    NumPy also has an 'F' parameter though that flips the order:

    >>> x.tobytes('F')
    b'\x00\x03\x01\x04\x02\x05'

    It would be possible to add this to memoryview as well.

    @skrah
    Mannequin

    skrah mannequin commented Jan 28, 2019

    raw_bytes() is also possible of course. I assume it would do nothing and just dump the memory.

    Or tobytes('F') AND tobytes('raw').

    @pitrou
    Member Author

    pitrou commented Jan 28, 2019

    Well, raw_memory() would avoid a copy, which is useful.

    As for tobytes(), if we want to follow NumPy, we can have 'F' mean if F-contiguous, 'C' otherwise:

    >>> a = np.arange(12, dtype='int8').reshape((3,4))
    >>> a.tobytes('A')
    b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b'
    >>> a.tobytes('A') == a.T.tobytes('A')
    True

    @pitrou
    Member Author

    pitrou commented Jan 28, 2019

    Sorry, my fingers slipped. Let me try again:

    As for tobytes(), if we want to follow NumPy, we can have 'A' mean 'F' if F-contiguous, 'C' otherwise: [...]
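
The rule described above can be sketched as a small function. This is only an illustration of the decision, not how NumPy or CPython implement it: `resolve_order` is a hypothetical name, and the `SimpleNamespace` object is a stand-in for a Fortran-contiguous view, since pure Python cannot easily create one.

```python
from types import SimpleNamespace


def resolve_order(view, order="C"):
    """Map 'A' to 'F' for F-contiguous views and to 'C' otherwise.

    Hypothetical helper; it relies only on the c_contiguous and f_contiguous
    attributes that memoryview objects expose.
    """
    if order is None:
        return "C"
    if order not in ("C", "F", "A"):
        raise ValueError("order must be 'C', 'F' or 'A'")
    if order == "A":
        # A view that is both C- and F-contiguous (e.g. 1-D) is treated as C.
        return "F" if (view.f_contiguous and not view.c_contiguous) else "C"
    return order


c_view = memoryview(bytes(6)).cast("B", shape=[2, 3])            # C-contiguous
f_like = SimpleNamespace(c_contiguous=False, f_contiguous=True)  # stand-in only

order_for_c = resolve_order(c_view, "A")
order_for_f = resolve_order(f_like, "A")
```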

    @skrah
    Mannequin

    skrah mannequin commented Feb 2, 2019

    Yes, following NumPy looks like the sanest option for tobytes(), so I
    went ahead and implemented that signature.

    memoryview.raw_memory() is of course complicated by the fact that things like
    m[::-1] move buf.ptr to the end of the buffer.

    So we'd need to restrict to contiguous views anyway, which makes
    the method less appealing (IOW, it doesn't offer more than an
    augmented memoryview.cast()).
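
A short sketch of why a reversed slice is awkward for a raw dump: the view has a negative stride, is not contiguous, and cannot be cast. The variable names are illustrative only.

```python
m = memoryview(b"abcdef")
r = m[::-1]                 # same memory, walked backwards

strides = r.strides         # negative stride for the reversed view
is_contiguous = r.contiguous
logical = r.tobytes()       # the logical-order read still works

# cast() is restricted to C-contiguous views, so this is expected to fail.
try:
    r.cast("B")
    cast_failed = False
except TypeError:
    cast_failed = True
```

A raw-memory method limited to contiguous views would reject `r` in the same way, which is the restriction mentioned above.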

    @pitrou
    Member Author

    pitrou commented Feb 2, 2019

    > So we'd need to restrict to contiguous views anyway, which makes
    > the method less appealing (IOW, it doesn't offer more than an
    > augmented memoryview.cast()).

    Yes, it would probably be a simpler way of writing .cast('B', shape=(...), order='A').
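
For reference, a sketch of what the existing memoryview.cast() already offers for a C-contiguous view: flattening to 'B' without a copy. Note that the order= keyword written above is part of the proposal, not an existing cast() parameter, so it is not used here.

```python
buf = bytearray(range(6))
m2d = memoryview(buf).cast("B", shape=[2, 3])   # 1-D -> 2-D, C-contiguous
flat = m2d.cast("B")                            # 2-D -> 1-D, still no copy

flat_shape = flat.shape
buf[0] = 255                # write through the underlying bytearray
first_item = flat[0]        # the flat view sees the change
```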

    @skrah
    Mannequin

    skrah mannequin commented Feb 2, 2019

    New changeset d08ea70 by Stefan Krah in branch 'master':
    bpo-35845: Add order={'C', 'F', 'A'} parameter to memoryview.tobytes(). (#11730)
    d08ea70
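
A minimal usage sketch of the order parameter added by this changeset, assuming Python 3.8 or later. It uses a C-contiguous 2x3 view built with cast(), because a Fortran-contiguous view cannot easily be created in pure Python.

```python
m = memoryview(bytes(range(6))).cast("B", shape=[2, 3])

c_order = m.tobytes()            # default, same as order='C'
f_order = m.tobytes(order="F")   # data converted to Fortran order
a_order = m.tobytes(order="A")   # contiguous view: copy of the physical memory
```

For a Fortran-contiguous exporter (for example the transposed NumPy array shown earlier in the thread), order='A' is the variant that preserves the in-memory layout.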

    @jakirkham
    Mannequin

    jakirkham mannequin commented Jun 5, 2020

    Sorry if I'm just misunderstanding the discussion here. Would it make sense to have an order keyword argument to cast as well? This seems useful when interpreting a flattened F-order bytes object (say, on the receiving end of a transmission).
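
Without such a keyword, a receiver has to reorder the bytes itself. The following is a rough sketch under stated assumptions: `f_bytes_to_c_view` is a hypothetical helper, it handles only 2-D data with 1-byte items, and it makes a copy rather than reinterpreting the buffer in place.

```python
def f_bytes_to_c_view(data, shape):
    """Turn flattened Fortran-order bytes into a C-contiguous 2-D memoryview.

    Hypothetical helper: 2-D shapes and 1-byte items only, and it copies.
    """
    rows, cols = shape
    if len(data) != rows * cols:
        raise ValueError("data length does not match shape")
    # In Fortran order, element (i, j) lives at offset j * rows + i.
    reordered = bytes(data[j * rows + i] for i in range(rows) for j in range(cols))
    return memoryview(reordered).cast("B", shape=[rows, cols])


received = b"\x00\x03\x01\x04\x02\x05"   # F-order dump of a 2x3 array
view = f_bytes_to_c_view(received, (2, 3))
as_list = view.tolist()
```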
