Author rhansen
Recipients pitrou, rhansen, seberg, skrah
Date 2015-02-01.02:29:19
Message-id <1422757762.56.0.808707841757.issue23352@psf.upfronthosting.co.za>
In-reply-to
Content
(The following message is mostly off-topic but I think it is relevant to those interested in this issue.  This message is about the clarity of the documentation regarding flag semantics, and what I think the flags should mean.)

> Cython doesn't follow the spec though (use Python 3):
> 
>     from _testbuffer import *
>     cpdef foo():
>         cdef unsigned char[:] v = bytearray(b"testing")
>         nd = ndarray(v, getbuf=PyBUF_ND)
>         print(nd.suboffsets)
>         nd = ndarray(v, getbuf=PyBUF_FULL)
>         print(nd.suboffsets)

When I compile and run the above (latest Cython from Git master), I get:

    ()
    ()

According to the Python source code (ndarray_get_suboffsets() and ssize_array_as_tuple() in Modules/_testbuffer.c), this output is printed when the suboffsets member is NULL.
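For readers without a build of _testbuffer, a rough stdlib analogue is sketched below.  It uses memoryview instead of _testbuffer.ndarray, so it does not exercise Cython at all; it only illustrates that a plain bytearray exporter reports empty suboffsets.  The variable names are made up for illustration.

```python
# Sketch only: memoryview stands in for _testbuffer.ndarray here.
data = bytearray(b"testing")
view = memoryview(data)

# A one-dimensional byte buffer: one axis of length 7, one byte per step.
shape = view.shape
strides = view.strides

# No indirections are involved, so memoryview reports an empty tuple.
suboffsets = view.suboffsets

print(shape, strides, suboffsets)
```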

> Cython hands out suboffsets even for a PyBUF_ND request, which is
> definitely wrong.  For a PyBUF_FULL request they should only be
> provided *if needed*.

suboffsets appears to be NULL in both cases in your example, which seems acceptable to me given:
  * the user didn't request suboffsets information in the PyBUF_ND
    case, and
  * there are no indirections in the PyBUF_FULL case.

Even if suboffsets weren't NULL, I believe the behavior would still be correct for the PyBUF_ND case.  My reading of <https://docs.python.org/3/c-api/buffer.html> is that flags=PyBUF_ND implies that the shape member MUST NOT be NULL, but strides and suboffsets can be NULL or not (it doesn't matter).  I base this interpretation on the sentence, "Note that each flag contains all bits of the flags below it."  If flags=PyBUF_ND meant that strides and suboffsets MUST be NULL, then PyBUF_ND and PyBUF_STRIDES would necessarily be mutually exclusive and flags=(PyBUF_ND|PyBUF_STRIDES) would not make sense (the strides member would have to be both NULL and non-NULL).
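The "contains all bits" sentence can be checked with a little arithmetic.  The sketch below hard-codes the flag values as I understand them from CPython's buffer header (treat the numbers as an assumption and verify them against your own headers); the helper name requested() is hypothetical.  Because the flags are composite, the helper compares against the whole flag rather than testing for any shared bit.

```python
# Assumed values, copied from CPython's buffer header for illustration.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def requested(flags, flag):
    """Return True only if every bit of `flag` is set in `flags`."""
    return (flags & flag) == flag


# OR-ing PyBUF_ND into PyBUF_STRIDES changes nothing: the bit is already there.
combined = PyBUF_ND | PyBUF_STRIDES

# A plain truthiness test is misleading for composite flags:
# PyBUF_ND shares a bit with PyBUF_INDIRECT, so this value is nonzero.
shared_bits = PyBUF_ND & PyBUF_INDIRECT

print(combined == PyBUF_STRIDES, hex(shared_bits))
```

With these values, a request for PyBUF_STRIDES is always also a request for PyBUF_ND, so the two cannot be mutually exclusive.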

If (flags & PyBUF_INDIRECT) is false, then the consumer is not interested in the suboffsets member so it shouldn't matter what it points to.  (And the consumer should not dereference the pointer in case it points to a junk address.)

IMHO, if the buffer is C-style contiguous then the producer should be allowed to populate the shape, strides, and suboffsets members regardless of whether or not any of the PyBUF_ND, PyBUF_STRIDES, or PyBUF_INDIRECT flags are set.  In other words, for C-style contiguous buffers, the producer should be allowed to act as if PyBUF_INDIRECT were always provided, because the consumer will always get an appropriate Py_buffer struct regardless of the actual state of the PyBUF_ND, PyBUF_STRIDES, and PyBUF_INDIRECT flags.

It *is* a bug, however, to dereference the strides or suboffsets members with ndarray(v, getbuf=PyBUF_ND) because the consumer didn't ask for strides or suboffsets information and the pointers might be bogus.
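From the consumer's side, the rule amounts to "only read what you asked for".  A minimal sketch of that rule follows; readable_members() is a hypothetical helper, not part of any API, and the flag values are assumed from CPython's buffer header.

```python
# Assumed values, copied from CPython's buffer header for illustration.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def readable_members(flags):
    """List the Py_buffer array members a consumer may safely dereference.

    Hypothetical helper: a member is listed only if the consumer's
    request included every bit of the corresponding flag.
    """
    members = []
    if (flags & PyBUF_ND) == PyBUF_ND:
        members.append("shape")
    if (flags & PyBUF_STRIDES) == PyBUF_STRIDES:
        members.append("strides")
    if (flags & PyBUF_INDIRECT) == PyBUF_INDIRECT:
        members.append("suboffsets")
    return members


print(readable_members(PyBUF_ND))
```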

Stepping back a bit, I believe that the flags should be thought of as imposing requirements on the producer.  I think those requirements should be (assuming ndim > 0):

  * PyBUF_ND:  If (flags & PyBUF_ND) is true:
      - If (flags & PyBUF_STRIDES) is false *and* the producer is
        unable to produce a block of memory at [buf,buf+len)
        containing all (len/itemsize) entries in a C-style contiguous
        chunk, then the producer must raise an exception.
      - Otherwise, the producer must provide the shape of the buffer.
    If (flags & PyBUF_ND) is false:
      - If the producer is unable to produce a contiguous chunk of
        (len/itemsize) entries (of any shape) in the block of memory
        at [buf,buf+len), then the producer must raise an exception.
      - Otherwise, the producer is permitted to do any of the
        following:
          + don't touch the shape member (don't set it to NULL or any
            other value; just leave it as-is)
          + set the shape member to NULL
          + set the shape member to point to an array describing the
            shape
          + set the shape member to point to any other location, even
            if dereferencing the pointer would result in a segfault
  * PyBUF_STRIDES:  If (flags & PyBUF_STRIDES) is true:
      - The producer must provide the appropriate strides information.
    If (flags & PyBUF_STRIDES) is false:
      - If the producer is unable to produce a block of memory at
        [buf,buf+len) containing all (len/itemsize) entries, the
        producer must raise an exception.
      - Otherwise, the producer is permitted to do any of the
        following:
          + don't touch the strides member (don't set it to NULL or
            any other value; just leave it as-is)
          + set the strides member to NULL
          + set the strides member to point to an array describing the
            strides
          + set the strides member to point to any other location,
            even if dereferencing the pointer would result in a
            segfault
  * PyBUF_INDIRECT:  If (flags & PyBUF_INDIRECT) is true:
      - If the buffer uses indirections then the producer must set the
        suboffsets member to point to an array with appropriate
        entries.
      - Otherwise, the producer can either set the suboffsets member
        to NULL or set it to point to an array of all negative
        entries.
    If (flags & PyBUF_INDIRECT) is false:
      - If the producer cannot produce a buffer that does not have
        indirections, then the producer must raise an exception.
      - Otherwise, the producer is permitted to do any of the
        following:
          + don't touch the suboffsets member (don't set it to NULL or
            any other value; just leave it as-is)
          + set the suboffsets member to NULL
          + set the suboffsets member to point to an array of all
            negative entries
          + set the suboffsets member to point to any other location,
            even if dereferencing the pointer would result in a
            segfault
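
The requirements above can be modeled as a small decision function.  This is only a sketch of my proposal, not of what CPython or Cython actually do: required_members() and its three boolean parameters are hypothetical, the flag values are assumed from CPython's buffer header, and "unable to produce" is simplified to a fixed property of the buffer.

```python
# Assumed values, copied from CPython's buffer header for illustration.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def _has(flags, flag):
    # Composite flags: require every bit of `flag`, not just any shared bit.
    return (flags & flag) == flag


def required_members(flags, *, c_contiguous, contiguous, indirect):
    """Return the members the producer MUST fill in, or raise BufferError.

    Members not in the returned set are "don't care": the producer may
    leave them untouched, set them to NULL, or fill them in anyway.
    """
    contiguous = contiguous or c_contiguous  # C-contiguous implies contiguous
    required = set()

    if _has(flags, PyBUF_ND):
        if not _has(flags, PyBUF_STRIDES) and not c_contiguous:
            raise BufferError("shape-only request needs a C-contiguous buffer")
        required.add("shape")
    elif not contiguous:
        raise BufferError("request without PyBUF_ND needs a contiguous buffer")

    if _has(flags, PyBUF_STRIDES):
        required.add("strides")
    elif not contiguous:
        raise BufferError("request without PyBUF_STRIDES needs a contiguous buffer")

    if _has(flags, PyBUF_INDIRECT):
        if indirect:
            required.add("suboffsets")
        # Otherwise suboffsets may be NULL or all negative.
    elif indirect:
        raise BufferError("request without PyBUF_INDIRECT cannot use indirections")

    return required


print(sorted(required_members(PyBUF_ND, c_contiguous=True,
                              contiguous=True, indirect=False)))
```

For the bytearray example at the top of this message, the function returns only the shape member for PyBUF_ND, which is why I consider whatever ends up in strides and suboffsets to be irrelevant in that case.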
History
Date                 User     Action  Args
2015-02-01 02:29:22  rhansen  set     recipients: + rhansen, pitrou, skrah, seberg
2015-02-01 02:29:22  rhansen  set     messageid: <1422757762.56.0.808707841757.issue23352@psf.upfronthosting.co.za>
2015-02-01 02:29:22  rhansen  link    issue23352 messages
2015-02-01 02:29:19  rhansen  create