Message 141858 - Python tracker


Author	skrah
Recipients	jcon, mark.dickinson, ncoghlan, petri.lehtinen, pitrou, pv, rupole, skrah, teoliphant, vstinner
Date	2011-08-10.11:46:41
Message-id	<1312976803.02.0.732791234967.issue10181@psf.upfronthosting.co.za>

Content

I thought it might be productive to switch to documentation/test-driven
development for PEP-3118 in general. So I updated the documentation,
trying to spell out the responsibilities of both exporter and consumer
as clearly as possible.

In order to have a PEP-3118 reference implementation, I wrote
Modules/_testbuffer.c and Lib/test/test_buffer.py. The test module
contains an ndarray object (independent from NumPy's ndarray) with
these features:

  o Full base object capabilities, including responding to
    flag-specific requests.

  o Full re-exporter capability: The object obtains a buffer from
    another exporter and poses as a base object.

  o Optional capability to change layout while buffers are exported.

  o Full support for arbitrary format strings using the struct
    module.

  o Fortran-style arrays.

  o Arbitrary multidimensional structures, including offsets and
    negative strides.

  o Support for converting arrays to suboffset representations.

  o Support for multidimensional indexing, slicing and tolist().

  o Optional support for testing against NumPy.
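
A few of these features can also be tried without the test module, using
the built-in memoryview in place of the test ndarray. The following is only
a rough sketch: it assumes a recent CPython (3.5 or later) in which
memoryview supports cast(), multidimensional indexing and tolist(), and all
variable names are made up for illustration.

```python
import struct

# Six native shorts packed with the struct module, viewed as a 2 x 3 array.
packed = struct.pack("6h", *range(6))
grid = memoryview(packed).cast("h", shape=[2, 3])

nested = grid.tolist()      # nested list representation
corner = grid[1, 2]         # multidimensional indexing

# Slicing with a negative step yields a view with a negative stride.
backwards = memoryview(bytearray(b"abcd"))[::-1]
reversed_items = backwards.tolist()
reversed_strides = backwards.strides
```

The reversed view is the simplest case of a buffer whose first element does
not sit at the start of the underlying memory block.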


In memoryobject.c I only fixed the buffer release issues that came up.
Before proceeding with allocating private arrays for each memoryview,
it would be great to have agreement on the revised documentation and a
couple of other issues.

Documentation
-------------

I'll highlight some changes here:

  1) view->obj: Must be a new reference to the provider if the buffer
     is obtained via getbuffer(), otherwise NULL. In case of failure
     the field MUST be set to NULL (was: undefined).

     So, logically this should be seen the same way as returning
     a new reference from a standard Python function.

  2) view->buf: This can (and _does_ for NumPy arrays) point to
     any location in the underlying memory block. If a consumer
     doesn't request one of the simple or contiguous buffers,
     it cannot assume that buf points to the start of the block.

  3) view->len: Make it clear that the PEP defines it as
     product(shape) * itemsize.

  4) Ownership for shape, strides, internal: read-only for the
     consumer.

  5) Flag explanations: The new section puts emphasis on which fields
     a consumer can expect in response to a buffer request, starting
     with the fields that will always be set.

  6) Rule for writable/read-only requests. Only raise if the
     buffer is read-only and the request is 'writable'. This
     seems to be the most practical solution.

  7) Add NumPy-style and PIL-style sections to make clear what
     a consumer must be able to handle if complicated structures
     are requested (thanks Pauli Virtanen for clarifying what
     valid NumPy arrays may look like).
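
Points 3) and 6) can be checked from Python. This is a sketch under stated
assumptions, not part of the patch: it needs CPython 3.8 or later for
math.prod(), and it takes memoryview.nbytes as the Python-level counterpart
of view->len (len() of a multidimensional memoryview only reports the first
dimension). The variable names are illustrative.

```python
import math
import struct

itemsize = struct.calcsize("i")
raw = bytearray(6 * itemsize)
view = memoryview(raw).cast("i", shape=[2, 3])

# Counterpart of view->len: product(shape) * itemsize.
expected_len = math.prod(view.shape) * view.itemsize

# A request that does not ask for a writable buffer succeeds for a
# read-only exporter as well; only the later write attempt is refused.
ro_view = memoryview(bytes(4))
try:
    ro_view[0] = 1
    write_refused = False
except TypeError:
    write_refused = True
```

Here view.nbytes equals expected_len, while len(view) is 2.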


What I would like to spell out clearly:

  8) A PEP-3118 compliant provider MUST always be able to respond
     to a PyBUF_FULL_RO request (i.e. fill in shape AND strides).
     ctypes doesn't currently do that, but could be fixed easily.

     This is easy to implement for the exporter (see how
     PyBuffer_FillInfo() does it in a few lines).
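
What a complete response looks like can be inspected from Python. The sketch
below assumes a CPython version in which memoryview obtains its buffer with
a PyBUF_FULL_RO request (3.3 or later); bytes and array.array merely stand
in for simple exporters.

```python
import array

simple = memoryview(b"abc")                           # plain byte exporter
doubles = memoryview(array.array("d", [1.0, 2.0]))    # typed 1-D exporter

# Even the simplest exporter reports ndim, shape, strides and format.
simple_layout = (simple.ndim, simple.shape, simple.strides, simple.format)
doubles_layout = (doubles.ndim, doubles.shape, doubles.strides, doubles.format)
```

For a one-dimensional contiguous buffer the single stride is just the
itemsize, which is why filling in shape and strides costs the exporter
very little.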


Memoryview
----------

  8) Add PyMemoryView_FromBytes(). This could potentially replace
     PyMemoryView_FromBuffer().

  9) I've come to think that memoryviews should only be created
     from PyBUF_FULL_RO requests (this also automatically succeeds
     if an object is writable, see above).

     One reason is that we would no longer have to check constantly
     for the presence of shape, strides and format (unless ndim = 0).

     Currently it is possible to create any kind of view via
     PyMemoryView_FromBuffer(). It would be possible to deprecate
     it or make it a rule that the buffer argument must have full
     information.

  10) NumPy has a limit of ndim = 64. It would be possible to
      use that limit as well and make shape, strides and
      suboffsets static arrays of size 64. Perhaps this is a bit
      wasteful; it depends on how many views are typically
      created.

  11) test_buffer.py contains an example (see: test_memoryview_release())
      showing that Antoine's test case will not work if a re-exporter
      is involved. Should we leave that or fix it, too?
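
The kind of release behaviour under discussion is visible from Python with a
plain exporter. This is a minimal sketch only: bytearray stands in for the
exporter, no re-exporter is involved, and so it does not reproduce the
failing case in test_memoryview_release(). It assumes a CPython version
that provides memoryview.release().

```python
data = bytearray(b"abc")
view = memoryview(data)

# While a buffer is exported, the bytearray refuses to be resized.
try:
    data.extend(b"d")
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()          # hand the buffer back to the exporter
data.extend(b"d")       # resizing works again

# A released view can no longer be used.
try:
    view.tolist()
    use_after_release_ok = True
except ValueError:
    use_after_release_ok = False
```

With a re-exporter in between, the question is which object is responsible
for handing the buffer back to the original exporter, and when.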



My apologies for the long list, but it'll be easier to proceed if
some rules are written in stone. :)

History
Date	User	Action	Args
2011-08-10 11:46:43	skrah	set	recipients: + skrah, teoliphant, mark.dickinson, ncoghlan, rupole, pitrou, vstinner, pv, jcon, petri.lehtinen
2011-08-10 11:46:43	skrah	set	messageid: <1312976803.02.0.732791234967.issue10181@psf.upfronthosting.co.za>
2011-08-10 11:46:42	skrah	link	issue10181 messages
2011-08-10 11:46:41	skrah	create