classification
Title: ctypes incorrectly encodes .format attribute of memory views
Type: behavior Stage: resolved
Components: ctypes Versions: Python 3.4, Python 3.5
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, belopolsky, dabeaz, martin.panter, meador.inge
Priority: normal Keywords:

Created on 2012-10-04 15:35 by dabeaz, last changed 2021-01-02 02:41 by dabeaz. This issue is now closed.

Messages (3)
msg171965 - Author: David Beazley (dabeaz) Date: 2012-10-04 15:35
This is somewhat related to an earlier bug report concerning memory views, but as far as I can tell, ctypes is not encoding the '.format' attribute correctly in most cases.   Consider this example:

First, create a ctypes array:

>>> import ctypes
>>> a = (ctypes.c_double * 3)(1,2,3)
>>> len(a)
3
>>> a[0]
1.0
>>> a[1]
2.0
>>> 

Now, create a memory view for it:

>>> m = memoryview(a)
>>> len(m)
3
>>> m.itemsize
8
>>> m.ndim
1
>>> m.shape
(3,)
>>> 

All looks well.  However, if you try to do anything with the .format or access the items, it's completely broken:

>>> m.format
'(3)<d'
>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview: unsupported format (3)<d
>>> 

This is quite inconsistent with the behavior observed elsewhere. For example:

>>> import array
>>> b = array.array('d',[1,2,3])
>>> memoryview(b).format
'd'
>>> import numpy
>>> c = numpy.array([1,2,3],dtype='d')
>>> memoryview(c).format
'd'
>>> 

As you can see, array libraries are using .format to encode the format of a single array item.  ctypes is encoding the format of the entire array (all items).  ctypes also includes endianness which presents additional difficulties.
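
To illustrate why a per-item format matters, here is a minimal sketch of how generic code can decode a buffer using nothing but .format and .itemsize. It uses array.array and the struct module; the variable names are only for illustration.

```python
import array
import struct

b = array.array('d', [1, 2, 3])
m = memoryview(b)

# Copy out the raw bytes, then decode one item at a time using the
# per-item format string and item size reported by the memoryview.
raw = m.tobytes()
values = [struct.unpack_from(m.format, raw, i * m.itemsize)[0]
          for i in range(len(m))]
print(values)
```

This only works because .format describes a single item. A whole-array format such as '(3)<d' is not something the struct module accepts.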

This behavior affects not only Python code that wants to use memoryviews, but also C extension code that wants to use the underlying buffer protocol to work with arrays in a generic way. Essentially, it cuts the use of ctypes off entirely unless you modify the underlying buffer handling code to special case it.

Suggested fix:  Have ctypes only encode the format for a single item in the case of arrays.  Also, for items that are encoded using the native byte ordering, don't include an endianness modifier ('<', '>', etc.).  Including the byte order just complicates all of the handling code because it has to a) know what the native byte ordering is and b) check multiple cases such as "d" and "<d".
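
For reference, the kind of special casing a consumer currently needs looks roughly like the sketch below. The helper name item_format is hypothetical and not part of ctypes or any library. It assumes the format is either a plain struct code or a ctypes-style string with an optional shape prefix and a one-character byte-order modifier, and it only drops the modifier when doing so does not change the item size.

```python
import re
import struct
import sys

# Byte-order character that corresponds to this machine's native order.
NATIVE_ORDER = "<" if sys.byteorder == "little" else ">"

def item_format(fmt):
    """Hypothetical helper: reduce e.g. '(3)<d' to 'd' when that is safe."""
    # Drop a leading shape prefix such as "(3)" or "(2,3)".
    fmt = re.sub(r"^\([\d,\s]*\)", "", fmt)
    # Strip the byte-order prefix only if it is the native order and the
    # standard size equals the native size for this type code.
    if len(fmt) == 2 and fmt[0] == NATIVE_ORDER:
        code = fmt[1]
        try:
            if struct.calcsize(fmt) == struct.calcsize(code):
                return code
        except struct.error:
            pass  # not a struct type code; leave the format untouched
    return fmt

print(item_format("(3)" + NATIVE_ORDER + "d"))
```

Non-native byte orders are deliberately left alone, since the data would then need to be byte-swapped rather than just relabeled.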
msg222154 - Author: Mark Lawrence (BreamoreBoy) Date: 2014-07-03 08:04
@David please accept our apologies for the delay in getting back to you.

Can someone else take a look please as I know nothing about ctypes, thanks.
msg267959 - Author: Martin Panter (martin.panter) (Python committer) Date: 2016-06-09 04:56
In Python 3.5 (since Issue 15944), you can now cast normal C-contiguous memoryviews (including ones from ctypes) to bytes:

>>> import ctypes
>>> a = (ctypes.c_double * 3)(1,2,3)
>>> m = memoryview(a)
>>> m.format
'<d'
>>> byteview = m.cast("B")
>>> byteview.format
'B'
>>> byteview[0]
0

Also, the format has changed at some point. See '<d' above vs '(3)<d' from David’s original post. Maybe it would be nice for ctypes to use a proper struct module format string where possible, but there doesn’t seem to be much demand.
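
As a workaround sketch (not a change to ctypes itself), the byte view can be cast a second time to a native single-character format, which makes the items readable again. This assumes Python 3.5 or later, where the first cast is allowed.

```python
import ctypes

a = (ctypes.c_double * 3)(1, 2, 3)
m = memoryview(a)

# Go through an unsigned-byte view first, then cast to native 'd'.
# A direct cast between two non-byte formats is not permitted.
items = m.cast("B").cast("d")

print(items.format)
print(items[0])
print(items.tolist())
```

The resulting view reports the same per-item format that array.array and numpy do.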
History
Date                 User           Action  Args
2021-01-02 02:41:59  dabeaz         set     status: open -> closed; stage: resolved
2016-06-09 08:38:35  BreamoreBoy    set     nosy: - BreamoreBoy
2016-06-09 04:56:11  martin.panter  set     nosy: + martin.panter; messages: + msg267959; components: + ctypes
2014-07-03 08:04:55  BreamoreBoy    set     nosy: + amaury.forgeotdarc, BreamoreBoy, belopolsky; messages: + msg222154; versions: + Python 3.4, Python 3.5, - Python 3.3
2012-10-05 02:53:26  meador.inge    set     nosy: + meador.inge
2012-10-04 15:35:06  dabeaz         create