
classification
Title: [ctypes] Allow from_pointer creation
Type: enhancement Stage:
Components: ctypes Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, belopolsky, eryksun, meador.inge, memeplex
Priority: normal Keywords:

Created on 2016-06-09 03:17 by memeplex, last changed 2022-04-11 14:58 by admin.

Messages (8)
msg267951 - Author: Memeplex (memeplex) Date: 2016-06-09 03:17
This real life example is pretty terrible:

(ct.c_float * self._nfeats).from_address(
   ct.addressof(self._vals.contents))

The alternative of casting the pointer to a pointer-to-array and then picking ptr.contents is not really better.
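For comparison, here is a minimal sketch of the two variants side by side. The names `source`, `ptr` and `count` are stand-ins invented for this illustration (they take the place of `self._vals` and `self._nfeats` above):

```python
import ctypes as ct

# Stand-in data: a float array that is reachable only through a plain pointer.
source = (ct.c_float * 4)(1.0, 2.0, 3.0, 4.0)
ptr = ct.cast(source, ct.POINTER(ct.c_float))
count = 4

# Variant 1: rebuild the array type at the address the pointer targets.
via_address = (ct.c_float * count).from_address(ct.addressof(ptr.contents))

# Variant 2: cast to pointer-to-array, then dereference.
via_cast = ct.cast(ptr, ct.POINTER(ct.c_float * count)).contents

print(list(via_address))
print(list(via_cast))
```

Both results are views of the same memory as `source`, not copies.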

What about a from_pointer(ptr) method? Or overloading from_address to take a pointer? Or a simple shortcut to get the address pointed to by a pointer (related: https://bugs.python.org/issue26565)?

I think this part of the ctypes API needs to become more concise and Pythonic.
msg267954 - Author: Memeplex (memeplex) Date: 2016-06-09 03:39
I would like to add some information about my use case. Many C structs have pointers to arrays of data plus some field indicating the length of those arrays. Sometimes I need to pickle that kind of struct, and a bytes object has to somehow be created from each pointer, given the length (the alternative, ptr[:len], is too expensive for large arrays). So I need to cast the pointer to a ctypes array first and then convert the array to bytes (sadly, there is no way to pickle a memoryview, so a copy is unavoidable).
msg267992 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-06-09 08:32
Adding from_pointer is probably a good idea. That said, there's a simpler way to go about getting a bytes copy for a pointer.

Say that you have a pointer p for the following array:

    >>> a = (c_float * 3)(1, 2, 3)
    >>> bytes(a)
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'
    >>> p = POINTER(c_float)(a)

IMO, the most straightforward way to get a bytes copy is to call string_at:

    >>> string_at(p, sizeof(p.contents) * 3)
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'

In 3.x string_at uses the FFI machinery to call the following function:

    static PyObject *
    string_at(const char *ptr, int size)
    {
        if (size == -1)
            return PyBytes_FromStringAndSize(ptr, strlen(ptr));
        return PyBytes_FromStringAndSize(ptr, size);
    }

The first argument can be any type accepted by c_void_p.from_param, such as a ctypes pointer/array, str, bytes, or an integer address.
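As a small illustration of that flexibility, the sketch below passes an array, a pointer, an integer address and a bytes object to string_at. The variable names are made up for this example:

```python
import ctypes as ct

a = (ct.c_float * 3)(1, 2, 3)
p = ct.cast(a, ct.POINTER(ct.c_float))
nbytes = ct.sizeof(a)

# Each call copies nbytes bytes starting at the same underlying buffer.
from_array = ct.string_at(a, nbytes)
from_pointer = ct.string_at(p, nbytes)
from_address = ct.string_at(ct.addressof(a), nbytes)

# A bytes object is accepted as well; its internal buffer is read.
from_bytes = ct.string_at(b"abc", 3)

print(from_array == from_pointer == from_address == bytes(a))
print(from_bytes)
```

The first three results are equal to bytes(a), and the last one is a copy of the three input bytes.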

Alternatively, note that pointer instantiation is the same as setting the `contents` value, which accepts any ctypes data object. Here's the C code that implements this assignment:

    dst = (CDataObject *)value;
    *(void **)self->b_ptr = dst->b_ptr;

The b_ptr field points at the buffer of the ctypes data object. Thus you can cast p to a char pointer without even calling the cast() function, which avoids an FFI call:

    >>> bp = POINTER(c_char)(p.contents)

Slicing c_char and c_wchar pointers is special cased to return bytes and str, respectively. So you can slice bp to get bytes:

    >>> bp[:sizeof(p.contents) * 3]
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'
msg268032 - Author: Memeplex (memeplex) Date: 2016-06-09 15:36
Thank you for the great tips, Eryk, somehow I overlooked string_at while reading the docs.

Now, given that the address parameter of string_at is pretty overloaded, wouldn't it be reasonable to overload from_address the same way instead of introducing from_pointer? That is, everywhere an address is expected, an address-like ctypes object would be OK.
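To make the idea concrete, here is a rough sketch of how such an overload could behave if written as a plain function today. `from_addressable` is a hypothetical name for illustration only; it is not part of ctypes, and the result still does not keep the source object alive:

```python
import ctypes as ct

def from_addressable(ctype, obj):
    """Hypothetical helper: like ctype.from_address, but also accepts
    address-like ctypes objects such as pointers and arrays."""
    if isinstance(obj, int):
        address = obj
    else:
        # cast() accepts pointers and arrays; .value is the integer
        # address, or None for a NULL pointer.
        address = ct.cast(obj, ct.c_void_p).value
    if address is None:
        raise ValueError("NULL pointer access")
    return ctype.from_address(address)

a = (ct.c_float * 3)(1, 2, 3)
p = ct.cast(a, ct.POINTER(ct.c_float))

# The caller must keep `a` alive for as long as `arr` is in use.
arr = from_addressable(ct.c_float * 3, p)
print(list(arr))
```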
msg268034 - Author: Memeplex (memeplex) Date: 2016-06-09 15:47
> The first argument can be any type accepted by c_void_p.from_param, such as a ctypes pointer/array, str, bytes, or an integer address.

Now I see why you suggested ptr.as_void in 26565. The two issues are closely related. Some functions are overloaded in the sense that they automatically call c_void_p.from_param on their arguments; other functions aren't. Uniformity would dictate treating all functions that expect an address the same way. Anyway, this would not cover all use cases for ptr.toaddress or ptr.as_void or whatever it ends up being called.
msg270057 - Author: Memeplex (memeplex) Date: 2016-07-09 17:36
I have been happily using this helper function:

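import ctypes as ct

def c_array(*args):
    # c_array(ptr, size): view the memory behind a ctypes pointer as an array.
    if isinstance(args[1], int):
        ptr, size = args
        return (ptr._type_ * size).from_address(ct.addressof(ptr.contents))
    # c_array(c_type, buf): copy a bytes-like buffer into a new array.
    c_type, buf = args
    return (c_type * (len(buf) // ct.sizeof(c_type))).from_buffer_copy(buf)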

For example:

c_array(ptr_obj, 10)

c_array(c_int, bytes_obj)
msg270063 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2016-07-09 19:40
If your goal is to get a bytes object, I don't see the point of creating an array. string_at is simpler and more efficient. 

If you really must create an array, note that simple pointers (c_void_p, c_char_p, c_wchar_p) need special handling. They don't have a `contents` attribute, and their _type_ is a string. 
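A short sketch of that difference, using only standard ctypes types (the variable names are illustrative):

```python
import ctypes as ct

# A POINTER(...) instance has `contents`, and its _type_ is a ctypes type.
typed_ptr = ct.POINTER(ct.c_int)(ct.c_int(7))
print(typed_ptr._type_ is ct.c_int)     # True
print(typed_ptr.contents.value)         # 7

# Simple pointer types expose `value` instead of `contents`,
# and their _type_ is a one-character format code.
simple_ptr = ct.c_char_p(b"abc")
print(hasattr(simple_ptr, "contents"))  # False
print(ct.c_void_p._type_, ct.c_char_p._type_, ct.c_wchar_p._type_)  # P z Z
```

So an expression such as `ptr._type_ * size` fails for the simple pointer types, because a string code is not a ctypes type.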

Also, I think combining from_address and addressof is a bit convoluted. I think it's cleaner to implement this using an array pointer:

    >>> ptr = POINTER(c_int)((c_int * 3)(1,2,3))
    >>> arr = POINTER(ptr._type_ * 3)(ptr.contents)[0]
    >>> arr[:]
    [1, 2, 3]

This also keeps the underlying ctypes object(s) properly referenced:

    >>> arr._b_base_
    <__main__.LP_c_int_Array_3 object at 0x7fb28471cd90>
    >>> arr._b_base_._objects['0']['1']
    <__main__.c_int_Array_3 object at 0x7fb28477da60>

whereas using from_address creates an array that dangerously doesn't own its own data and doesn't keep a reference to the owner:

    >>> arr2 = (ptr._type_ * 3).from_address(addressof(ptr.contents))
    >>> arr2._b_needsfree_
    0
    >>> arr2._b_base_ is None
    True
    >>> arr2._objects is None
    True

Let's create a larger array to ensure it's using an mmap region instead of the heap. This ensures a segfault when trying to access the memory block after it's deallocated:

    >>> ptr = POINTER(c_int)((c_int * 100000)(*range(100000)))
    >>> arr = (ptr._type_ * 100000).from_address(addressof(ptr.contents))
    >>> del ptr
    >>> x = arr[:]
    Segmentation fault (core dumped)

whereas using a dereferenced array pointer keeps the source data alive:

    >>> ptr = POINTER(c_int)((c_int * 100000)(*range(100000)))
    >>> arr = POINTER(ptr._type_ * 100000)(ptr.contents)[0]
    >>> del ptr
    >>> x = arr[:]
    >>> x[-5:]
    [99995, 99996, 99997, 99998, 99999]
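The array-pointer approach can be wrapped in a small helper. This is only a sketch: `array_view` is a hypothetical name, and it assumes `ptr` is a non-NULL POINTER(...) instance rather than one of the simple pointer types:

```python
import ctypes as ct

def array_view(ptr, length):
    """Hypothetical helper: view `length` items behind `ptr` as a ctypes
    array that keeps the objects referenced by `ptr` alive."""
    array_type = ptr._type_ * length
    # Point an array pointer at the same address, then dereference it.
    return ct.POINTER(array_type)(ptr.contents)[0]

ptr = ct.POINTER(ct.c_int)((ct.c_int * 4)(10, 20, 30, 40))
arr = array_view(ptr, 4)
del ptr  # the view still references the original array

print(arr[:])  # [10, 20, 30, 40]
```

The result is a view, so writes through `arr` change the original buffer.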
msg270118 - Author: Memeplex (memeplex) Date: 2016-07-10 18:10
As usual, thank you for the detailed and informative answer, Eryk. I think I understand your points but I decided to do it the way I did it because:

1. I sometimes need the array itself. For example, some of my classes contain or inherit from a ctypes structure with pointers (to an array of memory). Usually I name these pointers with a leading underscore and expose them as properties returning ctypes arrays.

2. For pickling/unpickling, ctypes arrays provide a convenient middle ground between bytes objects and ctypes pointers. Getting a bytes object from an array is as easy as calling bytes() on it. OTOH, the array can be directly assigned to a compatible pointer structure field.

3. While the overloaded c_array(ptr, size)/c_array(type, bytes) is not the most efficient API to get bytes from a pointer and vice versa, it's very simple for the range of use cases (1 and 2) it covers. Nevertheless, I have benchmarked the performance and it's not that terrible.
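A minimal sketch of the pattern described in points 1 and 2 might look as follows. The class `Sample`, its field names and its method names are invented for this illustration and are not taken from the original code; the property uses the array-pointer technique from msg270063 rather than from_address:

```python
import ctypes as ct

class Sample(ct.Structure):
    # Hypothetical layout: a length field plus a pointer to that many floats.
    _fields_ = [("_nfeats", ct.c_int),
                ("_vals", ct.POINTER(ct.c_float))]

    @property
    def vals(self):
        # Expose the pointed-to memory as a ctypes array (a view, not a copy).
        array_type = ct.c_float * self._nfeats
        return ct.POINTER(array_type)(self._vals.contents)[0]

    def vals_to_bytes(self):
        # Copy the pointed-to memory into bytes, e.g. for pickling.
        return ct.string_at(self._vals, self._nfeats * ct.sizeof(ct.c_float))

    def set_vals_from_bytes(self, data):
        # Rebuild the array from bytes and assign it to the pointer field;
        # the structure keeps a reference to the new array.
        n = len(data) // ct.sizeof(ct.c_float)
        self._vals = (ct.c_float * n).from_buffer_copy(data)
        self._nfeats = n

payload = bytes((ct.c_float * 3)(1, 2, 3))
s = Sample()
s.set_vals_from_bytes(payload)
print(list(s.vals))  # [1.0, 2.0, 3.0]
```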
History
Date                 User         Action  Args
2022-04-11 14:58:32  admin        set     github: 71461
2021-03-19 04:35:04  eryksun      set     versions: + Python 3.10, - Python 3.6
2016-07-10 18:10:35  memeplex     set     messages: + msg270118
2016-07-09 19:40:58  eryksun      set     messages: + msg270063
2016-07-09 17:36:48  memeplex     set     messages: + msg270057
2016-06-09 15:47:54  memeplex     set     messages: + msg268034
2016-06-09 15:36:57  memeplex     set     messages: + msg268032
2016-06-09 08:32:49  eryksun      set     nosy: + eryksun; messages: + msg267992
2016-06-09 08:07:36  SilentGhost  set     nosy: + amaury.forgeotdarc, belopolsky, meador.inge; versions: + Python 3.6
2016-06-09 03:39:11  memeplex     set     messages: + msg267954
2016-06-09 03:17:38  memeplex     create