Issue27274
Created on 2016-06-09 03:17 by memeplex, last changed 2022-04-11 14:58 by admin.
Messages (8) | |||
---|---|---|---|
msg267951 - (view) | Author: Memeplex (memeplex) | Date: 2016-06-09 03:17 | |
This real-life example is pretty terrible:

    (ct.c_float * self._nfeats).from_address(ct.addressof(self._vals.contents))

The alternative of casting the pointer to pointer-to-array and then picking ptr.contents is not really better. What about a from_pointer(ptr) method? Or overloading from_address to take a pointer? Or a simple shortcut to get the address pointed to by a pointer (related: https://bugs.python.org/issue26565)? I think this part of the ctypes API needs to become more concise and Pythonic. |
|||
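For readers comparing the two idioms mentioned in the message above, here is a minimal sketch. The names `backing`, `vals`, and `nfeats` are hypothetical stand-ins for the reporter's `self._vals` and `self._nfeats`; the original class is not shown in the thread.

```python
import ctypes as ct

# Hypothetical stand-ins for the reporter's `_vals` pointer and `_nfeats` length.
nfeats = 4
backing = (ct.c_float * nfeats)(1.0, 2.0, 3.0, 4.0)
vals = ct.cast(backing, ct.POINTER(ct.c_float))

# Idiom 1: build the array type and map it onto the pointed-to address.
arr1 = (ct.c_float * nfeats).from_address(ct.addressof(vals.contents))

# Idiom 2: cast to pointer-to-array, then dereference with .contents.
arr2 = ct.cast(vals, ct.POINTER(ct.c_float * nfeats)).contents

# Both arrays are views of the same memory as `backing`, not copies.
same_memory = ct.addressof(arr1) == ct.addressof(arr2) == ct.addressof(backing)
```

Both forms yield a view rather than a copy, so either one only stays valid while the underlying memory does.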
msg267954 - (view) | Author: Memeplex (memeplex) | Date: 2016-06-09 03:39 | |
I would like to add some information about my use case. Many C structs have pointers to arrays of data plus some field indicating the length of those arrays. Sometimes I need to pickle that kind of struct, and a bytes object has to somehow be created from each pointer, given the length (the alternative ptr[:len] is too expensive for large arrays). So I need to cast the pointer to a ctypes array first and then convert the array to bytes (sadly, there is no way to pickle a memoryview, so a copy is unavoidable). |
|||
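The use case described above might look roughly like the following sketch. The `Samples` struct and its field names are assumptions made for illustration; only the bytes payload is pickled here, since ctypes objects containing pointers cannot be pickled directly.

```python
import ctypes as ct
import pickle

class Samples(ct.Structure):
    # Hypothetical struct: a pointer to data plus a length field.
    _fields_ = [("data", ct.POINTER(ct.c_int)), ("length", ct.c_size_t)]

backing = (ct.c_int * 5)(10, 20, 30, 40, 50)
s = Samples(ct.cast(backing, ct.POINTER(ct.c_int)), len(backing))

# View the pointed-to memory as an array, then copy it into a bytes object.
view = (ct.c_int * s.length).from_address(ct.addressof(s.data.contents))
payload = bytes(view)

# bytes objects pickle without trouble; rebuild an owning array on the way back.
restored_bytes = pickle.loads(pickle.dumps(payload))
count = len(restored_bytes) // ct.sizeof(ct.c_int)
restored = (ct.c_int * count).from_buffer_copy(restored_bytes)
```

The array built by `from_buffer_copy` owns its memory, so it remains valid after `backing` goes away.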
msg267992 - (view) | Author: Eryk Sun (eryksun) * | Date: 2016-06-09 08:32 | |
Adding from_pointer is probably a good idea. That said, there's a simpler way to get a bytes copy for a pointer. Say that you have a pointer p for the following array:

    >>> a = (c_float * 3)(1, 2, 3)
    >>> bytes(a)
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'
    >>> p = POINTER(c_float)(a)

IMO, the most straightforward way to get a bytes copy is to call string_at:

    >>> string_at(p, sizeof(p.contents) * 3)
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'

In 3.x string_at uses the FFI machinery to call the following function:

    static PyObject *
    string_at(const char *ptr, int size)
    {
        if (size == -1)
            return PyBytes_FromStringAndSize(ptr, strlen(ptr));
        return PyBytes_FromStringAndSize(ptr, size);
    }

The first argument can be any type accepted by c_void_p.from_param, such as a ctypes pointer/array, str, bytes, or an integer address.

Alternatively, note that pointer instantiation is the same as setting the `contents` value, which accepts any ctypes data object. Here's the C code that implements this assignment:

    dst = (CDataObject *)value;
    *(void **)self->b_ptr = dst->b_ptr;

The b_ptr field points at the buffer of the ctypes data object. Thus you can cast p to a char pointer without even calling the cast() function, which avoids an FFI call:

    >>> bp = POINTER(c_char)(p.contents)

Slicing c_char and c_wchar pointers is special-cased to return bytes and str, respectively. So you can slice bp to get bytes:

    >>> bp[:sizeof(p.contents) * 3]
    b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@' |
|||
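To illustrate the point above that the first argument of string_at goes through c_void_p.from_param, here is a small sketch that passes a pointer, an array, an integer address, and a bytes object. The variable names are illustrative only.

```python
import ctypes as ct

a = (ct.c_float * 3)(1, 2, 3)
p = ct.POINTER(ct.c_float)(a)
nbytes = ct.sizeof(a)

# The same memory, reached through three kinds of "address-like" argument.
from_pointer = ct.string_at(p, nbytes)
from_array = ct.string_at(a, nbytes)
from_int = ct.string_at(ct.addressof(a), nbytes)

# A bytes object is also accepted as the source address.
from_bytes = ct.string_at(b"abc", 3)
```

Each call returns an independent bytes copy, so the results stay valid regardless of what later happens to `a`.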
msg268032 - (view) | Author: Memeplex (memeplex) | Date: 2016-06-09 15:36 | |
Thank you for the great tips, Eryk. Somehow I overlooked string_at while reading the docs. Now, given that the address parameter of string_at is pretty overloaded, wouldn't it be reasonable to overload from_address in the same way instead of introducing from_pointer? That is, everywhere an address is expected, an address-like ctypes object would be OK. |
|||
msg268034 - (view) | Author: Memeplex (memeplex) | Date: 2016-06-09 15:47 | |
> The first argument can be any type accepted by c_void_p.from_param, such as a ctypes pointer/array, str, bytes, or an integer address.

Now I see why you suggested ptr.as_void in 26565. The two issues are closely related. Some functions are overloaded in the sense that they automatically call c_void_p.from_param on their arguments; other functions aren't. Uniformity would dictate treating all functions that expect an address the same way. Anyway, this would not cover all use cases for ptr.toaddress or ptr.as_void or whatever it ends up being called. |
|||
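Until something like ptr.toaddress or ptr.as_void exists, the address a pointer holds can be read by casting to c_void_p. The sketch below is one way to wrap that; `pointee_address` is a hypothetical helper name, not part of ctypes.

```python
import ctypes as ct

def pointee_address(ptr):
    """Hypothetical helper: integer address stored in a ctypes pointer.

    Returns None for a NULL pointer, because c_void_p.value is None for 0.
    """
    return ct.cast(ptr, ct.c_void_p).value

a = (ct.c_int * 3)(1, 2, 3)
p = ct.POINTER(ct.c_int)(a)

addr = pointee_address(p)
null_addr = pointee_address(ct.POINTER(ct.c_int)())

# The integer can be handed to APIs that only accept a plain address.
arr = (ct.c_int * 3).from_address(addr)
```

Unlike `addressof(ptr.contents)`, this form does not raise on a NULL pointer, which may or may not be what a caller wants.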
msg270057 - (view) | Author: Memeplex (memeplex) | Date: 2016-07-09 17:36 | |
I have been happily using this helper function:

    import ctypes as ct

    def c_array(*args):
        if type(args[1]) is int:
            # c_array(ptr, size): view the pointed-to memory as an array.
            ptr, size = args
            return (ptr._type_ * size).from_address(ct.addressof(ptr.contents))
        else:
            # c_array(c_type, buf): copy a bytes-like buffer into a new array.
            c_type, buf = args
            return (c_type * (len(buf) // ct.sizeof(c_type))).from_buffer_copy(buf)

For example:

    c_array(ptr_obj, 10)
    c_array(c_int, bytes_obj) |
|||
msg270063 - (view) | Author: Eryk Sun (eryksun) * | Date: 2016-07-09 19:40 | |
If your goal is to get a bytes object, I don't see the point of creating an array. string_at is simpler and more efficient.

If you really must create an array, note that simple pointers (c_void_p, c_char_p, c_wchar_p) need special handling. They don't have a `contents` attribute, and their _type_ is a string.

Also, I think combining from_address and addressof is a bit convoluted. I think it's cleaner to implement this using an array pointer:

    >>> ptr = POINTER(c_int)((c_int * 3)(1,2,3))
    >>> arr = POINTER(ptr._type_ * 3)(ptr.contents)[0]
    >>> arr[:]
    [1, 2, 3]

This also keeps the underlying ctypes object(s) properly referenced:

    >>> arr._b_base_
    <__main__.LP_c_int_Array_3 object at 0x7fb28471cd90>
    >>> arr._b_base_._objects['0']['1']
    <__main__.c_int_Array_3 object at 0x7fb28477da60>

whereas using from_address creates an array that dangerously doesn't own its own data and doesn't keep a reference to the owner:

    >>> arr2 = (ptr._type_ * 3).from_address(addressof(ptr.contents))
    >>> arr2._b_needsfree_
    0
    >>> arr2._b_base_ is None
    True
    >>> arr2._objects is None
    True

Let's create a larger array to ensure it's using an mmap region instead of the heap. This ensures a segfault when trying to access the memory block after it's deallocated:

    >>> ptr = POINTER(c_int)((c_int * 100000)(*range(100000)))
    >>> arr = (ptr._type_ * 100000).from_address(addressof(ptr.contents))
    >>> del ptr
    >>> x = arr[:]
    Segmentation fault (core dumped)

whereas using a dereferenced array pointer keeps the source data alive:

    >>> ptr = POINTER(c_int)((c_int * 100000)(*range(100000)))
    >>> arr = POINTER(ptr._type_ * 100000)(ptr.contents)[0]
    >>> del ptr
    >>> x = arr[:]
    >>> x[-5:]
    [99995, 99996, 99997, 99998, 99999] |
|||
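The array-pointer approach from the message above could be packaged as a small function. This is only a sketch: `array_from_pointer` is a hypothetical name, and it covers only POINTER(...) instances, not the simple pointer types (c_void_p, c_char_p, c_wchar_p) that the message says need special handling.

```python
import ctypes as ct

def array_from_pointer(ptr, size):
    """Hypothetical helper: view `size` items behind `ptr` as a ctypes array.

    Dereferences a pointer-to-array so the result keeps a reference chain
    back to the pointer (via _b_base_), unlike from_address.
    """
    return ct.POINTER(ptr._type_ * size)(ptr.contents)[0]

src = (ct.c_int * 3)(1, 2, 3)
ptr = ct.POINTER(ct.c_int)(src)

safe = array_from_pointer(ptr, 3)
unsafe = (ptr._type_ * 3).from_address(ct.addressof(ptr.contents))

# Both are views of the same memory, so a write through `src` shows up in each.
src[0] = 42
```

The difference is only in lifetime tracking: `safe._b_base_` is set, while `unsafe._b_base_` and `unsafe._objects` are both None.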
msg270118 - (view) | Author: Memeplex (memeplex) | Date: 2016-07-10 18:10 | |
As usual, thank you for the detailed and informative answer, Eryk. I think I understand your points, but I decided to do it the way I did because:

1. I sometimes need the array itself. For example, some of my classes contain or inherit from a ctypes structure with pointers (to an array of memory). Usually I name these pointers with a leading underscore and expose them as properties returning ctypes arrays.
2. For pickling/unpickling, ctypes arrays provide a convenient middle point between bytes objects and ctypes pointers. Getting a bytes object from an array is as easy as calling bytes() on it. OTOH, the array can be directly assigned to a compatible pointer structure field.
3. While the overloaded c_array(ptr, size)/c_array(type, bytes) is not the most efficient API to get bytes from a pointer and vice versa, it's very simple for the range of use cases (1 and 2) it covers. Nevertheless, I have benchmarked the performance and it's not that terrible. |
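Points 1 and 2 above might be sketched as follows. The `Features` class and its field names are assumptions loosely modeled on the `_vals`/`_nfeats` names from the first message, and the getter uses the array-pointer form suggested earlier in the thread rather than the reporter's own helper. It does not handle a NULL `_vals`.

```python
import ctypes as ct

class Features(ct.Structure):
    # Hypothetical structure: underscore-prefixed raw fields, exposed via a property.
    _fields_ = [("_vals", ct.POINTER(ct.c_float)), ("_nfeats", ct.c_int)]

    @property
    def vals(self):
        # View the pointed-to memory as a ctypes array of the recorded length.
        return ct.POINTER(ct.c_float * self._nfeats)(self._vals.contents)[0]

    @vals.setter
    def vals(self, arr):
        # A compatible array can be assigned directly to a pointer field.
        self._vals = arr
        self._nfeats = len(arr)

f = Features()
f.vals = (ct.c_float * 3)(1.5, 2.5, 3.5)

# bytes() gives a picklable copy; from_buffer_copy turns it back into an array.
payload = bytes(f.vals)
rebuilt = (ct.c_float * 3).from_buffer_copy(payload)
```

Assigning the array to the pointer field also stores a reference to it in the structure, which keeps the data alive for as long as `f` exists.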
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:32 | admin | set | github: 71461 |
2021-03-19 04:35:04 | eryksun | set | versions: + Python 3.10, - Python 3.6 |
2016-07-10 18:10:35 | memeplex | set | messages: + msg270118 |
2016-07-09 19:40:58 | eryksun | set | messages: + msg270063 |
2016-07-09 17:36:48 | memeplex | set | messages: + msg270057 |
2016-06-09 15:47:54 | memeplex | set | messages: + msg268034 |
2016-06-09 15:36:57 | memeplex | set | messages: + msg268032 |
2016-06-09 08:32:49 | eryksun | set | nosy: + eryksun; messages: + msg267992 |
2016-06-09 08:07:36 | SilentGhost | set | nosy: + amaury.forgeotdarc, belopolsky, meador.inge; versions: + Python 3.6 |
2016-06-09 03:39:11 | memeplex | set | messages: + msg267954 |
2016-06-09 03:17:38 | memeplex | create |