classification
Title: ctypes works incorrectly with _swappedbytes_ = 1
Type: behavior Stage: resolved
Components: ctypes Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Pavel.Boldin, meador.inge
Priority: normal Keywords:

Created on 2011-09-09 13:54 by Pavel.Boldin, last changed 2011-11-29 02:29 by meador.inge. This issue is now closed.

Files
File name Uploaded Description
test_ctypes.py Pavel.Boldin, 2011-09-09 13:54
Messages (9)
msg143761 - Author: Pavel Boldin (Pavel.Boldin) Date: 2011-09-09 13:54
ctypes seems to work incorrectly with _swappedbytes_ specified.

For example, it reads some values from the buffer incorrectly:

import ctypes

class X(ctypes.Structure):
    _swappedbytes_ = 1
    _pack_ = 1
    _fields_ = [
        ('a', ctypes.c_ubyte, 4),
        ('b', ctypes.c_ubyte, 4),
        ('c', ctypes.c_ushort, 8),
        ('d', ctypes.c_ushort, 8),
    ]

buf = '\x12\x34\x56\x78'
x = X.from_buffer_copy(buf)

print x.a == 1
print x.b == 2
print x.c == 3
print x.d == 4

This prints
True
True
False
False

whereas four 'True' values are expected.
msg143818 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-09 23:53
I can reproduce this on Fedora 15 with the Python tip revision.  I am investigating the behavior now.
msg143847 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-11 01:41
Pavel, I looked into this a little more and have some ideas.  First off, '_swappedbytes_' is an undocumented implementation detail that is used to implement the LittleEndianStructure and BigEndianStructure types.  So using it directly like that is not expected to work.  If you want to change the endianness of the layout, then use LittleEndianStructure or BigEndianStructure instead.
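
A minimal sketch of that suggestion, assuming Python 3 and using whole 16-bit fields rather than bit-fields, so the result does not depend on the compiler's bit-field rules.  The class and field names (BigPair, LittlePair, hi, lo) are made up for illustration:

```python
import ctypes

# Hypothetical structures: two plain 16-bit fields, no bit-fields.
class BigPair(ctypes.BigEndianStructure):
    _fields_ = [("hi", ctypes.c_uint16), ("lo", ctypes.c_uint16)]

class LittlePair(ctypes.LittleEndianStructure):
    _fields_ = [("hi", ctypes.c_uint16), ("lo", ctypes.c_uint16)]

buf = b"\x12\x34\x56\x78"
big = BigPair.from_buffer_copy(buf)
little = LittlePair.from_buffer_copy(buf)

print(hex(big.hi), hex(big.lo))        # 0x1234 0x5678
print(hex(little.hi), hex(little.lo))  # 0x3412 0x7856
```

The byte order is fixed by the base class, so the same code should give the same values regardless of the host's native endianness.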

Next, consider your test case when removing '_swappedbytes_' and using a LittleEndianStructure.  I am using a Linux hosted Python which lays out the structures in a manner similar to GCC.  This gives a layout like:

 ---------------------------------------
 | unsigned short   | unsigned short   |
 ---------------------------------------
 | bbbbaaaa cccccccc dddddddd ........ |
 ---------------------------------------
 | 00010010 00110100 01010110 01111000 |
 ---------------------------------------

'a', 'b', and 'c' all get "expanded" into one 'unsigned short'.  Then 'd' has to go in an 'unsigned short' as well, leaving one byte of don't-care bits.

Similarly, the big endian layout looks like:

 ---------------------------------------
 | unsigned short   | unsigned short   |
 ---------------------------------------
 | aaaabbbb cccccccc dddddddd ........ |
 ---------------------------------------
 | 00010010 00110100 01010110 01111000 |
 ---------------------------------------

All of this is really a roundabout way of saying that the documentation for ctypes structure layout stinks.  issue12880 has been opened to fix it.

Does this seem reasonable to you?
msg143849 - Author: Pavel Boldin (Pavel.Boldin) Date: 2011-09-11 01:54
Yes. Thanks. But here is another error:

import ctypes


class X(ctypes.Structure):
    _pack_ = 1
    _fields_ = [
        ('a', ctypes.c_ubyte, 4),
        ('b', ctypes.c_ubyte, 4),
        ('c', ctypes.c_ushort, 4),
        ('d', ctypes.c_ushort, 12),
    ]

buf = '\x12\x34\x56\x78'
x = X.from_buffer_copy(buf)

print X.a
print X.b
print X.c
print X.d

print x.a == 2
print x.b == 1
print x.c == 4
print x.d == 0x563


Prints (python 2.7.1):
True
True
True
False

Can you reproduce this?
msg143850 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-11 02:59
Yes I can.  This seems strange, but it is correct.  The little endian case looks like:

 Little endian
 ---------------------------------------
 | unsigned short   | unsigned short   |
 ---------------------------------------
 | bbbbaaaa ....cccc dddddddd ....dddd |
 ---------------------------------------
 | 00010010 00110100 01010110 01111000 |
 ---------------------------------------

where the 'd' bits pack from left to right, so '1000 01010110'.
The big endian case looks like:

 Big endian
 ---------------------------------------
 | unsigned short   | unsigned short   |
 ---------------------------------------
 | aaaabbbb cccc.... dddddddd dddd.... |
 ---------------------------------------
 | 00010010 00110100 01010110 01111000 |
 ---------------------------------------

where the 'd' bits pack from right to left, so '01010110 0111'.
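
The two diagrams can be checked by hand with plain integer arithmetic.  The sketch below is not how ctypes is implemented; it only assumes the GCC-style rule described above (each field is allocated inside a 16-bit storage unit, low bits first for little endian, high bits first for big endian).  split_unit is a hypothetical helper:

```python
def split_unit(unit, unit_bits, widths, msb_first):
    """Split one storage unit into consecutive bit-fields."""
    values = []
    used = 0
    for width in widths:
        # Allocate from the low end (little endian) or the high end (big endian).
        shift = unit_bits - used - width if msb_first else used
        values.append((unit >> shift) & ((1 << width) - 1))
        used += width
    return values

buf = b"\x12\x34\x56\x78"

# Little endian: the two storage units read as 0x3412 and 0x7856.
le = (split_unit(int.from_bytes(buf[:2], "little"), 16, [4, 4, 4], False)
      + split_unit(int.from_bytes(buf[2:], "little"), 16, [12], False))

# Big endian: the two storage units read as 0x1234 and 0x5678.
be = (split_unit(int.from_bytes(buf[:2], "big"), 16, [4, 4, 4], True)
      + split_unit(int.from_bytes(buf[2:], "big"), 16, [12], True))

print([hex(v) for v in le])  # ['0x2', '0x1', '0x4', '0x856']
print([hex(v) for v in be])  # ['0x1', '0x2', '0x3', '0x567']
```

In both cases 'd' starts in a fresh storage unit, which is why it never contains the 0x563 that the test case expected.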

The native case (Structure) can typically be verified using your host C compiler.  For example, the above code can be represented in C as:

#include <stdio.h>

struct T
{
  unsigned char  a : 4;
  unsigned char  b : 4;
  unsigned short c : 4;
  unsigned short d : 12;
};

int main (int argc, char **argv)
{
  unsigned char bytes[] = {0x12, 0x34, 0x56, 0x78};
  struct T *t = (struct T*)&bytes;

  printf ("%X\n", t->a);
  printf ("%X\n", t->b);
  printf ("%X\n", t->c);
  printf ("%X\n", t->d);
}

With respect to structure layout, ctypes typically behaves the same way as the native compiler used to build the interpreter.
msg143861 - Author: Pavel Boldin (Pavel.Boldin) Date: 2011-09-11 12:17
OK. So it seems ctypes works as designed, just not for my needs.

The thing that still bothers me is the strange code where 'size' contains either the size (when bitsize == 0) or the bitsize in the upper 16 bits and the bitoffset in the lower 16 bits.
msg143882 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-12 03:19
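
To illustrate the encoding being described, here is a small sketch, assuming exactly the layout stated above (bitsize in the upper 16 bits, bitoffset in the lower 16 bits).  pack_size and unpack_size are hypothetical helper names, not ctypes functions:

```python
def pack_size(bitsize, bitoffset):
    """Combine a bit-field width and offset into one overloaded 'size' value."""
    return (bitsize << 16) | bitoffset

def unpack_size(size):
    """Recover (bitsize, bitoffset) from an overloaded 'size' value."""
    return size >> 16, size & 0xFFFF

packed = pack_size(12, 4)
print(hex(packed))          # 0xc0004
print(unpack_size(packed))  # (12, 4)
```

A reader of such a value cannot tell from the number alone whether it is a plain byte size or a packed pair, which is the ambiguity being complained about.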
Would you mind explaining your use case and why ctypes won't fit it?  Maybe there is something that can be fixed.

FWIW, I agree that the overloading of 'size' is unnecessary.
msg143891 - Author: Pavel Boldin (Pavel.Boldin) Date: 2011-09-12 09:44
We have raw data packages from some tools. These packages contain bitfields, arrays, simple data and so on.

We want to parse them into Python objects (structures) for analysis and storage. I tried to use ctypes, but I have now written my own implementation of a raw parser based on bitarray and struct.

I wonder if ctypes can do this.
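
For comparison, a minimal sketch of that kind of parser, using only int.from_bytes instead of bitarray (an assumption; the actual implementation was not posted).  It treats the whole buffer as one little-endian bit stream with no storage-unit alignment, which yields the d == 0x563 that the second test case expected.  read_fields is a hypothetical name:

```python
def read_fields(data, widths):
    """Read consecutive bit-fields, lowest bits first, from a little-endian buffer."""
    stream = int.from_bytes(data, "little")
    values = []
    offset = 0
    for width in widths:
        values.append((stream >> offset) & ((1 << width) - 1))
        offset += width
    return values

a, b, c, d = read_fields(b"\x12\x34\x56\x78", [4, 4, 4, 12])
print(a, b, c, hex(d))  # 2 1 4 0x563
```

Because fields here may straddle 16-bit boundaries, this layout differs from what a C compiler (and therefore ctypes) produces for the same field list.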
msg148534 - Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-11-29 02:29
Without seeing a specific example of what you are trying to do, it is hard to tell whether ctypes would be a good fit.  I am closing this issue since the original questions have been answered.  Please open a new issue if you think ctypes could be modified to support your use cases.
History
Date                 User          Action  Args
2011-11-29 02:29:01  meador.inge   set     status: open -> closed; resolution: not a bug; messages: + msg148534; stage: needs patch -> resolved
2011-09-12 09:44:45  Pavel.Boldin  set     messages: + msg143891
2011-09-12 03:19:42  meador.inge   set     messages: + msg143882
2011-09-11 12:17:43  Pavel.Boldin  set     messages: + msg143861
2011-09-11 02:59:21  meador.inge   set     messages: + msg143850
2011-09-11 01:54:58  Pavel.Boldin  set     messages: + msg143849
2011-09-11 01:41:09  meador.inge   set     messages: + msg143847
2011-09-09 23:53:44  meador.inge   set     versions: + Python 3.2, Python 3.3; nosy: + meador.inge; messages: + msg143818; type: behavior; stage: needs patch
2011-09-09 13:54:48  Pavel.Boldin  create