This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [Linux] ctypes packs bitfields Incorrectly
Type: behavior Stage: patch review
Components: ctypes Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Charles Machalow, FFY00, amaury.forgeotdarc, belopolsky, eryksun, jaraco, karlding, meador.inge, miss-islington, mleroy003, pablogsal, shihai1991, terry.reedy, thesamprice, ztane
Priority: normal Keywords: patch

Created on 2017-03-08 00:36 by Charles Machalow, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
test.py Charles Machalow, 2017-03-08 00:36 File to test with
ctypes-bitfields-bug.tar.gz mleroy003, 2018-04-01 16:18 tar file containing a test program in C and Python 3, plus a makefile. to tets : make rebuild ; make test
Pull Requests
URL Status Linked Edit
PR 19850 merged FFY00, 2020-05-02 00:54
PR 27085 merged FFY00, 2021-07-11 16:15
PR 27086 merged miss-islington, 2021-07-11 16:44
Messages (21)
msg289193 - (view) Author: Charles Machalow (Charles Machalow) Date: 2017-03-08 00:36
There appears to be a bug related to sizing/packing of ctypes Structures on Linux. I'm not quite sure how, but this structure:

class MyStructure(Structure):
    _pack_      = 1
    _fields_    = [ 
                      ("P",       c_uint16),    # 2 Bytes
                      ("L",       c_uint16, 9),
                      ("Pro",     c_uint16, 1),
                      ("G",       c_uint16, 1),
                      ("IB",      c_uint16, 1),
                      ("IR",      c_uint16, 1),
                      ("R",       c_uint16, 3), # 4 Bytes
                      ("T",       c_uint32, 10),
                      ("C",       c_uint32, 20),
                      ("R2",      c_uint32, 2)  # 8 Bytes
                  ]

Gives back a sizeof of 8 on Windows and 10 on Linux. The inconsistency makes it difficult to have code work cross-platform.

Running the given test.py file will print out the size of the structure on your platform.

Tested with Python 2.7.6 and Python 3.4.3 (builtin to Ubuntu 14.04), Python 2.7.13, (built from source) both on Ubuntu 14.04. On Linux all Python builds were 32 bit.

On Windows I tried with 2.7.7 (both 32 and 64 bit).

I believe on both platforms it should return a sizeof 8.
msg289209 - (view) Author: Charles Machalow (Charles Machalow) Date: 2017-03-08 07:19
Took a quick look at the c code for this. The area at fault appears to be this section in cfield.c:

#ifndef MS_WIN32
    } else if (bitsize /* this is a bitfield request */
        && *pfield_size /* we have a bitfield open */
        && dict->size * 8 >= *pfield_size
        && (*pbitofs + bitsize) <= dict->size * 8) {
        /* expand bit field */
        fieldtype = EXPAND_BITFIELD;
#endif

The problem seems to be after the end of the 2nd c_uint16, it seems to think that the next 10 bytes should extend that c_uint16 to a c_uint32 instead of taking the type as the beginning of a new c_uint32. So I guess it is making the structure something like this:

                      ("P",       c_uint16),
                      ("L",       c_uint32, 9),
                      ("Pro",     c_uint32, 1),
                      ("G",       c_uint32, 1),
                      ("IB",      c_uint32, 1),
                      ("IR",      c_uint32, 1),
                      ("R",       c_uint32, 3),
                      ("T",       c_uint32, 10),
                      # And now this needs an extra 6 bits of padding to fill the c_uint32
                      ("C",       c_uint32, 20),
                      ("R2",      c_uint32, 2)
                      # And now this needs an extra 10 bits of padding to fill the c_uint32.

I guess that is how we get to 10... instead of the expected 8 bytes. I don't believe that this behavior is desired nor really makes logical sense.
msg289211 - (view) Author: Charles Machalow (Charles Machalow) Date: 2017-03-08 07:23
Some more debug with print statements in the c code seems to confirm my suspicion:

bitsize, pfield_size, bitofs, dict->size prints were added just above the if chain to determine fieldtype and fieldtype was just after that if chain. That code looks something like this:

printf("bitsize: %d\n" , (int)bitsize);
printf("pfield_size: %u\n", (size_t)*pfield_size);
printf("bitofs: %d\n", (int)*pbitofs);
printf("dict->size: %d\n", (int)dict->size);
    if (bitsize /* this is a bitfield request */
        && *pfield_size /* we have a bitfield open */
#ifdef MS_WIN32
        /* MSVC, GCC with -mms-bitfields */
        && dict->size * 8 == *pfield_size
#else
        /* GCC */
        && dict->size * 8 <= *pfield_size
#endif
        && (*pbitofs + bitsize) <= *pfield_size) {
        /* continue bit field */
        fieldtype = CONT_BITFIELD;
#ifndef MS_WIN32
    } else if (bitsize /* this is a bitfield request */
        && *pfield_size /* we have a bitfield open */
        && dict->size * 8 >= *pfield_size
        && (*pbitofs + bitsize) <= dict->size * 8) {
        /* expand bit field */
        fieldtype = EXPAND_BITFIELD;
#endif
    } else if (bitsize) {
        /* start new bitfield */
        fieldtype = NEW_BITFIELD;
        *pbitofs = 0;
        *pfield_size = dict->size * 8;
    } else {
        /* not a bit field */
        fieldtype = NO_BITFIELD;
        *pbitofs = 0;
        *pfield_size = 0;
    }

printf("Fieldtype: %d\n------\n", fieldtype);

And the run with the custom-built Python 2.7.13:

>>> from test import *
bitsize: 0
pfield_size: 0
bitofs: 255918304
dict->size: 2
Fieldtype: 0
------
bitsize: 9
pfield_size: 0
bitofs: 0
dict->size: 2
Fieldtype: 1
------
bitsize: 1
pfield_size: 16
bitofs: 9
dict->size: 2
Fieldtype: 2
------
bitsize: 1
pfield_size: 16
bitofs: 10
dict->size: 2
Fieldtype: 2
------
bitsize: 1
pfield_size: 16
bitofs: 11
dict->size: 2
Fieldtype: 2
------
bitsize: 1
pfield_size: 16
bitofs: 12
dict->size: 2
Fieldtype: 2
------
bitsize: 3
pfield_size: 16
bitofs: 13
dict->size: 2
Fieldtype: 2
------
bitsize: 10
pfield_size: 16
bitofs: 16
dict->size: 4
Fieldtype: 3
------
bitsize: 20
pfield_size: 32
bitofs: 26
dict->size: 4
Fieldtype: 1
------
bitsize: 2
pfield_size: 32
bitofs: 20
dict->size: 4
Fieldtype: 2
------
10
>>> MyStructure.P
<Field type=c_ushort, ofs=0, size=2>
>>> MyStructure.T
<Field type=c_uint, ofs=2:16, bits=10>
>>> MyStructure.R
<Field type=c_ushort, ofs=2:13, bits=3>
>>> MyStructure.P
<Field type=c_ushort, ofs=0, size=2>
>>> MyStructure.L
<Field type=c_ushort, ofs=2:0, bits=9>
>>> MyStructure.Pro
<Field type=c_ushort, ofs=2:9, bits=1>
>>> MyStructure.R
<Field type=c_ushort, ofs=2:13, bits=3>
>>> MyStructure.T
<Field type=c_uint, ofs=2:16, bits=10>
>>> MyStructure.C
<Field type=c_uint, ofs=6:0, bits=20>
msg289212 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-03-08 07:33
To make it simpler to diagram the fields, I've rewritten your structure using field names A-J:

    import ctypes

    class MyStructure(ctypes.Structure):
        _pack_ = 1
        _fields_ = (('A', ctypes.c_uint16),     # 2 bytes
                    ('B', ctypes.c_uint16, 9),
                    ('C', ctypes.c_uint16, 1),
                    ('D', ctypes.c_uint16, 1),
                    ('E', ctypes.c_uint16, 1),
                    ('F', ctypes.c_uint16, 1),
                    ('G', ctypes.c_uint16, 3),  # 4 bytes
                    ('H', ctypes.c_uint32, 10),
                    ('I', ctypes.c_uint32, 20),
                    ('J', ctypes.c_uint32, 2))  # 8 bytes

ctypes is attempting to extend the bitfield for H by switching to a c_uint storage unit with an offset of 2 bytes and adding H field with a 16-bit offset:

    >>> MyStructure.H
    <Field type=c_uint, ofs=2:16, bits=10>

Here's the correct layout:

    |     uint16    |     uint16    |            uint32             
    |------16-------|------16-------|---10----|--------20---------|-
    A---------------B--------CDEFG--H---------I-------------------J-

and here's the layout that ctypes creates, which wastes 2 bytes:

                    |             uint32            |
    |     uint16    |     uint16    |               |             uint32            
    |------16-------|------16-------|---10----|--6--|--------20---------|-|---10----
    A---------------B--------CDEFG--H---------------I-------------------J-----------

The current strategy for extending bitfields to work like gcc on Linux is obviously failing us here. Hopefully someone can flesh out the exact rules to make this code accurate. Bitfields are such a pain.
msg301020 - (view) Author: Antti Haapala (ztane) * Date: 2017-08-30 09:51
To Charles first: "Gives back a sizeof of 8 on Windows and 10 on Linux. The inconsistency makes it difficult to have code work cross-platform." 

The bitfields in particular and ctypes in general have *never* been meant to be cross-platform - instead they just must need to match the particular C compiler behaviour of the platform, thus the use of these for cross platform work is ill-advised - perhaps you should just use the struct module instead.

However, that said, on Linux, sizeof these structures - packed or not - do not match the output from GCC; unpacked one has sizeof 12 and packed 10 on my Python 3.5, but they're both 8 bytes on GCC. This is a real bug.

GCC says that the bitfield behaviour is: https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html

Whether a bit-field can straddle a storage-unit boundary (C90 6.5.2.1, C99 and C11 6.7.2.1).

Determined by ABI.
The order of allocation of bit-fields within a unit (C90 6.5.2.1, C99 and C11 6.7.2.1).

Determined by ABI.
The alignment of non-bit-field members of structures (C90 6.5.2.1, C99 and C11 6.7.2.1).

Determined by ABI. 

Thus, the actual behaviour need to be checked from the API documentation of the relevant platform. However - at least for unpacked structs - the x86-64 behaviour is that a bitfield may not cross an addressable unit.
msg306410 - (view) Author: Charles Machalow (Charles Machalow) Date: 2017-11-17 04:49
Antti, is there a place in the ctypes documentation that explicitly says ctypes is not meant to be used cross-platform? If not, shouldn't that be mentioned?

I think ultimately ctypes should default to standard OS/compiler behavior, but should allow the flexibility for one to override for cross-platform interchangeability.

In the C++ code here, the code is explicitly checking if Windows (or GCC) to choose behavior. In theory, that could be changed to allow runtime-choice for packing behavior. 

Of course, that is probably best for a feature issue. For this one, I'd just like to have the behavior on Linux match GCC, as that is the definitive bug here as our example has shown.
msg306434 - (view) Author: Antti Haapala (ztane) * Date: 2017-11-17 12:50
"Antti, is there a place in the ctypes documentation that explicitly says ctypes is not meant to be used cross-platform? If not, shouldn't that be mentioned?"

I don't know about that, but the thing is nowhere does it say that it is meant to be used cross-platform. It just says it allows defining C types. It is somewhat implied that C types are not cross-platform at binary level, at all.
msg306466 - (view) Author: Charles Machalow (Charles Machalow) Date: 2017-11-18 02:59
All of Python is implicitly cross platform. If something isn't actually cross platform, it should be mentioned explicitly in the documentation. For example see the mmap documentation, it explicitly say on Unix it does X, on Windows it does Y. We should be as explicit as possible for ctypes to say on Windows we expect packing like X (GCC) and on Linux we expect packing like Y (VC++).
msg314778 - (view) Author: Marc Le Roy (mleroy003) Date: 2018-04-01 16:18
No solution found to solve this issue ?
The anomaly is not a cross platform inconsistency, it is an inconsistency between the behaviours of GCC and ctypes, both under Linux or Cygwin, when defining packed structures :

[Marc@I7-860 ~/dev/python/ctypes-bitfields-bug] make test
./bitfield_test1
sizeof(BF32) = 12 ; Memory dump of BF32 = 0xffffffffffffffffffffffff
sizeof(BF64) = 12 ; Memory dump of BF64 = 0xffffffffffffffffffffffff
python3 bitfield_test1.py
sizeof(BF32) = 16 ; Memory dump of BF32 = 0xffffff00ffffff00ffffff00ffffff00
sizeof(BF64) = 16 ; Memory dump of BF64 = 0xffffffffffff0000ffffffffffff0000
msg343998 - (view) Author: Karl Ding (karlding) * Date: 2019-05-30 20:01
I believe the example can be simplified to the following:

        class MyStructure(ctypes.Structure):
            _pack_ = 1
            _fields_ = [('A', ctypes.c_uint32, 8)]

        assert ctypes.sizeof(MyStructure) == 1

Here, ctypes.sizeof returns 4 on my machine (64-bit Ubuntu 18.04 LTS), while I would expect it to return 1 like GCC. This is using a tree checked out at 33ce3f012f249782507df896824b045b34f765aa, if it makes a difference.

If we compile the equivalent C code (on 64-bit Ubuntu 18.04 LTS):

        #include <stdio.h>
        #include <stdint.h>
        #include <stdalign.h>

        struct my_structure_uint32 {
          uint32_t a : 8;
        } __attribute__((packed));

        int main(int argc, char *argv[]) {
          printf("sizeof(struct my_structure_uint32): %d\n", sizeof(struct my_structure_uint32));
          printf("alignof(struct my_structure_uint32): %d\n", alignof(struct my_structure_uint32));
          return 0;
        }

Using the following GCC invocation:

        gcc temp.c -o test && ./test

We get the following result:

        sizeof(struct my_structure_uint32): 1
        alignof(struct my_structure_uint32): 1

However, if I compile with MSVC bitfields:

        gcc -mms-bitfields test.c -o test && ./test

We get the following result:

        sizeof(struct my_structure_uint32): 4
        alignof(struct my_structure_uint32): 1

Similarly, adding a second and third 8-bit wide bitfield member fails and returns 4, instead of the expected 2 and 3.

This is not just c_uint32, c_uint16 and c_int also exhibit the same behaviour:

        class MyStructure(ctypes.Structure):
            _pack_ = 1
            _fields_ = [('A', ctypes.c_uint16, 8)]

        assert ctypes.sizeof(MyStructure) == 1
msg363440 - (view) Author: Sam Price (thesamprice) Date: 2020-03-05 16:19
I ran into this bug on mac and submitted a duplicate issue of 39858


If you add the check *pfield_size != *pbitofs it fixes this bug.

#ifndef MS_WIN32
    } else if (bitsize /* this is a bitfield request */
        && *pfield_size /* we have a bitfield open */
        && *pfield_size != *pbitofs /* Current field has been filled, start new one */
        && dict->size * 8 >= *pfield_size
        && (*pbitofs + bitsize) <= dict->size * 8) {
        /* expand bit field */
        fieldtype = EXPAND_BITFIELD;
#endif


However this would not fix the results where you expand a bitfield around 3 bytes.
clang produces the 3 bytes if you try and expand around.
#include <stdint.h>
#include <stdio.h>
typedef struct testA{
    uint8_t a0 : 7;
    uint8_t a1 : 7;
    uint8_t a2 : 7;
    uint8_t a3 : 3;
}  __attribute__((packed)) testA;

int main(){
    printf("%d\n", sizeof(testA)); /* Prints out 3 */
    return 0;
}
python prints out a size of 4.
msg367950 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2020-05-03 01:25
My patches should resolve this fully. Please test it to make sure I did not miss any corner case.

https://github.com/python/cpython/pull/19850
msg387821 - (view) Author: miss-islington (miss-islington) Date: 2021-02-28 22:43
New changeset 0d7ad9fb38c041c46094087b0cf2c8ce44916b11 by Filipe Laíns in branch 'master':
bpo-29753: fix merging packed bitfields in ctypes struct/union (GH-19850)
https://github.com/python/cpython/commit/0d7ad9fb38c041c46094087b0cf2c8ce44916b11
msg392460 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-04-30 16:04
I forgot to ping here, the patch went in and should be available in 3.10, so this can be closed now.
msg397255 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-07-11 16:23
Unfortunately, the fix has some unforeseen side-effects, so we'll have to revert it :/
msg397257 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-07-11 16:44
New changeset e14d5ae5447ae28fc4828a9cee8e9007f9c30700 by Filipe Laíns in branch 'main':
bpo-29753: revert 0d7ad9f (GH-19850) (GH-27085)
https://github.com/python/cpython/commit/e14d5ae5447ae28fc4828a9cee8e9007f9c30700
msg397259 - (view) Author: miss-islington (miss-islington) Date: 2021-07-11 17:47
New changeset 42da46ed522157b057d73e6b623615ef6427999e by Miss Islington (bot) in branch '3.10':
bpo-29753: revert 0d7ad9f (GH-19850) (GH-27085)
https://github.com/python/cpython/commit/42da46ed522157b057d73e6b623615ef6427999e
msg397260 - (view) Author: Charles Machalow (Charles Machalow) Date: 2021-07-11 18:52
Maybe we need to add a __packing__ option to specify how packing should
work and default to legacy behavior. Then allow users to specify if they
want similar behavior cross-os.

Otherwise changing this does change packing for existing users and can lead
to breakages.

What exactly was the hit regression here?

On Sun, Jul 11, 2021, 10:47 AM miss-islington <report@bugs.python.org>
wrote:

>
> miss-islington <mariatta.wijaya+miss-islington@gmail.com> added the
> comment:
>
>
> New changeset 42da46ed522157b057d73e6b623615ef6427999e by Miss Islington
> (bot) in branch '3.10':
> bpo-29753: revert 0d7ad9f (GH-19850) (GH-27085)
>
> https://github.com/python/cpython/commit/42da46ed522157b057d73e6b623615ef6427999e
>
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue29753>
> _______________________________________
>
msg397261 - (view) Author: Sam Price (thesamprice) Date: 2021-07-11 18:57
I’ve been looking forward to this fix.  The old implementation did  not
match what my LLVM compiler generated.

On Sun, Jul 11, 2021 at 2:52 PM Charles Machalow <report@bugs.python.org>
wrote:

>
> Charles Machalow <csm10495@gmail.com> added the comment:
>
> Maybe we need to add a __packing__ option to specify how packing should
> work and default to legacy behavior. Then allow users to specify if they
> want similar behavior cross-os.
>
> Otherwise changing this does change packing for existing users and can lead
> to breakages.
>
> What exactly was the hit regression here?
>
> On Sun, Jul 11, 2021, 10:47 AM miss-islington <report@bugs.python.org>
> wrote:
>
> >
> > miss-islington <mariatta.wijaya+miss-islington@gmail.com> added the
> > comment:
> >
> >
> > New changeset 42da46ed522157b057d73e6b623615ef6427999e by Miss Islington
> > (bot) in branch '3.10':
> > bpo-29753: revert 0d7ad9f (GH-19850) (GH-27085)
> >
> >
> https://github.com/python/cpython/commit/42da46ed522157b057d73e6b623615ef6427999e
> >
> >
> > ----------
> >
> > _______________________________________
> > Python tracker <report@bugs.python.org>
> > <https://bugs.python.org/issue29753>
> > _______________________________________
> >
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue29753>
> _______________________________________
>
-- 
Thank you,

Sam Price
(707) 742-3726
msg397262 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-07-11 18:58
Sorry for not posting a link here, see https://github.com/python/cpython/pull/19850#issuecomment-869410686.

The issue is not legacy behavior, it's that the fix messed up the functionality and that was not caught by tests. I don't have much time to look into why that happened right now, but I will when I get a chance. Sorry!
msg397679 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-07-17 02:08
Charles and Sam: In the future, when responding by email, please delete the email you are responding to, except maybe for a line.  When your response is posted to web page, the quote is redundant and distracting.
History
Date User Action Args
2022-04-11 14:58:44adminsetgithub: 73939
2021-07-17 02:08:19terry.reedysetnosy: + terry.reedy

messages: + msg397679
versions: - Python 3.8
2021-07-11 18:58:48FFY00setmessages: + msg397262
2021-07-11 18:57:16thesampricesetmessages: + msg397261
2021-07-11 18:52:32Charles Machalowsetmessages: + msg397260
2021-07-11 17:47:14miss-islingtonsetmessages: + msg397259
2021-07-11 16:44:09pablogsalsetnosy: + pablogsal
messages: + msg397257
2021-07-11 16:44:08miss-islingtonsetstage: resolved -> patch review
pull_requests: + pull_request25634
2021-07-11 16:23:20FFY00setstatus: closed -> open
resolution: fixed ->
messages: + msg397255

versions: + Python 3.11
2021-07-11 16:15:39FFY00setpull_requests: + pull_request25633
2021-04-30 18:57:10jaracosetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-04-30 16:04:31FFY00setnosy: + jaraco
messages: + msg392460
2021-03-30 18:38:27eryksunsettitle: Ctypes Packing Bitfields Incorrectly - Linux -> [Linux] ctypes packs bitfields Incorrectly
versions: + Python 3.10, - Python 3.7
2021-03-12 21:11:59eryksunlinkissue15453 superseder
2021-02-28 22:43:27miss-islingtonsetnosy: + miss-islington
messages: + msg387821
2020-06-29 17:34:34pitrousetversions: + Python 3.8, Python 3.9, - Python 2.7, Python 3.4, Python 3.5, Python 3.6
2020-05-03 01:25:29FFY00setmessages: + msg367950
2020-05-02 00:54:50FFY00setkeywords: + patch
stage: patch review
pull_requests: + pull_request19167
2020-05-02 00:25:40FFY00setnosy: + FFY00
2020-03-05 16:19:18thesampricesetnosy: + thesamprice
messages: + msg363440
2019-07-15 04:39:39shihai1991setnosy: + shihai1991
2019-05-30 20:01:37karldingsetnosy: + karlding
messages: + msg343998
2018-04-01 16:18:14mleroy003setfiles: + ctypes-bitfields-bug.tar.gz

messages: + msg314778
2017-11-18 02:59:28Charles Machalowsetmessages: + msg306466
2017-11-17 12:50:18ztanesetmessages: + msg306434
2017-11-17 04:49:13Charles Machalowsetmessages: + msg306410
2017-11-16 22:00:13mleroy003setnosy: + mleroy003
2017-11-09 11:39:58berker.peksaglinkissue31987 superseder
2017-08-30 09:51:23ztanesetnosy: + ztane

messages: + msg301020
title: Ctypes Packing Incorrectly - Linux -> Ctypes Packing Bitfields Incorrectly - Linux
2017-03-08 07:34:34eryksunsetversions: + Python 3.5, Python 3.6, Python 3.7
2017-03-08 07:33:44eryksunsetnosy: + eryksun
messages: + msg289212
2017-03-08 07:24:00Charles Machalowsetmessages: + msg289211
2017-03-08 07:19:41Charles Machalowsetmessages: + msg289209
2017-03-08 00:36:40Charles Machalowcreate