Issue12880
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2011-09-02 02:34 by meador.inge, last changed 2022-04-11 14:57 by admin.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
bitfield_doc.diff | vladris, 2011-09-30 17:19 | review |
Messages (10) | |||
---|---|---|---|
msg143369 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-09-02 02:34 | |
As issues like issue6069 and issue11920 allude to, figuring out how 'ctypes' allocates bit-fields is not very clear. The documentation should be enhanced to flesh this out in more detail. As an example, Microsoft documents the VC++ implementation in a reasonably clear manner ( http://msdn.microsoft.com/en-us/library/ewwyfdbe(v=vs.71).aspx ). |
|||
msg143846 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-09-11 01:39 | |
Another example of desperately needed documentation: issue12945. |
|||
msg144696 - (view) | Author: Vlad Riscutia (vladris) | Date: 2011-09-30 17:19 | |
Attached doc update against tip (though I still hope my patch for configurable allocation strategies will make it in). This is my first doc patch so let me know if I missed something. I am basically explaining that bit field allocation is compiler-specific and no assumptions should be made of how a bitfield is allocated. I believe this is the better thing to do rather than detailing how GCC and MSVC allocated their bitfields because that would just encourage people to use this feature incorrectly. Most bugs opened on bit fields are because people are toying with the underlying buffer and get other results than what they expect. IMO, when using bitfields one should only access the structure members at a high level and not go read/write the raw memory underneath. |
|||
msg144854 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-10-04 04:21 | |
On Fri, Sep 30, 2011 at 12:19 PM, Vlad Riscutia <report@bugs.python.org> wrote: > I believe this is the better thing to do rather than detailing how GCC and MSVC allocated their bitfields because that would just > encourage people to use this feature incorrectly. So clearly documenting how a feature works will cause people to use the feature incorrectly? I think not. In any case, I agree that documenting the low-level specifics of each compiler's algorithm is too much. > Most bugs opened on bit fields are because people are toying with the underlying buffer and get other results than what they expect. The issues that I have looked at (issue6069, issue11920, and issue11920) all involve fundamental misunderstandings of *how* the structure layout is determined. I don't know if I would generalize these misunderstanding as "toying with the underlying buffer". Some times people need to know the exact layout for proper C interop. In some of the bugs reported folks are casting buffers in an attempt to discover the structure layout since it is not clearly documented. The general content of your patch seems reasonable. I will provide more specific comments shortly. |
|||
msg144855 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-10-04 04:49 | |
Added some comments in rietveld. P.S. watch out for trailing whitespace when writing patches. Use 'make patchcheck' to help find bad whitespace formatting. |
|||
msg144894 - (view) | Author: Vlad Riscutia (vladris) | Date: 2011-10-04 15:21 | |
Thanks for the "make patchcheck" tip, I didn't know about that. I will update the patch soon. In the mean time, I want to point out a couple of things: First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect. Since this is platform dependent, they shouldn't worry about the raw memory as long as C interop works fine. Bitfield layout is complex as it involves both allocation algorithm and structure packing and same Python code will work differently on Windows and Unix. My point is that documenting all this low-level stuff will encourage people to work with the raw memory which will open the door for other issues. I believe it would be better to encourage users to stick to declaring members and accessing them by name as raw memory WILL be different for the same code on different OSes. Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built. Basically it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building Python. It will still do GCC style allocation on Solaris. |
|||
msg144911 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-10-04 17:44 | |
On Tue, Oct 4, 2011 at 10:21 AM, Vlad Riscutia <report@bugs.python.org> wrote: > First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield > structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people > setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect. Please qualify "all" instead of generalizing. I can point to two issues (issue11990 "I'm generating python code from real c code.", issue12945 "We have raw data packages from some tools. These packages contains bitfields, arrays, simple data and so on.") where C code or raw data was, in fact, involved and the reporters just don't understand what layout algorithm is being used. They may not need to know the specifics of the algorithm, but they *do* need to know if it matches the compiler they are using to do interop or the one that generated the raw data. The reason that we are seeing folks cast raw memory into a cyptes bitfield structure is because they do not understand how the structure layout algorithm works and are trying to figure it out via these examples. > Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not > how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built. Basically > it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building > Python. It will still do GCC style allocation on Solaris. I understand how it works. This quote is taken somewhat out of context as the preceding sentence is important. Perhaps saying GCC- style and VC++-style would have been more clear. The reason that I mentioned the compiler used to build Python is that it is an easy reference point and more times than not the bitfield allocation and layout *do* match that of the compiler used to build the interpreter. Anyway, I am fine with dropping the "used to build the Python interpreter" and going with something similar to what you originally had. Also, in general, the compiler used to build the ctypes extension *does* matter. Look in 'cfield.c' where all of the native alignments are computed at compile time. These alignments affect the structure layout and are defined by the compiler building the ctypes extension. |
|||
msg144913 - (view) | Author: Meador Inge (meador.inge) * | Date: 2011-10-04 17:46 | |
> Look in 'cfield.c' where all of the native alignments Well, not *all* the native alignments, but many of them. |
|||
msg144919 - (view) | Author: Vlad Riscutia (vladris) | Date: 2011-10-04 18:39 | |
I agree compiler matters for alignment but if you look at PyCField_FromDesc, you will see the layout is pretty much #ifdef MS_WIN32 - #else. Sorry for generalizing, "all" indeed is not the right word. My point is that we should set expectation correctly - VC++-style on Windows, GCC-style everywhere else and encourage users to access structure members by name, not raw memory. Issues opened for bitfields *usually* are of the form I mentioned - setting raw memory to some bytes then seeing members are not what user expected, even if ctypes algorithm works correctly. As I said, I will revise the patch and maybe make it more clear that users should look up how bitfield allocation works for their compiler instead of trying to understand this via structure raw memory. |
|||
msg144932 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2011-10-04 23:51 | |
If I understand correctly, this doc patch would apply to 2.7 and 3.2 also. I have two style comments. I believe "It is important to note that bit field allocation and layout in memory is not defined as a standard, rather its implementation is compiler-specific." could be shortened to "Bit field allocation and memory layout is compiler-specific." To me, this leads nicely into the proposed sentence that follows. "it is recommended that no assumptions are made about the structure size and layout." I do not like 'it is recommended'. Let us state the fact. "any assumptions about the structure size and layout may be wrong." |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:21 | admin | set | github: 57089 |
2011-10-04 23:51:15 | terry.reedy | set | messages:
+ msg144932 versions: - Python 3.4 |
2011-10-04 18:39:26 | vladris | set | messages: + msg144919 |
2011-10-04 17:46:35 | meador.inge | set | messages: + msg144913 |
2011-10-04 17:44:29 | meador.inge | set | messages: + msg144911 |
2011-10-04 15:21:45 | vladris | set | messages: + msg144894 |
2011-10-04 04:49:47 | meador.inge | set | messages: + msg144855 |
2011-10-04 04:21:33 | meador.inge | set | messages: + msg144854 |
2011-09-30 17:19:10 | vladris | set | files:
+ bitfield_doc.diff keywords: + patch messages: + msg144696 |
2011-09-11 01:39:49 | meador.inge | set | messages:
+ msg143846 components: + ctypes |
2011-09-02 02:34:11 | meador.inge | create |