
classification
Title: struct - please make sizes explicit
Type: Stage: resolved
Components: Documentation Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: Alexander.Belopolsky, kiilerix, mark.dickinson
Priority: low Keywords: patch

Created on 2010-04-20 12:30 by kiilerix, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
struct.diff kiilerix, 2010-06-12 23:35 ideas for further improvements
Messages (10)
msg103699 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-04-20 12:30
The struct module is often used (at least by me) to implement protocols and binary formats. That makes the exact sizes (number of bits/bytes) of the different types very important.

Please add the sizes to, for example, the table at http://docs.python.org/library/struct . I know that some of the sizes vary with the platform, and in those cases it is fine to define them in terms of the C types, but for Python programmers writing cross-platform code such variable types don't matter and are "never" used. (I assume that it is possible to specify all possible types in a cross-platform way, but I'm not sure and the answer is not obvious from the documentation.)
msg103712 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2010-04-20 14:11
It is very easy to generate the size table programmatically:

>>> import struct
>>> for c in "xcbB?hHiIlLqQfdspP":
...     print(c, struct.calcsize(c))
... 
x 1
c 1
b 1
B 1
? 1
h 2
H 2
i 4
I 4
l 8
L 8
q 8
Q 8
f 4
d 8
s 1
p 1
P 8

However, all values above except trivial 1-byte entries are platform dependent and C types are already well documented.
msg103720 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-20 14:30
As Alexander says, *all* the sizes except those for bytes are platform-dependent:  there are platforms where sizeof(short) isn't 2, for example, or where sizeof(int) isn't 4.

It would be possible to add the 'standard' sizes to that table (i.e. the sizes that you get when using '<', '>', etc.);  would that be helpful?  If you're trying to write cross-platform code then you should probably be using standard size, alignment and byte order anyway.
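As an illustration of that advice, here is a minimal sketch of packing a record with the '<' prefix. The three-field layout (`HIq`) and the name `HEADER` are made up for this example and do not come from the issue; the point is only that the size and byte layout no longer depend on the platform.

```python
import struct

# Hypothetical record layout, chosen only for illustration:
# unsigned short, unsigned int, long long; little-endian, standard sizes.
HEADER = "<HIq"

# In standard mode there is no padding, so the size is 2 + 4 + 8 bytes.
header_size = struct.calcsize(HEADER)
packed = struct.pack(HEADER, 1, 2, 3)

print(header_size)
print(packed)
```

Dropping the '<' would switch back to native sizes and alignment, and the result could then differ between platforms.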
msg103721 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2010-04-20 14:44
On Tue, Apr 20, 2010 at 10:30 AM, Mark Dickinson <report@bugs.python.org> wrote:
..
> It would be possible to add the 'standard' sizes to that table (i.e. the sizes that you get when using '<', '>', etc.);  would that be helpful?

The documentation already includes standard sizes in text:

"Standard size and alignment are as follows: no alignment is required
for any type (so you have to use pad bytes); short is 2 bytes; int and
long are 4 bytes; long long (__int64 on Windows) is 8 bytes; float and
double are 32-bit and 64-bit IEEE floating point numbers,
respectively. _Bool is 1 byte."

It may be helpful to add "Standard size" column to the code table with
a footnote that it only applies when <, > or ! code is used and that
for native sizes one should consult struct.calcsize().
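A rough sketch of what such a table could contain, generated rather than hand-written. The helper name `size_table` is hypothetical; the native column is whatever struct.calcsize() reports on the machine running the code, and 'P' gets no standard size because it is only accepted in native mode.

```python
import struct

def size_table(codes="xcbB?hHiIlLqQfdspP"):
    """Return (code, native size, standard size or None) for each format code."""
    rows = []
    for code in codes:
        native = struct.calcsize("@" + code)
        try:
            standard = struct.calcsize("<" + code)
        except struct.error:
            standard = None  # code not available in standard mode (e.g. 'P')
        rows.append((code, native, standard))
    return rows

for code, native, standard in size_table():
    print(code, native, "-" if standard is None else standard)
```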
msg103806 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-04-21 08:05
The more I read the documentation and your comments, the more I can see that the implementation is OK and that the documentation is "complete" and can be read correctly. Please take this as constructive feedback for improving the documentation, to make it easier to understand and harder to misread.

Yes, adding a "Standard size" column would have been very helpful. (I had missed the section on "standard" sizes.)

"Standard" is a very general term, and it is slightly confusing that standard isn't the default. Could the term "platform independent" (or "fixed"?) be added as an explanation of "standard" - or perhaps used instead?

Programming skills and platform knowledge at C level should not be a requirement to understand and use struct, so perhaps the references to C should be less high-profile, and perhaps something like this could be added:
"All sizes except trivial 1-byte entries (whatever that means) are platform dependent - use calcsize to get the size on your platform."

Perhaps the sections explaining 's', 'p', 'ILqQ', 'P' and '?' could be changed to (foot)notes to the table to make it easier to see where they belong and whether they can be skipped.

Perhaps "@" in the byte order table could be replaced with "@ (default)"? (And perhaps drop "If the first character is not one of these, '@' is assumed.")

The byte order character must come first in the format string and is key to understanding the other format characters, so perhaps everything related to that should come first?
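A small sketch of the two points above: a format string without a prefix behaves like '@', and a byte order character anywhere but the first position is rejected. The variable names are just for this example.

```python
import struct

# No prefix is the same as an explicit '@' (native order, size and alignment).
default_size = struct.calcsize("hi")
explicit_size = struct.calcsize("@hi")

# A byte order character that is not first is treated as an invalid format code.
try:
    struct.calcsize("h<i")
except struct.error as exc:
    misplaced_error = exc
else:
    misplaced_error = None

print(default_size == explicit_size)
print(misplaced_error)
```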
msg103807 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-21 08:23
Thanks for the doc suggestions.

Actually, the current docs were revised recently;  this issue is a helpful reminder to me that those doc revisions need to be backported. :)  If you want to see the current docs, look at:

http://docs.python.org/dev/library/struct.html

I'm +0 on adding the standard sizes to the table of format codes.

I also agree it might make sense to swap the 'Format Character' section and the 'Byte Order, Size and Alignment' section.

That's all for now;  I'll look at this properly sometime soon.

The standard/native terminology is fairly ingrained;  I'm not sure whether it's really worth changing it, but we can look at the explanations and make sure that they're clear.

> Programming skills and platform knowledge at C level should not be a
> requirement to understand and use struct, so perhaps the references to
> C should be less high-profile,

Agreed, though I think the references to C should certainly be there, since they will help some users, and since part of the struct module's raison d'etre is to allow communication with data written/read by C programs.

The note about ILqQ returning Python longs might be better omitted;  the difference between int and long should be irrelevant to most users.
msg107682 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-12 18:53
I've added sizes to the table, reordered some of the sections, and made a couple of other tweaks (like renaming the 'Objects' section to 'Classes') in r81957 (trunk) and r81955-81956 (py3k).

I'll backport these changes to release26-maint and release31-maint;  leaving open for that.
msg107685 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-12 19:20
Merged to maintenance branches in r81959 (release26-maint) and r81960 (release31-maint).  Closing.
msg107717 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-06-12 23:35
Thanks for improving the documentation!

A couple of comments for possible further improvements:

I think it would be helpful to also see an early notice about how to achieve platform independence, versus the default of the local platform.

And perhaps the description of "standard" could be improved.

Perhaps something like the attached patch could be used. It is relative to release26-maint/Doc/library/struct.rst rev 81959.
msg107855 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-15 08:53
Thanks for the additional suggestions and patch.  I've implemented most of them in revisions r81992 through r81995.

I've left the note about 'native size and alignment':  native alignment *is* determined using sizeof, and I think this is important information.
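To make the effect of native alignment visible, here is a minimal sketch comparing '@' (native size and alignment) with '=' (native byte order, standard size, no alignment). The native result depends on the platform's C compiler, so no specific value is assumed for it.

```python
import struct

# Native mode: the int is aligned, so padding may follow the leading byte.
native_size = struct.calcsize("@bi")

# '=' mode: standard sizes and no alignment, so simply 1 + 4 bytes.
unaligned_size = struct.calcsize("=bi")

# Sum of the two native item sizes, without any padding between them.
unpadded_native = struct.calcsize("@b") + struct.calcsize("@i")

print(native_size, unaligned_size, unpadded_native)
```

Where native_size is larger than unpadded_native, the difference is the padding inserted to align the int.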

I've also re-added the information that the 'f' and 'd' formats use IEEE binary32 and binary64, as a note to the format characters table.
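For reference, a short sketch showing the binary32 and binary64 encodings of 1.0 as produced with the '>' prefix (big-endian, standard size). The variable names are only for this example.

```python
import struct

# IEEE 754 binary32: sign 0, biased exponent 127, zero fraction.
single = struct.pack(">f", 1.0)

# IEEE 754 binary64: sign 0, biased exponent 1023, zero fraction.
double = struct.pack(">d", 1.0)

print(single)   # b'?\x80\x00\x00'
print(double)   # b'?\xf0\x00\x00\x00\x00\x00\x00'
```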

And I've moved the information that the 'P' format is only available in native mode to the 'format characters' section.
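A quick sketch of that restriction: 'P' works without a prefix (or with '@'), but is rejected once a standard-mode prefix such as '<' is used. The variable names are only for illustration.

```python
import struct

# Native mode: size of a C pointer on this platform.
native_pointer_size = struct.calcsize("P")

# Standard mode: 'P' is not a valid format code, so struct.error is raised.
try:
    struct.calcsize("<P")
except struct.error:
    standard_supported = False
else:
    standard_supported = True

print(native_pointer_size, standard_supported)
```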

Additional suggestions for improvements welcome!
History
Date User Action Args
2022-04-11 14:57:00 admin set github: 52715
2010-06-17 17:54:12 mark.dickinson set status: open -> closed
2010-06-15 08:53:00 mark.dickinson set status: closed -> open; messages: + msg107855
2010-06-12 23:35:48 kiilerix set files: + struct.diff; keywords: + patch; messages: + msg107717
2010-06-12 19:20:56 mark.dickinson set status: open -> closed; versions: + Python 3.1, Python 2.7, Python 3.2; messages: + msg107685; resolution: fixed; stage: resolved
2010-06-12 18:53:20 mark.dickinson set messages: + msg107682
2010-05-29 20:58:15 mark.dickinson set priority: low
2010-05-29 20:56:52 eric.araujo set priority: normal -> (no value); nosy: mark.dickinson, kiilerix, Alexander.Belopolsky; components: + Documentation, - Library (Lib)
2010-04-21 08:23:02 mark.dickinson set messages: + msg103807
2010-04-21 08:05:09 kiilerix set messages: + msg103806
2010-04-20 14:44:29 Alexander.Belopolsky set messages: + msg103721
2010-04-20 14:30:46 mark.dickinson set assignee: mark.dickinson; messages: + msg103720; nosy: + mark.dickinson
2010-04-20 14:11:40 Alexander.Belopolsky set nosy: + Alexander.Belopolsky; messages: + msg103712
2010-04-20 12:30:48 kiilerix create