classification
Title: struct.unpack weird behavior with "bi" (byte then integer)
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.6
process
Status: closed Resolution: works for me
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: Manux, mark.dickinson
Priority: normal Keywords:

Created on 2009-09-16 21:53 by Manux, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg92724 - Author: Emmanuel Bengio (Manux) Date: 2009-09-16 21:53
Using the following command in Python 2.6.1:

>>> struct.unpack("BI","12345")
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    struct.unpack("BI","12345")
error: unpack requires a string argument of length 8

I get this error message. What confused me was that reversing the format,
>>> struct.unpack("IB","12345")
(875770417, 53)
worked just fine.

I have found out that this only happens with the native byte
order and alignment ("@"), which is the default.
For example:
>>> struct.unpack("!BI","12345")
(49, 842216501)
That works, and so do the other variants, =, <, and > (native byte order
with standard size, little-endian, and big-endian).

I haven't found anything about that in the documentation.

Also, the 3 extra bytes it asks for aren't even used:
>>> struct.unpack("I","abcd")
(1684234849,) # see the big number starting with 16
>>> ord("x")
120
>>> struct.unpack("BI","xabcd") # we get the error
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    struct.unpack("BI","xabcd")
error: unpack requires a string argument of length 8
>>> struct.unpack("BI","xabcdefg")
(120, 1734763876) # not the same here
>>> struct.unpack("BI","xabcabcd")
(120, 1684234849) # same here
>>> struct.unpack("BI","x___abcd")
(120, 1684234849) # same again
msg92725 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-09-16 22:05
I think this is expected behaviour:  the key point is that structs can 
include padding bytes.  From the documentation:

"By default, C numbers are represented in the machine’s native format and 
byte order, and properly aligned by skipping pad bytes if necessary 
(according to the rules used by the C compiler)."

'Native' struct formats include padding, while 'standard' formats don't.
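
The difference can be checked with struct.calcsize. The following is a minimal sketch, assuming Python 3 (so the data is a bytes object rather than the Python 2.6 str used in this report); the native sizes it prints depend on the platform and C compiler, so no particular value is guaranteed for them.

```python
import struct

# Standard formats ("=", "<", ">", "!") use fixed sizes and no padding:
# 1 byte for "B" plus 4 bytes for "I".
standard_size = struct.calcsize("=BI")

# Native format ("@", the default) uses the platform's C sizes and alignment,
# so these two results are platform dependent.
native_size = struct.calcsize("@BI")
native_reversed = struct.calcsize("@IB")

print("=BI:", standard_size)
print("@BI:", native_size)
print("@IB:", native_reversed)

# Five bytes of data unpack without error under a standard format.
byte_value, int_value = struct.unpack("!BI", b"12345")
print(byte_value, int_value)
```

On a platform where int is 4 bytes and 4-byte aligned, "@BI" would need padding before the "I" while "@IB" would not, since struct adds no padding after the last item. That would be consistent with "IB" accepting 5 bytes in the report above.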

So a native struct with format 'BI' has one byte for the 'B', followed by 
3 padding bytes, followed by four bytes for the 'I'.  This exactly matches 
the way a C struct of the form {char c; int x;} would be organized in 
memory on that machine.
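
One way to see the correspondence with the C layout is to compare against ctypes, which also follows the platform's native alignment rules. This is only a sketch: the class name CharThenInt is hypothetical, and the printed numbers depend on the machine.

```python
import ctypes
import struct


class CharThenInt(ctypes.Structure):
    # Hypothetical name; mirrors the C struct {char c; int x;}
    # using unsigned types to match the "B" and "I" format codes.
    _fields_ = [("c", ctypes.c_ubyte), ("x", ctypes.c_uint)]


# Offset of the int field inside the struct, as laid out by ctypes.
int_offset = CharThenInt.x.offset

# Bytes skipped between the end of c and the start of x.
padding = int_offset - ctypes.sizeof(ctypes.c_ubyte)

print("offset of x:", int_offset)
print("padding bytes after c:", padding)
print("calcsize('@BI'):", struct.calcsize("@BI"))
```

If struct and ctypes agree on alignment, as they are expected to on mainstream platforms, the offset of x plus the size of an unsigned int equals struct.calcsize("@BI").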
History
Date                 User            Action  Args
2022-04-11 14:56:53  admin           set     github: 51173
2009-09-16 22:05:53  mark.dickinson  set     status: open -> closed
                                             nosy: + mark.dickinson
                                             messages: + msg92725
                                             assignee: mark.dickinson
                                             resolution: works for me
2009-09-16 21:53:18  Manux           create