Issue 6924: struct.unpack weird behavior with "bi" (byte then integer)


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/51173

classification

Title:	struct.unpack weird behavior with "bi" (byte then integer)
Type:	behavior	Stage:
Components:	Library (Lib)	Versions:	Python 2.6

process

Status:	closed	Resolution:	works for me
Dependencies:		Superseder:
Assigned To:	mark.dickinson	Nosy List:	Manux, mark.dickinson
Priority:	normal	Keywords:

Created on 2009-09-16 21:53 by Manux, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg92724 - (view)	Author: Emmanuel Bengio (Manux)	Date: 2009-09-16 21:53
Using the following command in Python 2.6.1:

>>> struct.unpack("BI","12345")
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    struct.unpack("BI","12345")
error: unpack requires a string argument of length 8

I get this error message. What confused me was that doing

>>> struct.unpack("IB","12345")
(875770417, 53)

worked just fine. I have found out that this only happens with the native byte order ("@"), which is the default. For example:

>>> struct.unpack("!BI","12345")
(49, 842216501)

works, and so do all the other variants: =, <, > (native byte order with standard sizes, little-endian, and big-endian). I haven't found anything about this in the documentation.

Also, the 3 extra bytes it asks for aren't even used:

>>> struct.unpack("I","abcd")
(1684234849,) # see the big number starting with 16
>>> ord("x")
120
>>> struct.unpack("BI","xabcd") # we get the error
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    struct.unpack("BI","xabcd")
error: unpack requires a string argument of length 8
>>> struct.unpack("BI","xabcdefg")
(120, 1734763876) # not the same here
>>> struct.unpack("BI","xabcabcd")
(120, 1684234849) # same here
>>> struct.unpack("BI","x___abcd")
(120, 1684234849) # same again
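
For readers on Python 3, the report above can be reproduced with a short sketch. It is an illustration, not part of the original report: it assumes bytes input (Python 3 does not accept str here), and the helper name try_unpack is hypothetical. struct.calcsize reports the size each format expects, which is where the "length 8" in the error comes from on platforms that align int to 4 bytes.

```python
import struct

data = b"12345"  # Python 3 needs bytes; the report used a Python 2 str

# Size each format expects. Native mode (the default, "@") may include
# alignment padding; standard mode ("=", "<", ">", "!") never does.
native_size = struct.calcsize("BI")
standard_size = struct.calcsize("=BI")


def try_unpack(fmt, payload):
    """Hypothetical helper: return the unpacked tuple, or the error text."""
    try:
        return struct.unpack(fmt, payload)
    except struct.error as exc:
        return str(exc)


print("native BI size:  ", native_size)
print("standard BI size:", standard_size)
print("BI  ->", try_unpack("BI", data))   # fails when native_size != 5
print("!BI ->", try_unpack("!BI", data))  # network order, no padding
print("<IB ->", try_unpack("<IB", data))  # little-endian, no padding
```

Comparing the two calcsize results is a quick way to tell whether a native format contains padding on a given machine.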
msg92725 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-09-16 22:05
I think this is expected behaviour: the key point is that structs can include padding bytes. From the documentation:

"By default, C numbers are represented in the machine’s native format and byte order, and properly aligned by skipping pad bytes if necessary (according to the rules used by the C compiler)."

'Native' struct formats include padding, while 'standard' formats don't. So a native struct with format 'BI' has one byte for the 'B', followed by 3 padding bytes, followed by four bytes for the 'I'. This exactly matches the way a C struct of the form {char c; int x;} would be organized in memory on that machine.
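
The correspondence with the C layout can be checked from Python itself. The following is a minimal sketch, not part of the original reply: the class name CharThenInt is hypothetical, unsigned field types are used to match the 'B' and 'I' codes, and the comparison with an explicit pad-byte format assumes a platform where int is 4 bytes wide.

```python
import ctypes
import struct


class CharThenInt(ctypes.Structure):
    # Hypothetical mirror of the C struct {char c; int x;}, laid out with
    # the platform's default alignment rules.
    _fields_ = [("c", ctypes.c_ubyte), ("x", ctypes.c_uint)]


c_size = ctypes.sizeof(CharThenInt)   # size the C compiler would use
x_offset = CharThenInt.x.offset       # where the int starts in the struct
native_size = struct.calcsize("BI")   # size the struct module expects

# Spell out the padding explicitly with 'x' pad bytes in standard mode.
padding = x_offset - 1
explicit_fmt = "=B%dxI" % padding

packed_native = struct.pack("BI", 120, 1684234849)
packed_explicit = struct.pack(explicit_fmt, 120, 1684234849)

print("sizeof(struct):", c_size, "offset of x:", x_offset)
print("calcsize('BI'):", native_size)
print("same bytes as", explicit_fmt, ":", packed_native == packed_explicit)
```

When the data being parsed has no padding (a file format or network protocol, for example), choosing one of the standard prefixes (=, <, >, !) avoids the issue entirely.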

History
Date	User	Action	Args
2022-04-11 14:56:53	admin	set	github: 51173
2009-09-16 22:05:53	mark.dickinson	set	status: open -> closed nosy: + mark.dickinson messages: + msg92725 assignee: mark.dickinson resolution: works for me
2009-09-16 21:53:18	Manux	create