classification
Title: struct.pack adding extra '\x00' character in very specific case
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.0, Python 2.6, Python 2.5
process
Status: closed
Resolution: invalid
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: cmadrigal, mark.dickinson (2)
Priority:
Keywords:

Created on 2009-11-06 16:16 by cmadrigal, last changed 2009-11-06 16:25 by mark.dickinson.

Messages (2)
msg94980 - Author: Caleb Madrigal (cmadrigal) Date: 2009-11-06 16:16
struct.pack("17scBH", 'a'*17, 'c', 255, 65535)
produces 'aaaaaaaaaaaaaaaaac\xff\x00\xff\xff'.

Notice the extra '\x00' character between '\xff' and '\xff\xff'.

I have noticed that this happens when there is an odd-length string
(like '17s'), followed by a character ('c'), followed by a byte ('B'),
followed by an unsigned short ('H').

Other variations reproduce it also (such as using a 19-character string).

However, if any one of the following modifications is made to the
format, the problem goes away:
* Change the string to 16s
* Remove the byte ('B')
* Remove the character ('c')
* Remove the unsigned short ('H')

So obviously, this is a seriously deep, dark corner-case.
msg94981 - Author: Mark Dickinson (mark.dickinson) Date: 2009-11-06 16:25
I don't think this is a bug:  struct.pack deliberately adds padding bytes 
so that the byte sequence matches the way that a corresponding C struct 
would be stored in memory, on that platform.  This is described 
(admittedly rather briefly) in the documentation at:

http://docs.python.org/library/struct.html

"""By default, C numbers are represented in the machine’s native format 
and byte order, and properly aligned by skipping pad bytes if necessary 
(according to the rules used by the C compiler)."""
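
To make the alignment rule concrete, here is a rough sketch of how the offsets for "17scBH" work out in native mode. The helper native_layout is hypothetical (it is not part of the struct module), and it assumes that each numeric item is aligned to a multiple of its own native size, which holds for 'B' and 'H' on common platforms but is not guaranteed everywhere.

```python
import struct

def native_layout(codes):
    """Hypothetical helper: return ([(offset, code), ...], total_size).

    Assumes an item's alignment equals its native size, except for
    's' and 'c' items, which are byte-aligned.
    """
    offset = 0
    layout = []
    for code in codes:
        size = struct.calcsize(code)           # native size of this item
        align = 1 if code.endswith(("s", "c")) else size
        offset += -offset % align              # skip pad bytes if needed
        layout.append((offset, code))
        offset += size
    return layout, offset

layout, total = native_layout(["17s", "c", "B", "H"])
for offset, code in layout:
    print(offset, code)
print("total size:", total)
```

Under these assumptions the 'B' lands at offset 18, leaving the next free offset at 19. Since 'H' must start on an even offset, one pad byte is inserted and 'H' starts at 20. With a 16-byte string, or with 'c' or 'B' removed, the 'H' already falls on an even offset and no pad byte is needed.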

Depending on your application, you may want to use standard size and 
alignment instead, e.g., with:

struct.pack("<17scBH", ...)
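
As a quick illustration of the difference, the following sketch packs the same values with the native format and with the "<" prefix (little-endian, standard sizes, no alignment). It uses bytes arguments so that it runs on current Python 3; the variable names are only for illustration, and the native length depends on the platform.

```python
import struct

# Same values as in the report; bytes objects are needed on Python 3.
values = (b"a" * 17, b"c", 255, 65535)

native = struct.pack("17scBH", *values)     # native size and alignment
standard = struct.pack("<17scBH", *values)  # standard size, no pad bytes

print(len(native), repr(native))
print(len(standard), repr(standard))

# The packed data round-trips with the same format string.
print(struct.unpack("<17scBH", standard))
```

The standard form is always 21 bytes long; the native form is typically 22 bytes on platforms where an unsigned short is aligned to 2 bytes.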
History
Date                 User            Action  Args
2009-11-06 16:25:41  mark.dickinson  set     status: open -> closed; nosy: + mark.dickinson; messages: + msg94981; assignee: mark.dickinson; resolution: invalid
2009-11-06 16:16:35  cmadrigal       create