Title: struct.unpack returns null pascal strings - [first] bug report
Type: Stage: resolved
Components: Library (Lib) Versions: Python 2.7
Status: closed Resolution: not a bug
Assigned To: Nosy List: jonheiner, mark.dickinson, meador.inge, xiang.zhang
Created on 2015-03-30 21:27 by jonheiner, last changed 2022-04-11 14:58 by admin.

msg239644 - (view) Author: Jon Heiner (jonheiner) Date: 2015-03-30 21:27
I believe there is an issue with the _struct.c handling of Pascal style strings.

In the _struct.c:s_unpack_internal() function (reading 2.7.6 and 2.7.9 source from tgz ball), the size parameter 'n' is clamped to code->size-1.

As far as I can tell, 'n' is set to the correct deserialized value, but the code->size value is not set to 255. I could be incorrect, as I'm not running in a debugger.

I've attached a short repro case. Note the use of unpack_from(), as unpack() would otherwise throw an error. I may be using it incorrectly, but this feels correct.
msg292960 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-05-04 04:39
My previous two messages were not clear enough, so I deleted them. Sorry for the noise. :-(

When unpacking a Pascal string, you cannot simply specify a bare p format character with no count; otherwise struct calculates a size for the format that does not match the data. That's why unpack fails. When the count conflicts with the length byte, the smaller one is chosen. I think this is consistent with packing.
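
As an illustration of the "smaller one is chosen" rule, here is a minimal sketch. The byte strings are made up for this example and are not taken from the attached repro; the b"" literals behave the same way on Python 2.7 and Python 3.

```python
import struct

# Packing: "4p" reserves 4 bytes in total (1 length byte + up to 3 data bytes),
# so a longer input is truncated and the stored length byte is 3.
packed = struct.pack("4p", b"abcdef")

# Unpacking: the length byte claims 9 bytes, but "4p" can hold only 3 data
# bytes, so the smaller value (3) is used.
(clamped,) = struct.unpack("4p", b"\x09abc")

# Unpacking: here the length byte (2) is smaller than the field capacity (5),
# so only 2 bytes are returned and the rest of the field is ignored.
(short,) = struct.unpack("6p", b"\x02abcde")

print(repr(packed), repr(clamped), repr(short))
```

In both directions the count in the format string bounds the field, and the length byte can only shorten the result, never extend it.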
msg292968 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-05-04 07:40
Specifically, I believe what's happening here is that "8s4spp" is interpreted as "8s4s1p1p", so it decodes a single byte (which can only encode an empty string) for each of the "1p" cases.
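
A small sketch of that interpretation follows. The format string is the one quoted above; the payload bytes are hypothetical, chosen only so that each length byte is nonzero.

```python
import struct

# A bare "p" has an implicit count of 1, so both formats describe 14 bytes.
size_bare = struct.calcsize("8s4spp")
size_explicit = struct.calcsize("8s4s1p1p")

# Hypothetical payload: 8-byte field, 4-byte field, then two length bytes
# (5 and 3) with no room after them for any string data.
data = b"HEADER00" + b"TAG1" + b"\x05" + b"\x03"

# Each "1p" field consists solely of its length byte, so the decoded string
# is clamped to zero bytes regardless of what the length byte says.
fields = struct.unpack("8s4spp", data)

print(size_bare, size_explicit, fields)
```

To read a Pascal string that holds up to N data bytes, the format needs an explicit count of N + 1, for example "256p" for the full 255-byte range.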

I wonder whether the struct module should raise an exception if the length byte read from the encoded data exceeds the count given in the format.
