
classification
Title: struct.unpack returns null pascal strings - [first] bug report
Type: Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: jonheiner, mark.dickinson, meador.inge, xiang.zhang
Priority: normal Keywords:

Created on 2015-03-30 21:27 by jonheiner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name: unpack_pascal.py Uploaded: jonheiner, 2015-03-30 21:27 Description: repro case
Messages (3)
msg239644 - Author: Jon Heiner (jonheiner) Date: 2015-03-30 21:27
I believe there is an issue with the _struct.c handling of Pascal style strings.

In the _struct.c:s_unpack_internal() function (reading the 2.7.6 and 2.7.9 source tarballs), the size parameter 'n' is clamped to code->size-1.

As far as I can tell, 'n' is set to the correct deserialized value, but the code->size value is not set to 255. I could be incorrect, as I'm not running in a debugger.

I've attached a short repro case. Note the use of unpack_from(), as unpack() will otherwise throw an error. Additionally, I may be using it wrong, but this feels correct.
msg292960 - Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-05-04 04:39
My previous two messages were not clear enough, so I deleted them. Sorry for the noise. :-(

When unpacking a Pascal string, you cannot simply specify a bare p format character; otherwise struct calculates the wrong size for the format. That is why unpack fails. When the count conflicts with the length byte, the smaller one is chosen. I think this is consistent with packing.
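
A minimal sketch of the "smaller one is chosen" rule, for both packing and unpacking. It is written with bytes literals so that it should run on Python 3 as well as 2.7; the sample payloads are made up for illustration and are not taken from the attached repro case.

```python
import struct

# "4p" occupies 4 bytes in total: 1 length byte plus at most 3 data bytes.
# Packing a longer string truncates it and stores the truncated length.
packed = struct.pack("4p", b"hello")          # b'\x03hel'

# Unpacking: the length byte claims 9 bytes, but the count only allows 3,
# so the result is limited to 3 bytes.
clamped = struct.unpack("4p", b"\x09abc")[0]  # b'abc'

# Unpacking: the length byte (2) is smaller than what the count allows (5),
# so only 2 bytes are returned and the rest is ignored as padding.
short = struct.unpack("6p", b"\x02abcde")[0]  # b'ab'
```

In both directions the usable length is the minimum of the length byte and count - 1, which is the consistency with packing mentioned above.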
msg292968 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-05-04 07:40
Specifically, I believe what's happening here is that "8s4spp" is interpreted as "8s4s1p1p", so it decodes a single byte (which can only encode an empty string) for each of the "1p" cases.
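
The interpretation above can be checked with a short sketch. The format "8s4spp" is the one quoted in this message; the buffer contents and the explicit counts 6p and 4p are assumptions chosen for illustration, since the attached repro case is not reproduced here.

```python
import struct

# Hypothetical payload: two fixed-width fields followed by two Pascal strings.
data = b"ABCDEFGH" + b"WXYZ" + b"\x05hello" + b"\x03abc"

bare = "8s4spp"        # a bare "p" means "1p": room for the length byte only
explicit = "8s4s6p4p"  # counts include the length byte: 1 + 5 and 1 + 3

size_bare = struct.calcsize(bare)          # 14
size_explicit = struct.calcsize(explicit)  # 22

# unpack() insists that the buffer size matches the format size exactly,
# so the bare format fails on the 22-byte buffer.
try:
    struct.unpack(bare, data)
    unpack_failed = False
except struct.error:
    unpack_failed = True

# unpack_from() only needs the buffer to be large enough, so it succeeds
# but yields empty Pascal strings.
result_bare = struct.unpack_from(bare, data)
# (b'ABCDEFGH', b'WXYZ', b'', b'')

result_explicit = struct.unpack(explicit, data)
# (b'ABCDEFGH', b'WXYZ', b'hello', b'abc')
```

With explicit counts the format size matches the buffer, so plain unpack() works and unpack_from() is no longer needed as a workaround.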

I wonder whether the struct module should raise an exception if the length byte read from the encoded data exceeds the count given in the format.
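
For illustration only, the stricter behaviour could be approximated in user code. The helper below, unpack_pascal_strict, is hypothetical: it is not part of the struct module and only sketches the check being suggested, assuming a single Pascal string at a known offset.

```python
import struct

def unpack_pascal_strict(count, buf, offset=0):
    """Unpack one Pascal string of `count` total bytes (hypothetical helper).

    Raises ValueError if the length byte claims more data than the
    count - 1 bytes that the format leaves room for, instead of
    silently truncating the result.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Read the length byte on its own first.
    length = struct.unpack_from("B", buf, offset)[0]
    if length > count - 1:
        raise ValueError(
            "length byte %d exceeds the %d bytes allowed by count %d"
            % (length, count - 1, count)
        )
    return struct.unpack_from("%dp" % count, buf, offset)[0]

ok = unpack_pascal_strict(6, b"\x05hello")   # b'hello'

try:
    unpack_pascal_strict(1, b"\x05hello")    # what a bare "p" would do
    raised = False
except ValueError:
    raised = True
```

Under this check, the bare "p" case from this issue would be reported as an error rather than decoded as an empty string.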
History
Date User Action Args
2022-04-11 14:58:14 admin set github: 68004
2017-05-04 07:40:43 mark.dickinson set messages: + msg292968
2017-05-04 04:39:15 xiang.zhang set messages: + msg292960
2017-05-04 04:29:57 xiang.zhang set messages: - msg292959
2017-05-04 04:29:51 xiang.zhang set messages: - msg292958
2017-05-04 04:08:06 xiang.zhang set messages: + msg292959
2017-05-04 04:04:11 xiang.zhang set status: open -> closed; nosy: + xiang.zhang; messages: + msg292958; resolution: not a bug; stage: resolved
2015-03-30 21:27:03 jonheiner create