Document that the null character '\0' terminates a struct format spec #79895
i.e.:
>>> from struct import calcsize
>>> calcsize('\144\u0064\000xf\U00000031000\60d\121\U00000051')
16

I'm sure some people think this is obvious, or even expect the null character to signal EOF, but it probably isn't obvious at all to those without experience in lower-level languages. Python actually seems to go out of its way to make sure the null character is treated as no more special than the letter "H", which is good.

At first glance I'd have dismissed this as just another trivial quirk of the language and not brought it up, but because the documentation doesn't mention it, I got stuck on something related for half an hour while unit testing some dynamically generated format specs. Without going into unnecessary detail: a typo in another, tangentially related part of the test allowed a rogue null byte to be generated. I'm bad at those "find the face in the crowd" puzzles, and this was hardly different: the null was literally camouflaged within a 300-character format spec containing a random mixture of escaped and non-escaped source characters in the forms \Uffffffff, \uffff, \777, \xff, \x00, plus Latin/ASCII.

If I'm not the only one who sees this as a slightly bigger deal than poor documentation, the fix is trivial: an extra call to PyBytes_GET_SIZE when a null is found. But just because I can't think of a use case for allowing the null character to precede other characters in the format string doesn't mean there isn't one, which is why only documentation is currently selected.
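To make the hidden null in the example above easier to spot, here is a minimal sketch that resolves the escapes and reports where the null sits. The helper name `find_nulls` is hypothetical and not part of `struct`; the 16-byte result assumes a platform where a C double is 8 bytes.

```python
import struct

# The format string from the report, with its escapes intact.
fmt = '\144\u0064\000xf\U00000031000\60d\121\U00000051'


def find_nulls(spec):
    """Return the index of every null character in spec (hypothetical helper)."""
    return [i for i, ch in enumerate(spec) if ch == '\0']


print(repr(fmt))        # 'dd\x00xf10000dQQ'
print(find_nulls(fmt))  # [2]

# Only the part before the first null was being parsed, which is why
# the reported size matched that of plain 'dd'.
prefix = fmt.split('\0', 1)[0]
print(prefix, struct.calcsize(prefix))  # dd 16 (with 8-byte doubles)
```

Printing `repr()` of a generated spec before handing it to `struct` is a cheap way to catch this kind of stray character in a test.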
I'm not sure whether having NULLs terminate a struct format string is a feature or a bug. Given that nearly every other string in Python treats NULLs as ordinary characters, I'm inclined to say this is a bug. Or at least an unnecessary restriction that ought to be lifted.
I think the null character is an illegal character in the format string, and struct functions should raise a struct.error for it.
I agree with Serhiy. Any other unrecognised character would raise an error. The null character should do the same.
I've created a patch to reject null characters in the format string.
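For code that has to behave the same on interpreters with and without this patch, a small guard can reject the null up front. This is only a sketch under that assumption: `checked_calcsize` is a hypothetical wrapper, and the error message is illustrative rather than the one CPython emits.

```python
import struct


def checked_calcsize(fmt):
    """Like struct.calcsize, but reject embedded nulls explicitly.

    Hypothetical wrapper: it raises struct.error itself, so the result
    does not depend on whether the interpreter includes the patch.
    """
    null = b'\0' if isinstance(fmt, bytes) else '\0'
    if null in fmt:
        raise struct.error('embedded null character in format string')
    return struct.calcsize(fmt)


try:
    checked_calcsize('dd\0xf')
except struct.error as exc:
    print('rejected:', exc)

print(checked_calcsize('dd'))  # size of two native doubles
```

The guard handles both `str` and `bytes` specs, since `struct` accepts either.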
Zackery, would you mind creating a backport to 3.8?
This seems resolved, can it be closed? |
Yes, this looks closeable. Thank you! |