New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integer overflow in classic string formatting #58905
Comments
Check for integer overflow for width and precision is buggy. Just a few examples (on platform with 32-bit int): >>> '%.21d' % 123
'000000000000000000123'
>>> '%.2147483648d' % 123
'123'
>>> '%.2147483650d' % 123
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: prec too big
>>> '%.21f' % (1./7)
'0.142857142857142849213'
>>> '%.2147483648f' % (1./7)
'0.142857'
>>> '%.2147483650f' % (1./7)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: prec too big |
Serhiy: FYI we use the versions field to indicate which versions the fix will be made in, not which versions the bug occurs in. Since only 2.7, 3.2, and 3.3 get bug fixes, I've changed the versions field to be just those three. (3.1 and 2.6 are still in the list because they get *security* fixes, but those are rare.) |
Indeed, Objects/unicodeobject.c (default branch) has this, at around line 13839: if ((prec*10) / 10 != prec) {
PyErr_SetString(PyExc_ValueError,
"prec too big");
goto onError;
} ... which since 'prec' has type int, will invoke undefined behaviour. There are probably many other cases like this one. Serhiy, what platform are you on? And are you applying any special compile-time flags? For gcc, we should be using -fwrapv, which in this case should make the above code work as intended. |
See get_integer in Objects/stringlib/unicode_format.h for a better way to do this sort of thing. |
Well, David, I understand. This ridiculous bug is unlikely security Here is a patch that fixes this bug. |
32-bit Linux (Ubuntu), gcc 4.6. But it has to happen on any platform 214748364*10/10 == 214748364 -- test passed See also how is this problem solved in _struct.c. |
Not necessarily: it's undefined behaviour, so the compiler can do as it wishes. Your patch should also address possible overflow of the addition. |
Here there is no overflow. The patch limits prec of a little stronger |
Ah yes, true. |
Any chance of some tests? :-) |
Even a test for struct tests only struct.calcsize on this specific case. |
Sorry, gcc 4.4. |
Still, I think it would be useful to have some tests that exercise the overflow branches. (If those tests had existed before, then this issue would probably already have been found and fixed, since clang could have detected the undefined behaviour resulting from signed overflow.) I'll add tests and apply this later. |
Well, look at test_crasher in Lib/test/test_struct.py. |
New changeset 064c2d0483f8 by Mark Dickinson in branch 'default': |
Mark, I deliberately have not used the exact formula for the overflow. Comparison with the constant is much cheaper than division or multiplication. Microbencmark: ./python -m timeit -s 'f="%.1234567890s"*100;x=("",)*100' 'f%x' Before changeset 064c2d0483f8: 10000 loops, best of 3: 27.1 usec per loop |
Sure, I realize that, but I prefer not to be sloppy in the overflow check, and to use the same formula that's already used in stringlib. I somehow doubt that this micro-optimization is going to have any noticeable effect in real code. |
Agree. I just found this bug, trying to optimize the code. |
Re-opening: this should probably also be fixed in 2.7 and 3.2. See bpo-16096 for discussion. |
Here's a patch for 2.7. |
And for 3.2 |
Only one comment. test_formatting_huge_precision should use not sys.maxsize, but _testcapi.INT_MAX. Other tests can use _testcapi.PY_SSIZE_T_MAX. I think this tests are worth to add for 3.3 and 3.4. Your old test for this bug (064c2d0483f8) actually does not test the bug on all plathforms. |
Thanks for reviewing. I was being lazy with the checks; I'll fix that. Agreed that it's worth forward porting the tests to 3.3 and 3.4; I'll do that. |
For your information, I fixed recently PyUnicode_FromFormatV() to detect overflows on width and precision: changeset: 79543:d1369daeb9ec |
Mark, can I help? |
New changeset 21fb1767e185 by Mark Dickinson in branch '2.7': |
New changeset 102df748572d by Mark Dickinson in branch '3.2': New changeset 79ea0c84152a by Mark Dickinson in branch '3.3': New changeset 22c8e6d71529 by Mark Dickinson in branch 'default': |
Fixed in 2.7 and 3.2; extra tests ported to 3.3 and default. Reclosing. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: