Classification
Title: Integer overflow in classic string formatting
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 3.2, Python 2.7
Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: eric.smith, haypo, mark.dickinson, python-dev, r.david.murray, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2012-04-30 16:55 by serhiy.storchaka, last changed 2012-10-28 10:27 by mark.dickinson. This issue is now closed.

Files
File name  Uploaded
pyunicode_format_integer_overflow.patch  serhiy.storchaka, 2012-04-30 17:43
formatting-overflow-2.7.patch  mark.dickinson, 2012-10-07 11:16
formatting-overflow-3.2.patch  mark.dickinson, 2012-10-07 11:57
Messages (28)
msg159707 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 16:55
The check for integer overflow in width and precision is buggy.

Just a few examples (on a platform with a 32-bit int):

>>> '%.21d' % 123
'000000000000000000123'
>>> '%.2147483648d' % 123
'123'
>>> '%.2147483650d' % 123
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: prec too big

>>> '%.21f' % (1./7)
'0.142857142857142849213'
>>> '%.2147483648f' % (1./7)
'0.142857'
>>> '%.2147483650f' % (1./7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: prec too big
msg159708 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-30 17:05
Serhiy: FYI we use the versions field to indicate which versions the fix will be made in, not which versions the bug occurs in.  Since only 2.7, 3.2, and 3.3 get bug fixes, I've changed the versions field to be just those three.  (3.1 and 2.6 are still in the list because they get *security* fixes, but those are rare.)
msg159709 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 17:16
Indeed, Objects/unicodeobject.c (default branch) has this, at around line 13839:

                        if ((prec*10) / 10 != prec) {
                            PyErr_SetString(PyExc_ValueError,
                                            "prec too big");
                            goto onError;
                        }

... which since 'prec' has type int, will invoke undefined behaviour.  There are probably many other cases like this one.

Serhiy, what platform are you on?  And are you applying any special compile-time flags?  For gcc, we should be using -fwrapv, which in this case should make the above code work as intended.
msg159710 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 17:29
See get_integer in Objects/stringlib/unicode_format.h for a better way to do this sort of thing.
msg159712 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 17:43
> Serhiy: FYI we use the versions field to indicate which versions the fix will be made in, not which versions the bug occurs in.  Since only 2.7, 3.2, and 3.3 get bug fixes, I've changed the versions field to be just those three.  (3.1 and 2.6 are still in the list because they get *security* fixes, but those are rare.)

Well, David, I understand. This ridiculous bug is unlikely to be a
security issue.

Here is a patch that fixes this bug.
msg159713 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 17:56
> Serhiy, what platform are you on?

32-bit Linux (Ubuntu), gcc 4.6. But it has to happen on any platform
with a 32-bit integer (for 64-bit use 9223372036854775808).

214748364*10/10 == 214748364 -- test passed
214748364*10 + ('8'-'0') == -2147483648 -- oops!

See also how this problem is solved in _struct.c.
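
The two lines of arithmetic above can be reproduced with a small Python model of the C parsing loop. This is only a sketch: the names wrap_int32, c_div and parse_prec_buggy are hypothetical, a 32-bit int is assumed, and two's-complement wraparound is assumed as well (roughly what gcc gives with -fwrapv). In standard C the overflow is undefined behaviour, so a real build may behave differently.

```python
INT_BITS = 32  # assumption: the C "int" is 32 bits wide


def wrap_int32(value):
    """Reduce value to a signed 32-bit integer (two's-complement wraparound)."""
    value &= (1 << INT_BITS) - 1
    if value >= 1 << (INT_BITS - 1):
        value -= 1 << INT_BITS
    return value


def c_div(a, b):
    """Integer division that truncates toward zero, as C does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def parse_prec_buggy(digits):
    """Model of the old check: the multiplication is tested, the addition is not."""
    prec = 0
    for ch in digits:
        if c_div(wrap_int32(prec * 10), 10) != prec:
            raise ValueError("prec too big")
        prec = wrap_int32(prec * 10 + (ord(ch) - ord("0")))
    return prec


print(parse_prec_buggy("21"))          # 21
print(parse_prec_buggy("2147483648"))  # -2147483648: the check passed, the addition wrapped
```

Under the same assumptions, parse_prec_buggy("2147483650") raises ValueError, because there the multiplication itself wraps and the division no longer round-trips.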
msg159714 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 18:04
> But it has to happen on any platform
> with a 32-bit integer

Not necessarily:  it's undefined behaviour, so the compiler can do as it wishes.

Your patch should also address possible overflow of the addition.
msg159715 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 18:12
> Your patch should also address possible overflow of the addition.

There is no overflow here. The patch limits prec slightly more strictly
(to 2147483639 instead of 2147483647 on a 32-bit platform).
msg159716 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 18:14
Ah yes, true.
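
The bound discussed in this exchange can be checked with a short Python sketch. Both functions are hypothetical models, not the code from either patch: parse_prec_conservative is one way a single comparison with a constant could yield the stated limit of 2147483639, and parse_prec_exact uses the usual exact formula, which accepts everything up to INT_MAX. A 32-bit int is assumed.

```python
INT_MAX = 2**31 - 1  # assumption: 32-bit C int


def parse_prec_conservative(digits):
    """Reject as soon as prec reaches INT_MAX // 10 (one constant comparison)."""
    prec = 0
    for ch in digits:
        if prec >= INT_MAX // 10:
            raise ValueError("prec too big")
        prec = prec * 10 + (ord(ch) - ord("0"))
    return prec


def parse_prec_exact(digits):
    """Reject only if prec * 10 + digit would really exceed INT_MAX."""
    prec = 0
    for ch in digits:
        digit = ord(ch) - ord("0")
        if prec > (INT_MAX - digit) // 10:
            raise ValueError("prec too big")
        prec = prec * 10 + digit
    return prec


print(parse_prec_conservative("2147483639"))  # 2147483639, the largest accepted value
print(parse_prec_exact("2147483647"))         # 2147483647, i.e. INT_MAX
```

In both models the test runs before the multiplication and addition, so no intermediate value ever exceeds INT_MAX. The conservative form simply gives up the top eight values in exchange for a cheaper test.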
msg159718 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 18:17
Any chance of some tests? :-)
msg159726 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 19:07
> Any chance of some tests? :-)

Even the test for struct exercises only struct.calcsize for this specific
case. String formatting has no such function, so on most platforms a test
would run out of memory.
msg159729 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 19:21
> 32-bit Linux (Ubuntu), gcc 4.6.

Sorry, gcc 4.4.
msg159731 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 19:33
Still, I think it would be useful to have some tests that exercise the overflow branches.  (If those tests had existed before, then this issue would probably already have been found and fixed, since clang could have detected the undefined behaviour resulting from signed overflow.)

I'll add tests and apply this later.
msg159732 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-04-30 19:56
> I'll add tests and apply this later.

Well, look at test_crasher in Lib/test/test_struct.py.
msg160130 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-07 10:21
New changeset 064c2d0483f8 by Mark Dickinson in branch 'default':
Issue #14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting.  Thanks Serhiy Storchaka for report and original patch.
http://hg.python.org/cpython/rev/064c2d0483f8
msg160141 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-07 12:14
Mark, I deliberately did not use the exact formula for the overflow check. Comparison with a constant is much cheaper than division or multiplication.

Microbenchmark:

./python -m timeit -s 'f="%.1234567890s"*100;x=("",)*100'  'f%x'

Before changeset 064c2d0483f8:  10000 loops, best of 3: 27.1 usec per loop
Changeset 064c2d0483f8:  10000 loops, best of 3: 25.7 usec per loop
Original patch:  100000 loops, best of 3: 18.2 usec per loop
msg160142 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-07 12:21
Sure, I realize that, but I prefer not to be sloppy in the overflow check, and to use the same formula that's already used in stringlib.  I somehow doubt that this micro-optimization is going to have any noticeable effect in real code.
msg160155 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-07 15:54
> I somehow doubt that this micro-optimization is going to have any noticeable effect in real code.

Agreed. I only found this bug while trying to optimize the code.
msg172284 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-07 09:42
Re-opening: this should probably also be fixed in 2.7 and 3.2.  See issue 16096 for discussion.
msg172293 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-07 11:16
Here's a patch for 2.7.
msg172294 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-07 11:44
And for 3.2.
msg172342 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-07 20:27
Only one comment. test_formatting_huge_precision should use _testcapi.INT_MAX rather than sys.maxsize. Other tests can use _testcapi.PY_SSIZE_T_MAX.

I think these tests are worth adding to 3.3 and 3.4. Your old test for this bug (064c2d0483f8) does not actually test the bug on all platforms.
msg172344 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-07 20:40
Thanks for reviewing.  I was being lazy with the checks; I'll fix that.

Agreed that it's worth forward porting the tests to 3.3 and 3.4;  I'll do that.
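
For illustration, tests along the lines discussed here might look like the sketch below. It is not the committed test code: the class name HugeFormatTests is hypothetical, and INT_MAX is derived through ctypes as a stand-in for the CPython-only _testcapi.INT_MAX. The sketch assumes an interpreter that already contains the fix; the tests would be expected to fail on one that does not.

```python
import ctypes
import sys
import unittest

# Assumption: compute the C INT_MAX via ctypes rather than _testcapi.
INT_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_int) - 1) - 1


class HugeFormatTests(unittest.TestCase):
    def test_formatting_huge_precision(self):
        # One past INT_MAX: the parser should reject this, not wrap around.
        format_string = "%.{}f".format(INT_MAX + 1)
        with self.assertRaises(ValueError):
            format_string % 2.34

    def test_formatting_huge_width(self):
        # One past PY_SSIZE_T_MAX, which Python exposes as sys.maxsize.
        format_string = "%{}f".format(sys.maxsize + 1)
        with self.assertRaises(ValueError):
            format_string % 2.34


if __name__ == "__main__":
    unittest.main()
```

Because each value is exactly one past the relevant limit, the error is raised while the format string is being parsed, so no large buffer is ever allocated.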
msg172346 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2012-10-07 21:21
For your information, I recently fixed PyUnicode_FromFormatV() to detect overflow in width and precision:

changeset:   79543:d1369daeb9ec
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Sat Oct 06 23:05:00 2012 +0200
files:       Objects/unicodeobject.c
description:
Issue #16147: PyUnicode_FromFormatV() now detects integer overflow when parsing
width and precision
msg173660 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-24 09:34
Mark, can I help?
msg174021 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-28 10:01
New changeset 21fb1767e185 by Mark Dickinson in branch '2.7':
Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
http://hg.python.org/cpython/rev/21fb1767e185
msg174023 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-28 10:23
New changeset 102df748572d by Mark Dickinson in branch '3.2':
Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
http://hg.python.org/cpython/rev/102df748572d

New changeset 79ea0c84152a by Mark Dickinson in branch '3.3':
Issue #14700: merge tests from 3.2.
http://hg.python.org/cpython/rev/79ea0c84152a

New changeset 22c8e6d71529 by Mark Dickinson in branch 'default':
Issue #14700: merge tests from 3.3.
http://hg.python.org/cpython/rev/22c8e6d71529
msg174024 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-28 10:27
Fixed in 2.7 and 3.2;  extra tests ported to 3.3 and default.  Reclosing.
History
Date  User  Action  Args
2012-10-28 10:27:58  mark.dickinson  set  status: open -> closed; resolution: fixed; messages: + msg174024
2012-10-28 10:23:28  python-dev  set  messages: + msg174023
2012-10-28 10:01:05  python-dev  set  messages: + msg174021
2012-10-24 09:34:51  serhiy.storchaka  set  messages: + msg173660
2012-10-07 21:21:04  haypo  set  nosy: + haypo; messages: + msg172346
2012-10-07 20:40:31  mark.dickinson  set  messages: + msg172344
2012-10-07 20:27:56  serhiy.storchaka  set  messages: + msg172342
2012-10-07 11:57:40  mark.dickinson  set  files: - formatting-overflow-3.2.patch
2012-10-07 11:57:29  mark.dickinson  set  files: + formatting-overflow-3.2.patch
2012-10-07 11:44:37  mark.dickinson  set  files: + formatting-overflow-3.2.patch; messages: + msg172294
2012-10-07 11:16:40  mark.dickinson  set  files: + formatting-overflow-2.7.patch; messages: + msg172293
2012-10-07 09:42:26  mark.dickinson  set  status: closed -> open; resolution: fixed -> (no value); versions: - Python 3.3
2012-10-07 09:42:01  mark.dickinson  set  messages: + msg172284
2012-05-07 15:54:35  serhiy.storchaka  set  messages: + msg160155
2012-05-07 12:21:08  mark.dickinson  set  messages: + msg160142
2012-05-07 12:14:15  serhiy.storchaka  set  messages: + msg160141
2012-05-07 10:22:00  mark.dickinson  set  status: open -> closed; resolution: fixed
2012-05-07 10:21:10  python-dev  set  nosy: + python-dev; messages: + msg160130
2012-04-30 19:56:16  serhiy.storchaka  set  messages: + msg159732
2012-04-30 19:33:27  mark.dickinson  set  assignee: mark.dickinson; messages: + msg159731
2012-04-30 19:21:40  serhiy.storchaka  set  messages: + msg159729
2012-04-30 19:07:09  serhiy.storchaka  set  messages: + msg159726
2012-04-30 18:17:07  mark.dickinson  set  messages: + msg159718
2012-04-30 18:14:31  mark.dickinson  set  messages: + msg159716
2012-04-30 18:12:59  serhiy.storchaka  set  messages: + msg159715
2012-04-30 18:04:12  mark.dickinson  set  messages: + msg159714
2012-04-30 17:56:13  serhiy.storchaka  set  messages: + msg159713
2012-04-30 17:43:59  serhiy.storchaka  set  files: + pyunicode_format_integer_overflow.patch; keywords: + patch; messages: + msg159712
2012-04-30 17:31:38  mark.dickinson  link  issue9530 dependencies
2012-04-30 17:29:45  mark.dickinson  set  messages: + msg159710
2012-04-30 17:16:22  mark.dickinson  set  messages: + msg159709
2012-04-30 17:05:34  r.david.murray  set  nosy: + r.david.murray, mark.dickinson, eric.smith; messages: + msg159708; versions: - Python 2.6, Python 3.1
2012-04-30 16:55:26  serhiy.storchaka  create