Do not assume signed integer overflow behavior #45962
Comments
The resolution to http://bugs.python.org/issue1608 looks like it'll add
We should fix all dependencies on integer overflow behavior, starting by |
My gcc 4.1 doesn't have the -Wstrict-overflow option.
gcc version 4.1.3 2007092 (prerelease) (Ubuntu 4.1.2-16ubuntu2) |
Should we use the -ansi (C90 aka C89 standard) option, too?
Python core _bsddb _codecs_iso2022 _ctypes |
Socket and SSL are using bluetooth.h, which defines some functions as
The Python core can be compiled with -ansi, but the extension modules |
Using -ansi is out of scope for this issue, and should not be mixed with
I think there is disagreement on whether Python should stop relying on |
Whatever you change regarding the compiler options for Python, please
Otherwise, you're likely going to break lots and lots of extensions. Thanks. |
I compiled Python using gcc 4.3.0 with -Wstrict-overflow, and that's
Objects/doubledigits.c: In function ‘_PyFloat_Digits’:
I am not sure yet how to interpret it, though. It says that the overflow
I will try to investigate this further when I have a bit more time |
Heh, if that's the only warning gcc -Wstrict-overflow gives then I've
Anyone know if the commercial analysis tools we've run the code base |
Alexandre, which Python version did you compile with -Wstrict-overflow?
I will contact Coverity to ask if they check for this kind of thing.
MvL: I don't want 2s complement throughout the language, I just want the
FWIW, I've heard that some commercial compilers (e.g. XLC) assume that |
Marc-Andre: what do you mean by breaking lots and lots of extensions? |
-Wstrict-overflow=3 with gcc 4.3 trunk here shows:
Modules/cPickle.c: In function 'Unpickler_noload':
Modules/cPickle.c:194: warning: assuming signed overflow does not occur
But also note that -fno-strict-aliasing is also just another workaround |
Sure - however, that is fixed in Python 3 (and unrelated to this issue) |
Hm. I don't get any warnings related to the overflow issue, neither with |
Make sure you use gcc 4.3 trunk and at least -O2 is enabled. I tested |
FWIW gcc hacker Ian Lance Taylor has a nice article about signed
Regards, |
Please be specific. I read it, and I don't think it's better to use |
Ian says -fno-strict-overflow still allows some optimizations, and his
Regards, |
I think the -Wstrict-overflow option may not be enough for the audit we
The overflow issue in expandtabs() still exists (in 2.5 as well as in
if (*p == '\n' || *p == '\r') {
i += j;
old_j = j = 0;
if (i < 0) {
PyErr_SetString(PyExc_OverflowError,
"new string is too long");
return NULL;
}
}
Here i and j are signed ints (Py_ssize_t) initially known to be >= 0; i
If we ever hope to clean up the code to the point where -fwrapv is no |
FWIW I reported this to GCC bugzilla as a missing diagnostic @ |
Problem was that -Wall at the end was resetting -Wstrict-overflow, so
Parser/acceler.c: In function 'fixstate':
Parser/node.c: In function 'PyNode_AddChild':
Parser/firstsets.c: In function 'calcfirstset':
Parser/pgen.c: In function 'compile_item':
Parser/tokenizer.c: In function 'new_string':
Objects/abstract.c: In function 'PyObject_CallMethodObjArgs':
Objects/intobject.c: In function 'PyInt_FromUnicode':
Objects/listobject.c: In function 'merge_at':
Objects/longobject.c: In function 'PyLong_FromUnicode':
Objects/stringobject.c: In function 'string_expandtabs':
Objects/unicodeobject.c: In function 'unicode_expandtabs':
Python/ast.c: In function 'ast_for_genexp':
Python/bltinmodule.c: In function 'builtin_map':
Parser/tokenizer.c:1163: warning: assuming signed overflow does not
Python/getargs.c:994: warning: assuming signed overflow does not occur
Python/import.c: In function 'PyImport_ExtendInittab':
Python/modsupport.c: In function 'va_build_value':
Python/sysmodule.c: In function 'sys_getframe':
Modules/gcmodule.c: In function 'collect':
./Modules/_sre.c: In function 'sre_match':
/packages/python-2.5/Modules/stropmodule.c: In function 'strop_replace':
/packages/python-2.5/Modules/_heapqmodule.c: In function 'heappop':
/packages/python-2.5/Modules/_hotshot.c: In function 'pack_line_times':
/packages/python-2.5/Modules/binascii.c: In function 'binascii_a2b_base64':
/packages/python-2.5/Modules/parsermodule.c: In function
/packages/python-2.5/Modules/cPickle.c: In function 'Unpickler_noload':
/packages/python-2.5/Modules/audioop.c: In function 'audioop_ratecv':
/packages/python-2.5/Modules/imageop.c: In function 'imageop_dither2grey2':
/packages/python-2.5/Modules/_csv.c: In function 'join_append_data':
/packages/python-2.5/Modules/expat/xmlparse.c: In function 'getAttributeId':
/packages/python-2.5/Modules/dlmodule.c: In function 'dl_call': |
Thanks! Good catch about -Wall. I think I am now able to reproduce these |
I still don't get additional error messages on the trunk. I've altered
gcc -pthread -c -fno-strict-aliasing -g -Wall -Wstrict-prototypes
Either all problems are already solved or I'm doing something wrong here.
$ LC_ALL=C gcc-4.2 -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2
--enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr
--enable-targets=all --enable-checking=release --build=i486-linux-gnu
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.1 (Ubuntu 4.2.1-5ubuntu4) |
-Wstrict-overflow=5 is not valid; AFAIK it's 1-3, 3 for most verbose. Also
About the -Wall thing, it seems to be a gcc bug, but for now workaround
Regards, |
Btw I think we need an unsigned version of Py_ssize_t to fix this |
I don't think we can make Py_ssize_t unsigned. On several occasions |
No I mean we need a new unsigned variant. Else we will have to cast it |
The proper thing to do here is to add -Werror=strict-overflow to the
CFLAGS (before -Wall -- we should fix the position of -Wall!);
This should be done in favor of -fwrapv, but only if strict-overflow is
Can someone come up with a patch? |
An unsigned variant of Py_ssize_t would just be size_t -- that's a much |
Xiang: I don’t think we need to make the tests do anything special. Just make sure they exercise the code that handles overflows. I have been running the test suite without any -j0 option, and I can look over the output and see the error messages. Or if we get to a stage where all the errors are eliminated, you could run with UBSAN_OPTIONS=halt_on_error=1. E.g. in this patch, I add two simple tests to cover parts of the code that weren’t covered before (and if I hadn’t fixed the overflows, the tests would trigger extra UBSAN errors).

ctypes_v2.patch is an update of array-size.patch. I improved the test case, and added a new fix for overflows like the following:

>>> class S(ctypes.Structure):
... _fields_ = (("field", ctypes.c_longlong, 64),)
...
>>> s = S()
>>> s.field = 3
Modules/_ctypes/cfield.c:900:9: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long long int' |
unicode.patch avoids an overflow in PyUnicode_Join():

>>> size = int(sys.maxsize**0.5) + 1
>>> "".join(("A" * size,) * size)
Objects/unicodeobject.c:9927:12: runtime error: signed integer overflow: 46341 + 2147441940 cannot be represented in type 'int'
OverflowError: join() result is too long for a Python string |
It turns out you just want to see the output. That is easy. Patch v3 adding the test. |
New changeset db93af6080e7 by Martin Panter in branch 'default': |
Martin, I upload a patch to fix another possible overflow in listextend. |
overflow_fix_in_listextend.patch: I doubt Python supports the kinds of platform where this overflow would be possible. It may require pointers smaller than 32 bits, or char objects larger than 8 bits. Perhaps we could just add a comment explaining we assume the overflow cannot happen.

It seems list objects will hold one pointer for each element, but the overflow involves the number of elements. Python defines PY_SSIZE_T_MAX as PY_SIZE_MAX // 2. For the overflow to happen we would need

m + n > PY_SSIZE_T_MAX

Assuming a “flat” address space that can allocate up to PY_SIZE_MAX bytes _in total_, the total number of elements cannot exceed

m + n == PY_SIZE_MAX // sizeof (PyObject *)

So in this scenario, the overflow cannot happen unless sizeof (PyObject *) == 1.

Considering things like the 16-bit segmented Intel “large” memory model (which I doubt Python is compatible with), each list could _independently_ be up to PY_SIZE_MAX bytes. Therefore the total number of elements may reach

m + n == PY_SIZE_MAX // sizeof (PyObject *) * 2

So even in this case, sizeof (PyObject *) == 4 (large model) is fine, but anything less (e.g. 16-bit char, or 1-byte segment + 2-byte offset) might overflow. |
Hmm, I don't tend to infer platform characteristics. IMHO it's a simple problem: summing up two lists' lengths may overflow in the logic. With your argument, does it mean it is somewhat meaningless for a list to have a Py_ssize_t length, since it can never reach it? Checks against PY_SSIZE_T_MAX already exist (for example, in ins1). |
The check in ins1() was originally added in revision b9002da46f69. I presume it references the Python-dev thread “can this overflow (list insertion)?” <20000812145155.A7629@ActiveState.com>, <https://marc.info/?l=python-dev&m=107666472818169>. At that time, ob_size was an int, so overflow checking was definitely needed. Later, revision 7fdc639bc5b4 changed ob_size to Py_ssize_t, and then revision 606818c33e50 updated the overflow check from INT_MAX to PY_SSIZE_T_MAX.

BTW I made a small mistake in my previous message. The worst case would be extending a list with itself. But I think the conclusion is still the same. |
So these checks are superfluous? Do we need to remove them? Hmm, I still doubt such checks should consider platform characteristics first. In theory a list can be PY_SSIZE_T_MAX elements long. Do we have to put the PY_SIZE_MAX // sizeof(PyObject *) limit on it? |
I don't believe Python would really ever work on a platform with non-8-bit bytes; I believe there are far too many assumptions *everywhere*. You can program in C on such a platform, yes, but I'm not so sure about Python. And on 8-bit-byte platforms, there is no large model with 16-bit pointers anywhere; there just are not enough bits to have multiple 64k byte-addressable segments addressed with 16-bit pointers. It might be that some obscure platform in the past had 128k of memory, with large pointers, 2 allocatable 64k segments, a >16-bit char pointer, and 16-bit object pointers pointing to even bytes, but I doubt you'd really be porting *Python 3* to such a platform; basically we're talking about something like a Commodore 128 here. |
Looking over r60793, the overflow check at Modules/cjkcodecs/multibytecodec.c:836 looks vulnerable to being optimized away, because it can only detect the overflow if the line above has already overflowed. Perhaps change PY_SSIZE_T_MAX to MAXDECPENDING. I wonder if any of the GCC optimization and warning modes can detect this case? Also, Python/ast.c:3988 checks using PY_SIZE_MAX, but then passes the value to PyBytes_FromStringAndSize(), which expects Py_ssize_t and in the best case would raise SystemError. |
I agree. For multibytecodec, how about switching the positions of the two checks? If npendings + ctx->pendingsize overflows, the result can be anything: larger than, smaller than, or equal to MAXDECPENDING. For ast, although a SystemError may be raised, the message does not make the reason obvious. |
Xiang: regarding your overflow_fix_in_listextend.patch, what do you think about adding a comment or debugging assertion instead, something like: /* It should not be possible to allocate a list large enough to cause an overflow on any relevant platform */ |
It's good, Martin. Just commit it. |
New changeset dd2c7d497878 by Martin Panter in branch '3.5': New changeset eb6eafafdb44 by Martin Panter in branch 'default': |
Martin, would you mind creating a PR for your ctypes_v2.patch? |
Sorry I haven’t made a PR for ctypes_v2.patch, but I don’t mind if someone else takes over. I understand the HAVE_LONG_LONG check may no longer be necessary for newer Python versions. |
Previously, there was an explicit check that raised PyErr_NoMemory() on overflow. Now you rely on PyMem_Malloc() to detect the overflow. I'm not sure that's a good idea. |
On Tue, Sep 11, 2018 at 8:26 PM, STINNER Victor <report@bugs.python.org> wrote:
+1. It will probably work as expected on Solaris and other OSes that Jeff |
Now overflow is not possible, or am I missing something? |
I asked if there is an issue. In fact, all Python memory allocators start by checking if the size is larger than PY_SSIZE_T_MAX. Example:

void *
PyMem_RawMalloc(size_t size)
{
/*
* Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes.
* Most python internals blindly use a signed Py_ssize_t to track
* things without checking for overflows or negatives.
* As size_t is unsigned, checking for size < 0 is not required.
*/
if (size > (size_t)PY_SSIZE_T_MAX)
return NULL;
return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size);
} |
Benjamin: what do you think of adding an explicit check after the "new_size <<= 1;" loop?

if (new_size > (size_t)PY_SSIZE_T_MAX) {
PyErr_NoMemory();
return -1;
}

Technically, PyMem_Malloc() already implements the check, so I'm not sure it's really needed :-) |
Thank you very much to the task force who worked on these issues, which can be seen as boring and useless but are very important nowadays, with C compilers more and more aggressive in optimizing everything (I'm looking at you, clang!). This bug has been open for 11 years, and dozens and dozens of undefined behaviours have been fixed in the meantime. This bug is a giant beast with many patches and many pull requests. I dislike such bugs; they are very hard to follow. I suggest opening new bugs for undefined behaviour in specific functions, rather than a very vague "let's open a single bug to track everything". It's now time to close the issue. |