classification
Title: Do not assume signed integer overflow behavior
Type: security Stage: patch review
Components: Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: 13312 27473 29145 Superseder:
Assigned To: Nosy List: Jeffrey.Walton, alex, alexandre.vassalotti, deadshort, dmalcolm, donmez, fweimer, jcea, jwilk, loewis, mark.dickinson, martin.panter, matejcik, miss-islington, nnorwitz, pitrou, python-dev, serhiy.storchaka, sir-sigurd, vstinner, xiang.zhang, ztane
Priority: normal Keywords: patch

Created on 2007-12-14 00:43 by gregory.p.smith, last changed 2018-09-13 18:54 by vstinner.

Files
File name Uploaded Description Edit
config.patch christian.heimes, 2007-12-14 03:23
overflow-error.patch donmez, 2008-01-18 20:58
overflow-error2.patch donmez, 2008-01-18 21:11
overflow-error3.patch donmez, 2008-01-18 21:13 Fix whitespace change and comment.
overflow-error4.patch donmez, 2008-01-18 23:15 Fix -fwrapv check, thanks tiran
fix-overflows-try1.patch donmez, 2008-01-18 23:35
fix-overflows-try2.patch donmez, 2008-01-20 01:48 Better patch
fix-overflows-try3.patch donmez, 2008-01-20 03:29
fix-overflows-final.patch donmez, 2008-01-20 11:36
csv.patch donmez, 2008-01-28 03:02
issue1621_hashes_and_sets.patch mark.dickinson, 2011-09-24 15:57 review
trapv.patch martin.panter, 2016-07-15 02:53 Committed & superseded review
set-overflow.patch martin.panter, 2016-07-15 02:57
slice-step.patch martin.panter, 2016-07-19 03:44 Supersedes trapv.patch review
tuple_and_list.patch xiang.zhang, 2016-07-22 08:12 review
thread.patch martin.panter, 2016-07-23 04:35 review
array-size.patch martin.panter, 2016-07-23 05:17 Superseded review
tuple_and_list_v2.patch xiang.zhang, 2016-07-23 15:06 review
ctypes_v2.patch martin.panter, 2016-07-24 12:08 Supersedes array-size.patch review
unicode.patch martin.panter, 2016-07-24 12:12 Committed review
tuple_and_list_v3.patch xiang.zhang, 2016-07-24 13:33 Committed review
overflow_fix_in_listextend.patch xiang.zhang, 2016-07-31 12:19 Superseded review
Pull Requests
URL Status Linked Edit
PR 9059 merged sir-sigurd, 2018-09-04 10:46
PR 9198 open miss-islington, 2018-09-11 23:18
PR 9199 open miss-islington, 2018-09-11 23:18
PR 9261 merged benjamin.peterson, 2018-09-13 16:48
PR 9261 merged benjamin.peterson, 2018-09-13 16:48
Messages (125)
msg58602 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-12-14 00:43
The resolution to http://bugs.python.org/issue1608 looks like it'll add
a -fwrapv gcc flag when building python.  This works around the issue
nicely on one compiler (gcc) but doesn't fix our fundamentally broken code.

We should fix all dependencies on integer overflow behavior, starting by
making everything compile properly with gcc's -Wstrict-overflow and
-Werror flags.
msg58609 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-12-14 02:19
My gcc 4.1 doesn't have the -Wstrict-overflow option.

gcc-Version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
msg58610 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-12-14 02:45
Should we use -ansi (C90 aka C89 standard) option, too? Python core
compiles fine with -ansi but together with -Werror it breaks several
extensions:

_bsddb            _codecs_iso2022   _ctypes
_socket           _ssl              linuxaudiodev
msg58611 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-12-14 03:23
Socket and SSL are using bluetooth.h which defines some functionas as
inline. Inline isn't part of C89. Linuxaudiodev depends on the 'linux'
macro which is not defined in  C89.

The Python core can be compiled with -ansi but the extension modules
require -std=gnu89.
msg58617 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-12-14 07:08
Using ansi is out of scope of this issue, and should not be mixed with
it. -ansi is about disabling certain GCC extensions. This report is
about C code in Python which has undefined behavior.

I think there is disagreement on whether Python should stop relying on
this particular undefined behavior (namely, whether the sum of two large
positive numbers is negative). GvR (apparently) believes that the
compiler should guarantee that the twos-complement semantic is available
throughout the C language.
msg58620 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-12-14 09:24
Whatever you change regarding the compiler options for Python, please
make sure that this doesn't effect the default settings used by
distutils to compile external modules (it normally takes the options
straight from the Makefile used for compiling Python).

Otherwise, you're likely going to break lots and lots of extensions. Thanks.
msg58684 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2007-12-17 06:42
I compiled Python using gcc 4.3.0 with the -Wstrict-overflow, and that's
the only warning I got:  

Objects/doubledigits.c: In function ‘_PyFloat_Digits’:
Objects/doubledigits.c:313: error: assuming signed overflow does not
occur when assuming that (X + c) < X is always false

I am sure yet how to interpret it, though. It says that the overflow
check is in _PyFloat_Digits(), line 313 is in the function add_big(). It
probably means that add_big() gets inlined. I tried to set
-finline-limit=0, but strangely the overflow warning disappears...

I will try to investigate this further, when I will have a bit more time
in my hands.
msg58711 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2007-12-17 23:49
heh if thats the only warning gcc -Wstrict-overflow gives then I've
mistitled the bug.  Fixed.  It'll take some manual code review. 

Anyone know if the commercial analysis tools we've run the code base
through in the past can find these for us?
msg59611 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-09 17:29
Alexandre, which Python version did you compile with -Wstrict-overflow?
 It would behoove us to check 2.5.2 thoroughly before it goes out the door.

I will contact Coverity to ask if they check for this kind of thing. 
(They just upgraded us to "Rung 2", whatever that may mean. :-)

MvL: I don't want 2s complement throughout the language, I just want the
overflow checks to be reliable.  Since I'd forgotten about the
difference between unsigned and signed overflow, I have no idea how many
overflow checks have been submitted that are relying on signed overflow;
though apparently (if the -Wstrict-overflow results can be trusted)
we're okay.

FWIW, I've heard that some commercial compilers (e.g. XLC) assume that
even *unsigned* overflow is undefined, violating the C standard.  This
would suggest that buffer overflow checks should be coded without
relying on arithmetic overflow at all.  This is possible, just a bit hairy.
msg59612 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-09 17:30
Marc-Andre: what do you mean by breaking lots and lots of extensions? 
Extensions also contain buffer overflow checks (at least I hope they do
:-) and those should also be guaranteed safe by using -fwrapv or
-fno-strict-overflow (GCC 4.2 and higher) until we're sure there aren't any.
msg59616 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-09 18:12
-Wstrict-overflow=3 with gcc 4.3 trunk here shows :

Modules/cPickle.c: In function 'Unpickler_noload':
Modules/cPickle.c:4232: warning: assuming signed overflow does not occur
when assuming that (X - c) > X is always false

Modules/cPickle.c:194: warning: assuming signed overflow does not occur
when assuming that (X - c) > X is always false
Modules/cPickle.c: In function 'load':
Modules/cPickle.c:4232: warning: assuming signed overflow does not occur
when assuming that (X - c) > X is always false

But also note that -fno-strict-aliasing is also just another workaround
and its more serious than -fwrapv.
msg59619 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-09 18:59
> But also note that -fno-strict-aliasing is also just another workaround
> and its more serious than -fwrapv.

Sure - however, that is fixed in Python 3 (and unrelated to this issue)
msg59692 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008-01-11 03:06
Hm. I don't get any warning, related to the overflow issue, neither with
-Wstrict-overflow=3, nor -Wstrict-overflow=5. Are the cPickle warnings
already fixed?
msg59693 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-11 03:24
Make sure you use gcc 4.3 trunk and at least -O2 is enabled. I tested
revision 59895 from release25-maint branch.
msg59694 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-11 03:26
FWIW gcc hacker Ian Lance Taylor has a nice article about signed
overflow optimizations in gcc, see http://www.airs.com/blog/archives/120
. Reading that it might be better to use -fno-strict-overflow instead of
-fwrapv.

Regards,
ismail
msg59696 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-11 08:43
> FWIW gcc hacker Ian Lance Taylor has a nice article about signed
> overflow optimizations in gcc, see http://www.airs.com/blog/archives/120
> . Reading that it might be better to use -fno-strict-overflow instead of
> -fwrapv.

Please be specific. I read it, and I don't think it's better to use
-fno-strict-overflow.
msg59699 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-11 09:47
Ian says -fno-strict-overflow still allows some optimizations, and his
example code shows less assembly is produced with -fno-strict-overflow.
But of course your opinion matters on this one, not mine.

Regards,
ismail
msg60078 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 01:22
I think the -Wstrict-overflow option may not be enough for the audit we
need.

The overflow issue in expandtabs() still exists (in 2.5 as well as in
the trunk):

            if (*p == '\n' || *p == '\r') {
                i += j;
                old_j = j = 0;
                if (i < 0) {
                    PyErr_SetString(PyExc_OverflowError,
                                    "new string is too long");
                    return NULL;
                }
            }

Here i and j are signed ints (Py_ssize_t) initially know to be >= 0; i
can only become < 0 through overflow.  This is the place where Ismail
(cartman) found a crash because the test was optimized away by GCC 4.3
before we added -fwrap.

If we ever hope to clean up the code to the point where -fwrapv is no
longer needed, the audit should find this spot!  (Good thing we at least
had a unittest for the overflow check.  This should be standard practice
for all overflow checks, as long it doesn't require allocating a GB or
more of memory.)
msg60079 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 01:50
FWIW I reported this to GCC bugzilla as a missing diagnostic @
http://gcc.gnu.org/PR34843
msg60102 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 16:48
Problem was that -Wall at the end was resetting -Wstrict-overflow, so
here is the current results for signed overflow warnings (python 2.5
branch SVN), a lot of them :

Parser/acceler.c: In function 'fixstate':
Parser/acceler.c:90: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2

Parser/node.c: In function 'PyNode_AddChild':
Parser/node.c:90: warning: assuming signed overflow does not occur when
changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/node.c:90: warning: assuming signed overflow does not occur when
changing X +- C1 cmp C2 to X cmp C1 +- C2

Parser/firstsets.c: In function 'calcfirstset':
Parser/firstsets.c:71: warning: assuming signed overflow does not occur
when simplifying conditional to constant

Parser/pgen.c: In function 'compile_item':
Parser/pgen.c:268: warning: assuming signed overflow does not occur when
changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/pgen.c: In function '_Py_pgen':
Parser/pgen.c:454: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Parser/pgen.c:556: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Parser/pgen.c:604: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Parser/pgen.c:611: warning: assuming signed overflow does not occur when
simplifying conditional to constant

Parser/tokenizer.c: In function 'new_string':
Parser/tokenizer.c:175: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c: In function 'tok_get':
Parser/tokenizer.c:1163: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c: In function 'PyTokenizer_Get':
Parser/tokenizer.c:1443: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c:1443: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c: In function 'PyTokenizer_FromString':
Parser/tokenizer.c:607: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2

Objects/abstract.c: In function 'PyObject_CallMethodObjArgs':
Objects/abstract.c:2038: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/abstract.c:2038: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/abstract.c: In function 'PyObject_CallFunctionObjArgs':
Objects/abstract.c:2038: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/abstract.c:2038: warning: assuming signed overflow does not
occur when simplifying conditional to constant

Objects/intobject.c: In function 'PyInt_FromUnicode':
Objects/intobject.c:394: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

Objects/listobject.c: In function 'merge_at':
Objects/listobject.c:1595: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/listobject.c:1459: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/listobject.c:1459: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/listobject.c:1595: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

Objects/longobject.c: In function 'PyLong_FromUnicode':
Objects/longobject.c:1701: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/longobject.c: In function '_PyLong_AsScaledDouble':
Objects/longobject.c:703: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/longobject.c:703: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/longobject.c: In function 'long_sub':
Objects/longobject.c:1978: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/longobject.c: In function 'l_divmod':
Objects/longobject.c:1802: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/longobject.c:1802: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

Objects/stringobject.c: In function 'string_expandtabs':
Objects/stringobject.c:3331: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c:3339: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c: In function 'string_replace':
Objects/stringobject.c:2509: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c:2509: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c:2509: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c:2509: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c:2672: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringobject.c: In function 'string_count':
Objects/stringlib/count.h:24: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/stringlib/count.h:24: warning: assuming signed overflow does not
occur when simplifying conditional to constant

Objects/unicodeobject.c: In function 'unicode_expandtabs':
Objects/unicodeobject.c:5719: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/unicodeobject.c:5727: warning: assuming signed overflow does not
occur when simplifying conditional to constant
Objects/unicodeobject.c: In function 'PyUnicodeUCS4_Compare':
Objects/unicodeobject.c:5376: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/unicodeobject.c:5376: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Objects/unicodeobject.c: In function 'PyUnicodeUCS4_Join':
Objects/unicodeobject.c:4659: warning: assuming signed overflow does not
occur when simplifying conditional to constant

Python/ast.c: In function 'ast_for_genexp':
Python/ast.c:1195: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Python/ast.c:1160: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Python/ast.c: In function 'ast_for_atom':
Python/ast.c:1040: warning: assuming signed overflow does not occur when
simplifying conditional to constant
Python/ast.c:1005: warning: assuming signed overflow does not occur when
simplifying conditional to constant

Python/bltinmodule.c: In function 'builtin_map':
Python/bltinmodule.c:907: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Python/bltinmodule.c:847: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Python/bltinmodule.c:847: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

Parser/tokenizer.c:1163: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c: In function 'PyTokenizer_Get':
Parser/tokenizer.c:1443: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
Parser/tokenizer.c:1443: warning: assuming signed overflow does not
occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

Python/getargs.c:994: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2
Python/getargs.c:1040: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2
Python/getargs.c: In function 'seterror':
Python/getargs.c:357: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2

Python/import.c: In function 'PyImport_ExtendInittab':
Python/import.c:3129: warning: assuming signed overflow does not occur
when simplifying conditional to constant

Python/modsupport.c: In function 'va_build_value':
Python/modsupport.c:529: warning: assuming signed overflow does not
occur when simplifying conditional to constant

Python/sysmodule.c: In function 'sys_getframe':
Python/sysmodule.c:650: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2

Modules/gcmodule.c: In function 'collect':
Modules/gcmodule.c:767: warning: assuming signed overflow does not occur
when changing X +- C1 cmp C2 to X cmp C1 +- C2

./Modules/_sre.c: In function 'sre_match':
./Modules/_sre.c:1002: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c: In function 'sre_umatch':
./Modules/_sre.c:1002: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur
when simplifying conditional to constant

/packages/python-2.5/Modules/stropmodule.c: In function 'strop_replace':
/packages/python-2.5/Modules/stropmodule.c:1102: warning: assuming
signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1
+- C2

/packages/python-2.5/Modules/_heapqmodule.c: In function 'heappop':
/packages/python-2.5/Modules/_heapqmodule.c:146: warning: assuming
signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1
+- C2

/packages/python-2.5/Modules/_hotshot.c: In function 'pack_line_times':
/packages/python-2.5/Modules/_hotshot.c:693: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
/packages/python-2.5/Modules/_hotshot.c: In function 'pack_frame_times':
/packages/python-2.5/Modules/_hotshot.c:706: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

/packages/python-2.5/Modules/binascii.c: In function 'binascii_a2b_base64':
/packages/python-2.5/Modules/binascii.c:320: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
/packages/python-2.5/Modules/binascii.c: In function 'binascii_b2a_uu':
/packages/python-2.5/Modules/binascii.c:287: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

/packages/python-2.5/Modules/parsermodule.c: In function
'validate_subscript':
/packages/python-2.5/Modules/parsermodule.c:2811: warning: assuming
signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1
+- C2

/packages/python-2.5/Modules/cPickle.c: In function 'Unpickler_noload':
/packages/python-2.5/Modules/cPickle.c:193: warning: assuming signed
overflow does not occur when simplifying conditional to constant
/packages/python-2.5/Modules/cPickle.c:194: warning: assuming signed
overflow does not occur when reducing constant in comparison
/packages/python-2.5/Modules/cPickle.c:4232: warning: assuming signed
overflow does not occur when assuming that (X - c) > X is always false
/packages/python-2.5/Modules/cPickle.c:194: warning: assuming signed
overflow does not occur when assuming that (X - c) > X is always false
/packages/python-2.5/Modules/cPickle.c: In function 'load':
/packages/python-2.5/Modules/cPickle.c:4232: warning: assuming signed
overflow does not occur when assuming that (X - c) > X is always false

/packages/python-2.5/Modules/audioop.c: In function 'audioop_ratecv':
/packages/python-2.5/Modules/audioop.c:1150: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2

/packages/python-2.5/Modules/imageop.c: In function 'imageop_dither2grey2':
/packages/python-2.5/Modules/imageop.c:430: warning: assuming signed
overflow does not occur when simplifying conditional to constant

/packages/python-2.5/Modules/_csv.c: In function 'join_append_data':
/packages/python-2.5/Modules/_csv.c:969: warning: assuming signed
overflow does not occur when simplifying conditional to constant

/packages/python-2.5/Modules/expat/xmlparse.c: In function 'getAttributeId':
/packages/python-2.5/Modules/expat/xmlparse.c:5337: warning: assuming
signed overflow does not occur when simplifying conditional to constant

/packages/python-2.5/Modules/dlmodule.c: In function 'dl_call':
/packages/python-2.5/Modules/dlmodule.c:103: warning: assuming signed
overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
msg60103 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 17:53
Thanks! Good catch about -Wall. I think I am now able to reproduce these
results with gcc 4.2. These results, while much more disturbing
regarding the state of our code base, at least restore my faith in GCC. :-)
msg60105 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-18 18:16
I still don't get additional error messages on the trunk. I've altered
the configure.in file to include -Wstrict-overflow=5 after -Wall:

gcc -pthread -c -fno-strict-aliasing -g -Wall -Wstrict-prototypes
-Wstrict-overflow=5  -I. -IInclude -I./Include   -DPy_BUILD_CORE -o
Objects/listobject.o Objects/listobject.c

Either all problems are already solved or I'm doing something wrong here.

$ LC_ALL=C gcc-4.2 -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2
--enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr
--enable-targets=all --enable-checking=release --build=i486-linux-gnu
--host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.1 (Ubuntu 4.2.1-5ubuntu4)
msg60107 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 18:27
-Wstrict-overflow=5 is not valid afaik its 1-3, 3 for most verbose also
you need a recent gcc 4.3 snapshot for best results, check your
distribution for gcc-snapshot package.

About the -Wall thing it seems to be a gcc bug, but for now workaround
is easy :-)

Regards,
ismail
msg60109 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 18:50
Btw I think we need an unsigned version of Py_ssize_t to fix this
problem cleanly. I am not sure if you would agree with me though.
msg60110 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-18 18:56
I don't think we can make Py_ssize_t unsigned. On several occasions
Python uses -1 as error flag or default flag.
msg60111 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 18:59
No I mean we need a new unsigned variant. Else we will have to cast it
to unsigned for many overflow cases which is ugly.
msg60114 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 20:47
The proper thing to do here is to add

-Werror=strict-overflow

to the CFLAGS (*before* -Wall -- we should fix the position of -Wall!);
this will turn all those spots into errors, forcing us to fix them, and
alerting users who might be using a newer compiler than we tested with.

This should be done in favor of -fwrapv, but only if strict-overflow is
supported (which we can find out in the same way as we found out whether
-fwrapv is supported).  I think in practice this means GCC 4.2 or newer.

Can someone come up with a patch?
msg60115 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 20:49
An unsigned variant of Py_ssize_t would just be size_t -- that's a much
older type than ssize_t.  I don't think we need to invent a Py_ name for it.
msg60116 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 20:58
Replace -fwrapv with -Wstrict-overflow=3 -Werror=strict-overflow when
supported. Guido, does this do what you wanted?

Regards,
ismail
msg60118 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 21:05
Close, I'd like to keep the -fwrapv if -Wstrict-overflow isn't supported.

Also, would checking this in mean we can't build with GCC 4.3 until
those issues are all fixed?
msg60120 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 21:11
Yes it breaks compilation with gcc 4.3. Fixing these bugs are mostly
s/int/unsigned int. But some parts of code need Python wisdom :/ 

New patch attached adressing your comment.
msg60121 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-18 21:16
Would you mind also adding patches for the places you think you can fix,
and providing us with a list of places you need help with?

O'm hoping that Greg or Christian can help reviewing these and
committing them.  Thanks much for your help BTW!
msg60124 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-18 22:24
The -fwrapv doesn't look right. You aren't testing for -fwrapv at all ;)
msg60125 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-18 23:35
First stub at fixing overflows, regresses following tests :

test_doctests.py
test_locale.py
test_long.py
test_long_future.py
test_optparse.py
test_pickle.py
test_str.py (crash)
test_string.py (crash)
test_unicode.py (crash)
test_userstring.py (crash)
test_xpickle.py

Not great, but a start.
msg60126 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-18 23:40
> Btw I think we need an unsigned version of Py_ssize_t to fix this
> problem cleanly. I am not sure if you would agree with me though.

There is an unsigned version, it's called "size_t".
msg60146 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-19 11:01
Crashes ain't good ;)

I suggest that you chance only a small portion of files at a time, then
make && ./python Lib/test/regrtest.py. Start with the Parser, then move
over to AST and the rest of Python/.

You may have to remove all pyc and pyo files if you change code related
to the Parser, AST and byte code marshaling. I'm not sure if it's
required but it's worth a shot. Bytecode mismatches can lead to strange
errors.
msg60246 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-19 23:15
I created a git repo for my fixes over
http://repo.or.cz/w/pytest.git?a=shortlog;h=overflow-fix .

Now as tiran suggested I fix one file and make sure nothing regressed.
But! Feel free to beat me to it and fix this. I am all new to this and
progress might be and possibly be slow.
msg60254 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-20 01:48
With second patch now python builds without any overflow warnings, no
new regressions. Please test and/or review.

Only thing left is fixing Modules subdirectory.

Thanks.
msg60260 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-20 03:29
Possibly last one before final patch, only Modules/_sre.c left to fix, I
appreciate help on that. Please ignore tab problems, I think that can be
fixed later on.

Thanks.
msg61272 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-20 11:36
Final patch should be complete. Used a trick in _sre.c, instead of i < 0
, I used 
i + i < i to trick gcc.
msg61286 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-20 13:01
Ismail Donmez wrote:
> Ismail Donmez added the comment:
> 
> Final patch should be complete. Used a trick in _sre.c, instead of i < 0
> , I used 
> i + i < i to trick gcc.

I'm going to review your patch later.

Christian
msg61291 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-20 13:56
Ismail Donmez wrote:
> Final patch should be complete. Used a trick in _sre.c, instead of i < 0
> , I used 
> i + i < i to trick gcc.
> 
> Added file: http://bugs.python.org/file9242/fix-overflows-final.patch

Does the C89 standard allow this code?

int q = 1;
int p = (unsigned)q;

I've never seen an unsigned cast without a type. Does the code compile
with gcc -std=C89?

Christian
msg61319 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-20 17:47
> Does the C89 standard allow this code?
> 
> int q = 1;
> int p = (unsigned)q;
> I've never seen an unsigned cast without a type.

Yes, that's fine; it's a different spelling of "unsigned int".
In C99, 6.7.2p1 defines the following groups as equivalent:
- short, signed short, short int, or signed short int
- unsigned short, or unsigned short int
- int, signed, or signed int
- unsigned, or unsigned int
- long, signed long, long int, or signed long int
- unsigned long, or unsigned long int
- long long, signed long long, long long int, or
            signed long long int
- unsigned long long, or unsigned long long int

Specifiers may occur in any order, so you may also write
"int short unsigned".
msg61320 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-20 17:57
Hi Christian,

unsigned cast is actually suggested by GCC developers to force correct
wrapping for signed types. And thanks to Martin, it makes sense :-)
msg61757 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-01-28 02:17
I'm just starting to look at the patch.  The first one changes i to
unsigned in Modules/_csv.c.  Hopefully most of them are like this.  The
code is fine as it is.  There is no reliance on overflow AFAICT.  It's
just that the breaking condition from the loop is not in the for (...).
 I think this change is fine to avoid a warning.  Just pointing out that
in this one case, it's not a real problem.

Change to heapq doesn't seem needed.  I looked at the warning generated
from this and it's if (!n).  This seems to indicate the compiler thinks
that n could be negative.  That should not be possible.  It came from
PyList_GET_SIZE.  We had verified the object was already a list.  So
this value should be between 0 and PY_SSIZE_T_MAX.  We check for 0, so
it might be > 0.  After decrementing n, it should be between 0 and
PY_SSIZE_T_MAX-1.  Of course, the compiler can't know the value can't be
negative (or PY_SSIZE_T_MIN) which would cause an underflow.

Change to hotshot should avoid a cast, so it's slightly better with this
approach.  Although with the change to size_t, the cast in flush_data
can be removed (just after the fwrite).

I don't see the reason to need for the change in sre.c, but I'm pretty
sure there are other overflows.

audioop definitely looks needed.
cPickle looks necessary.
expat/xmlparse.c is interesting--not sure if it's really necessary.  In
practice this probably can't be reached.
gc can't really overflow given that NUM_GENERATIONS is 3 and not likely
to grow much more. :-)

I stopped looking at this point.  It looks like some of these are really
needed.  Others are not possible given other invariants (the compiler
can't know about).  I like the idea of silencing compiler warnings. 
However, I fear this may generate a different problem.  Namely
signed/unsigned mismatches.

What happens if you add this warning:  -Wsign-compare
I think we got rid of most of those before (probably not true as much
for Modules/*.c).  I think this introduces many more.  Would it be
possible to come up with a patch that doesn't introduce more warnings
w/-Wsign-compare?

One potential issue with this patch is that while the additions might
have guaranteed semantics, we might have other problems when doing:

   size_t value = PyXXX_Size();
   if (value < 0) ...

I'm hoping that if we can use both -Wstrict-overflow and
-Wsign-conversion, eliminate all warnings, resulting in better code. 
(You could also try building with g++.  The core used to work without
warnings.  The modules still had a ways to go.)

Also what is the current state?  What has been implemented and what else
needs to be done?  Perhaps we should make other bug report(s) to address
other tangents that were discussed in this thread.
msg61758 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 02:28
Thanks for the through review! I will add -Wsign-compare and fix new
warnings.
Btw current state is with the patch -fwrapv is not needed and no
regressions.
msg61759 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-01-28 02:41
I just added -Wstrict-overflow to the code in sysmodule.c
(sys_getframe) and tried with gcc 4.2.1.  It doesn't warn.  I wonder
if 4.3 is more picky or warns when it shouldn't?

Unless if I changed the code so it doesn't work:

typedef struct {int ref;} PyObject;
typedef struct { PyObject* f_back; } PyFrameObject;
int PyArg_ParseTuple(PyObject*, const char*, int*);

PyObject *
sys_getframe(PyFrameObject *f, PyObject *self, PyObject *args)
{
        int depth = -1;

        if (!PyArg_ParseTuple(args, "|i:_getframe", &depth))
                return 0;

        while (depth > 0 && f != 0) {
                f = (PyFrameObject*)f->f_back;
                --depth;
        }
        return (PyObject*)f;
}

Compiled with:
gcc-4.2.1-glibc-2.3.2/x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu-gcc
-Wstrict-overflow -c xx.c

produced no warnings.  This is not a stock gcc 4.2.1, so that could
also be an issue.  Did I run it correctly.  Is there anything else I
need to do?  If you run the code above with gcc 4.3, does it produce a
warning?
msg61760 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-28 02:41
Unfortunately I have no time to work on this myself, but in order to
make progress I recommend checking in things piecemeal so that the same
changes don't get reviewed over and over again.  I propose that each
submit reference this bug ("Partial fix for issue #1621" I'd suggest)
and that the submits are recorded here (e.g. "fixed <filename> in rXXX
(2.5.2), rYYY (2.6)").  Then hopefully only a few hard cases will remain.

With Neal, I don't see what the warning in _csv is about. What condition
is being turned into a constant? Is the compiler perhaps rearranging the
code so as to insert "if (field[0] == '\0') goto XXX;" in front of the
for-loop where XXX jumps into the middle of the condition in the
if-statement immediately following the for-loop, and skipping that
if-block when breaking of the loop later? That would be clever, and
correct, and I'm not sure if making i unsigned is going to help -- in
fact it might make the compiler decide it can't use that optimization...
msg61761 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 02:45
To Neal,

Can you try with -Wstrict-overflow=3 , but yes I am using gcc 4.3 trunk.

To Guido,

I'll check _csv.c issue closely. Shall I create the new bug reports or
will reviewers will do so and CC me maybe?
msg61762 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-01-28 02:51
On Jan 27, 2008 6:45 PM, Ismail Donmez <report@bugs.python.org> wrote:
>
> Can you try with -Wstrict-overflow=3 , but yes I am using gcc 4.3 trunk.

I just tried with =1, =2, =3, and no =.  All the same result:  no warning.

Ismail, thanks for going through all this effort.  It's very helpful.
msg61763 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-28 02:54
Don't create new bug reports!
msg61765 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 03:02
Neal,

I'll try to answer your questions one by one, first with _csv.c compiler
issues :

Modules/_csv.c:969: warning: assuming signed overflow does not occur
when simplifying conditional to constant

There is a check inside loop like this:

               if (c == '\0')
                       break;

Instead of this if we do the check in the for :

+       for (i = 0; i < strlen(field) ; i++) {

and remove the if check compiler no longer issues a warning also csv
test passes with this. Attached patch implements this optimization.

Guido, you don't have to shout, you know noone pays me per python
bugreport I create :)
msg61766 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 03:10
_sre.c case is the most interesting one , compiler says :

./Modules/_sre.c:1002: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c: In function 'sre_umatch':
./Modules/_sre.c:1002: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur
when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur
when simplifying conditional to constant

all lines refer to : RETURN_ON_ERROR(ret);

My investigation :

On line 1002 we got RETURN_ON_ERROR(ret); but ret is already 0 and not
set to anything this if will always be false.

Same for line 1069, ret is still zero there. Maybe I am missing
something here?
msg61767 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 03:16
For xmlparse.c compiler says :

Modules/expat/xmlparse.c:5337: warning: assuming signed overflow does
not occur when simplifying conditional to constant

Its impossible for j to overflow here due to name[i] check but I am not
sure what gcc is optimizing here.
msg61768 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-28 03:20
> for (i = 0; i < strlen(field) ; i++) {

This looks inefficient.  Why not

 for (i = 0; field[i] != '\0'; i++) {

???
msg61770 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 03:32
Hah strlen in a loop, a nice beginner mistake but its 5.30 AM here so
please excuse me, Guido your version of course is way better. But with
that version compiler issues 

Modules/_csv.c:969: warning: assuming signed overflow does not occur when 
simplifying conditional to constant

again, strlen() just fooled that optimization it seems. So there should
be another way to optimize this loop.
msg61774 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-28 08:14
> With Neal, I don't see what the warning in _csv is about. What condition
> is being turned into a constant? Is the compiler perhaps rearranging the
> code so as to insert "if (field[0] == '\0') goto XXX;" in front of the
> for-loop where XXX jumps into the middle of the condition in the
> if-statement immediately following the for-loop, and skipping that
> if-block when breaking of the loop later?

Indeed that's what happens. In the case of breaking the loop later,
the compiler can skip the if-block only if signed ints never overflow,
hence the warning.

Another way of silencing the warning is to test field[0]=='\0' in the
if-statement. This might also somewhat pessimize the code, but allows
the compiler to eliminate i altogether.
msg61784 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-28 17:54
I wonder if it would help making i a Py_ssize_t instead of an int?
msg61785 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 17:54
Neal,

You could btw check
http://repo.or.cz/w/pytest.git?a=shortlog;h=overflow-fix which have each
fix seperate so that reviewing is easy. Just ignore configure changes
thats for later.

Thanks,
ismail
msg61788 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 17:57
> Guido van Rossum added the comment:
>
> I wonder if it would help making i a Py_ssize_t instead of an int?

gcc still issues the same warning with that.
msg61791 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 18:05
gcc is optimizing the second if check , for specifically i == 0 seems to
redundant according to gcc.

        if (i == 0 && quote_empty) {
                if (dialect->quoting == QUOTE_NONE) {
                        PyErr_Format(error_obj,
                                     "single empty field record must be
quoted");
                        return -1;
                }
                else
                        *quoted = 1;
        }
msg61792 - (view) Author: Ismail Donmez (donmez) * Date: 2008-01-28 18:09
Moving the empty check before the loop will fix this and possibly
optimize empty string handling.
msg62616 - (view) Author: Ismail Donmez (donmez) * Date: 2008-02-21 11:03
Any news on this? Also gcc 4.3 & gcc 4.2.3 fixed the -Wall clobbering -
Wstrict-overflow problem, which is good news.
msg87689 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-13 15:53
A few comments:

(1) I agree that signed overflows should be avoided
where possible.

(2) I think some of the changes in the latest patch
(fix-overflows-final.patch) are unnecessary, and
decrease readability a bit. An example is the following
chunk for the x_divrem function in Objects/longobject.c.

@@ -1799,7 +1799,7 @@ x_divrem(PyLongObject *v1, PyLongObject *w1, PyLongObject **prem)
 	k = size_v - size_w;
 	a = _PyLong_New(k + 1);
 
-	for (j = size_v; a != NULL && k >= 0; --j, --k) {
+	for (j = size_v; a != NULL && k >= 0; j = (unsigned)j - 1 , k = (unsigned)k - 1) {
 		digit vj = (j >= size_v) ? 0 : v->ob_digit[j];
 		twodigits q;
 		stwodigits carry = 0;

In this case it's easy to see from the code that j and k
will always be nonnegative, so replacing --j with j =
(unsigned)j - 1 is unnecessary.  (This chunk no longer
applies for 2.7 and 3.1, btw, since x_divrem got
rewritten recently.)  Similar comments apply to the
change:

-			min_gallop -= min_gallop > 1;
+			if (min_gallop > 1) min_gallop = (unsigned)min_gallop - 1;

in Objects/listobject.c.  Here it's even clearer that
the cast is unnecessary.

I assume these changes were made to silence warnings from
-Wstrict-overflow, but I don't think that should be a goal:
I'd suggest only making changes where there's a genuine
possibility of overflow (even if it's a remote one), and
leaving the code unchanged if it's reasonably easy to
see that overflow is impossible.


(3) unicode_hash also needs fixing, as do the lookup
algorithms for set and dict.  Both use overflowing
arithmetic on signed types as a matter of course.
Probably a good few of the hash algorithms for the
various object types in Objects/ are suspect.
msg87690 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-05-13 16:01
"""I assume these changes were made to silence warnings from
-Wstrict-overflow, but I don't think that should be a goal:
I'd suggest only making changes where there's a genuine
possibility of overflow (even if it's a remote one), and
leaving the code unchanged if it's reasonably easy to
see that overflow is impossible."""

There is a lot of value in being able to compile with -Wstrict-overflow 
and know that every warning omitted is something to be looked at.  I 
think it is advantageous to make all code pass this.  Having any 
"expected" warnings during compilation tends to lead people to ignore 
all warnings.

That said, I agree those particular examples of unnecessary casts are 
ugly and should be written differently if they are actually done to 
prevent a warning.
msg87693 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-13 16:45
> There is a lot of value in being able to compile with -Wstrict-overflow 
> and know that every warning omitted is something to be looked at.

I agree in principle; this certainly applies to -Wall.  But -Wstrict-
overflow doesn't do a particularly good job of finding signed overflow
cases:  there are a good few false positives, and it doesn't pick up
the many cases of actual everyday signed overflow e.g., in unicode_hash, 
byteshash, set_lookkey, etc.), so it doesn't seem a particular good basis 
for code rewriting.
msg87694 - (view) Author: Ismail Donmez (donmez) * Date: 2009-05-13 16:48
You should be using gcc 4.4 to get the best warning behaviour.
msg87704 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-13 20:05
Thanks Ismail.  I'm currently using gcc-4.4 with the -ftrapv (not 
-fwrapv!) option to see how much breaks.  (Answer: quite a lot. :-[ )

I'm finding many overflow checks that look like:

	size = Py_SIZE(a) * n;
	if (n && size / n != Py_SIZE(a)) {
		PyErr_SetString(PyExc_OverflowError,
			"repeated bytes are too long");
		return NULL;
	}

where size and n have type Py_ssize_t.  That particular one comes
from bytesobject.c (in py3k), but this style of check occurs
frequently throughout the source.

Do people think that all these should be fixed?  

The fix itself s reasonably straightforward:  instead of multiplying
and then checking for an overflow that's already happened (and hence
has already invoked undefined behaviour according to the standards),
get an upper bound for n *first* by dividing PY_SSIZE_T_MAX
by Py_SIZE(a) and use that to do the overflow check *before*
the multiplication.  It shouldn't be less efficient:  either way
involves an integer division, a comparison, and a multiplication.

The hard part is finding all the places that need to be fixed.
msg87707 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-05-13 20:38
> I'm finding many overflow checks that look like:
> 
> 	size = Py_SIZE(a) * n;
> 	if (n && size / n != Py_SIZE(a)) {
> 		PyErr_SetString(PyExc_OverflowError,
> 			"repeated bytes are too long");
> 		return NULL;
> 	}
> 
> where size and n have type Py_ssize_t.  That particular one comes
> from bytesobject.c (in py3k), but this style of check occurs
> frequently throughout the source.
> 
> Do people think that all these should be fixed?  

If this really invokes undefined behavior already (i.e. a compiler
could set "size" to -1, and have the test fail - ie. not give
an exception, and still be conforming) - then absolutely yes.

> The fix itself s reasonably straightforward:  instead of multiplying
> and then checking for an overflow that's already happened (and hence
> has already invoked undefined behaviour according to the standards),
> get an upper bound for n *first* by dividing PY_SSIZE_T_MAX
> by Py_SIZE(a) and use that to do the overflow check *before*
> the multiplication.  It shouldn't be less efficient:  either way
> involves an integer division, a comparison, and a multiplication.

[and then perform the multiplication unsigned, to silence the
warning - right?]

I think there is a second solution: perform the multiplication
unsigned in the first place. For unsigned multiplication, IIUC,
overflow behavior is guaranteed in standard C (i.e. it's modulo
2**N, where N is the number of value bits for the unsigned value).

So the code would change to

nbytes = (size_t)Py_SIZE(a)*n;
if (n && (nbytes > Py_SSIZE_T_MAX || nbytes/n != Py_SIZE(a))...
size = (Py_ssize_t)nbytes;
msg87708 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-13 20:58
> [and then perform the multiplication unsigned, to silence the
> warning - right?]

That wasn't actually what I was thinking:  I was proposing to rewrite it 
as:

	if (Py_SIZE(a) > 0 && n > PY_SSIZE_T_MAX/Py_SIZE(a)) {
		PyErr_SetString(PyExc_OverflowError,
			"repeated bytes are too long");
		return NULL;
	}
	size = Py_SIZE(a) * n;

The multiplication should be safe from overflow, and I don't get
any warning at all either with this rewrite (using -O3 -Wall -Wextra -
Wsigned-overflow=5) or from the original code, so there's nothing to 
silence.

> I think there is a second solution: perform the multiplication
> unsigned in the first place.

That would work too.  I find the above code clearer, though.  It's not 
immediately obvious to me that the current overflow condition actually 
works, even assuming wraparound on overflow; I find myself having to 
think about the mathematics every time.

In general, it seems to me that the set of places reported by -Wsigned-
overflow is a poor match for the set of places that need to be fixed.  -
Wsigned-overflow only gives a warning when that particular version of 
gcc, with those particular flags, happens to make use of the no-overflow 
assumption for some particular optimization.  Certainly each of the 
places reported by -Wsigned-overflow should be investigated, but I don't 
believe it's worth 'fixing' correct code just to get rid of warnings 
from this particular warning option.
msg87712 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-05-13 21:32
> 	size = Py_SIZE(a) * n;
> 
> The multiplication should be safe from overflow, and I don't get
> any warning at all either with this rewrite (using -O3 -Wall -Wextra -
> Wsigned-overflow=5) or from the original code, so there's nothing to 
> silence.

This is puzzling, isn't it? It *could* overflow, could it not?

>> I think there is a second solution: perform the multiplication
>> unsigned in the first place.
> 
> That would work too.  I find the above code clearer, though.

I agree in this case. In general, I'm not convinced that it
is always possible to rewrite the code in that way conveniently.
msg87730 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-14 09:00
> This is puzzling, isn't it?

I don't see why.  There's nothing in -Wall -Wextra -Wsigned-overflow
that asks for warnings for code that might overflow. Indeed, I don't
see how any compiler could reasonably provide such warnings without
flagging (almost) every occurrence of arithmetic on signed integers
as suspect.[*] 

The -ftrapv option is useful for catching genuine signed-integer
overflows at runtime, but it can still only catch those cases
that actually get exercised (e.g., by the Python test suite).


[*] Even some operations on unsigned integers would have to be
flagged: the C expression "(unsigned short)x * (unsigned short)y"
also has the potential to invoke undefined behaviour, thanks to
C's integer promotion rules.
msg87731 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-05-14 09:50
Aargh!

s/Wsigned-overflow/Wstrict-overflow/g

Sorry.
msg87908 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-05-16 18:40
> I don't see why.  There's nothing in -Wall -Wextra -Wsigned-overflow
> that asks for warnings for code that might overflow.

Ah, right. I misunderstood (rather, didn't bother checking) what
-Wsigned-overflow really does.
msg141871 - (view) Author: deadshort (deadshort) Date: 2011-08-10 15:47
Since this is still dribbling along I'll point out intobject.c:int_pow() and:

        prev = ix;              /* Save value for overflow check */
        if (iw & 1) {
            ix = ix*temp;
            if (temp == 0)
                break; /* Avoid ix / 0 */
            if (ix / temp != prev) {
                return PyLong_Type.tp_as_number->nb_power(
                    (PyObject *)v,
                    (PyObject *)w,
                    (PyObject *)z);
            }
        }

which I misclassified in http://bugs.python.org/issue12701
msg144490 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-24 08:24
Resetting versions.
msg144492 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-24 08:27
Clang has an -ftrapv option that seems to be less buggy and more complete than gcc's.  Compiling and running the test suite with that option enabled looks like a good way to catch a lot of these signed overflows.
msg144499 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-09-24 14:48
Do we consider these overflows as bugs in Python, or do we declare that we expect compilers to "do the right thing" for the bug fix releases (i.e. care only about the default branch). I'd personally vote for the latter - i.e. don't apply any of the resulting changes to the maintenance releases, and target the issue only for 3.3.

Realistically, a compiler that invokes truly undefined behavior for signed overflow has no chance of getting 3.2 compiled correctly, and we have no chance of finding all these issues within the lifetime of 3.2.

If that is agreed, I would start committing changes that fix the issues Mark already discussed in 2009 (unless he wants to commit them himself).
msg144501 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-24 15:57
> don't apply any of the resulting changes to the maintenance releases,
> and target the issue only for 3.3.

That sounds fine to me, though if we find more instances of signed overflow that actually trigger test failures in the maintenance branches (like the int_pow one) on mainstream compilers, we might want to fix those there too, on a case-by-case basis.

To get started, here's a patch that fixes occurrences of signed overflow in the bytes, str and tuple hash methods, and also in set lookups.  It also fixes a related and minor casting inconsistency in dictobject.c  (which was using (size_t)hash & mask in some places, and just 'hash & mask' in others).  These are the minimal changes required to get Python to build completely using Clang with '-ftrapv' turned on and --with-pydebug enabled.
msg144503 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-09-24 17:19
New changeset 698fa089ce70 by Mark Dickinson in branch 'default':
Issue #1621: Fix undefined behaviour in bytes.__hash__, str.__hash__, tuple.__hash__, frozenset.__hash__ and set indexing operations.
http://hg.python.org/cpython/rev/698fa089ce70
msg144505 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-09-24 18:12
New changeset 5e456e1a9e8c by Mark Dickinson in branch 'default':
Issue #1621: Fix undefined behaviour from signed overflow in get_integer (stringlib/formatter.h)
http://hg.python.org/cpython/rev/5e456e1a9e8c
msg144524 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-09-25 14:35
New changeset 3fb9464f9b02 by Mark Dickinson in branch 'default':
Issue #1621: Fix undefined behaviour from signed overflow in datetime module hashes, array and list iterations, and get_integer (stringlib/string_format.h)
http://hg.python.org/cpython/rev/3fb9464f9b02
msg147958 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-11-19 17:10
See also issue #9530.
msg213729 - (view) Author: Jeffrey Walton (Jeffrey.Walton) * Date: 2014-03-16 15:07
Also see http://bugs.python.org/issue20944 for suggestions to identify the offending code.
msg270084 - (view) Author: Antti Haapala (ztane) * Date: 2016-07-10 13:05
One common case where signed integer overflow has been assumed has been the wraparound/overflow checks like in http://bugs.python.org/issue27473 

I propose that such commonly erroneous tasks such as overflow checks be implemented as common macros in CPython as getting them right is not quite easy (http://c-faq.com/misc/sd26.html); it would also make the C code more self-documenting.

Thus instead of writing

     if (va.len > PY_SSIZE_T_MAX - vb.len) {
  
one would write something like
    
    if (PY_SSIZE_T_SUM_OVERFLOWS(va.len, vb.len)) {

and the mere fact that such a macro *wasn't* used there would signal about possible problems with the comparison.
msg270460 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-15 02:53
Inspired by Issue 27473, I did a quick and dirty scan for obvious places that expect overflow to wrap, and found the following, which I think should be fixed:

Modules/_ctypes/_ctypes.c:1388, in PyCArrayType_new()
Objects/listobject.c:492, in list_concat()
Objects/tupleobject.c:457, in tupleconcat()
Objects/listobject.c:845, in listextend()

Also I played with enabling GCC’s -ftrapv option. Attached is a patch with three changes:

1. configure --with-pydebug enables -ftrapv (experimental, not sure everyone would want this)

2. Easy fix for negation overflow in audioop (I am happy to apply this part)

3. Avoid dumb overflows at end of for loop in Element Tree code when handling slices with step=sys.maxsize. Technically the overflow is undefined behaviour, but the change is annoying, because ignoring the overflow at the end of the loop is much simpler than adding special cases. Not sure what I think about this part.
msg270461 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-15 02:57
I added Issue 13312 as a dependency, there is currently a test for a negative year that relies on overflow handling.

Here is a patch where I tried to fix overflow detection for huge set objects.
msg270463 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-15 03:48
The audioop part LGTM. If this case was found with the help of -ftrapv, I'm for adding this option in a debug build.
msg270562 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-16 15:16
I tried the newer -fsanitize=undefined mode, and it is better than -ftrapv. It adds instrumentation that by default nicely reports the errors and continues running.

My problem with the large slice step is not restricted to Element Tree; it affects list objects too:

>>> "abcdef"[3::sys.maxsize]
Objects/unicodeobject.c:13794:55: runtime error: signed integer overflow: 3 + 9223372036854775807 cannot be represented in type 'long int'
'd'

Regarding Antti’s overflow macros, I noticed there is already a macro _PyTime_check_mul_overflow() in Python/pytime.c which does that kind of thing. Maybe it could help, though I am not sure. Has this sort of thing been done in other projects? We might need to be careful about the sign, e.g. clarify the macro is only for positive values, add an assertion, or handle both positive and negative.
msg270580 - (view) Author: Jeffrey Walton (Jeffrey.Walton) * Date: 2016-07-16 19:03
>  Has this sort of thing been done in other projects?

Yes.

If you are using C, you can use safe_iop. Android uses it for safer
integer operations. If you are using C++, you can use David LeBlanc's
SafeInt class. Microsoft uses it for safer inter operations.

Jeff
msg270582 - (view) Author: Antti Haapala (ztane) * Date: 2016-07-16 19:14
Gnulib portability library has 
https://www.gnu.org/software/gnulib/manual/html_node/Integer-Range-Overflow.html and https://www.gnu.org/softwarhe/gnulib/manual/html_node/Integer-Type-Overflow.html and even macros for producing well-defined integer wraparound for signed integers: https://www.gnu.org/software/gnulib/manual/html_node/Wraparound-Arithmetic.html

That code is under GPL but I believe there is no problem if someone just looks into that for ideas on how to write similar macros.
msg270807 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-07-19 03:09
New changeset d6a86018ab33 by Martin Panter in branch 'default':
Issue #1621: Avoid signed int negation overflow in audioop
https://hg.python.org/cpython/rev/d6a86018ab33
msg270809 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-19 03:44
I committed the fix for negation in audioop.

slice-step.patch includes a better fix for the remaining part of trapv.patch, with Element Tree slicing. I think this fix is much less intrusive, and I have copied it to other places that handle slicing, and added corresponding test cases.

The undefined behaviour sanitizer produces lots of errors about bit shifting signed integers in low-level modules like ctypes, struct, audioop. Typically this is for code converting signed integers to and from bytes, and big/little-endian conversions. This is technically undefined behaviour, but I think it may be less serious than the other overflows with traditional arithmetic like addition and multiplication. E.g. GCC explicitly documents <https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html> that this is handled as expected with twos-complement, so with GCC there should be no nasty surprises with optimizing out undefined behaviour. My set-overflow.patch would also be in this boat.
msg270823 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-19 12:57
Does this work with negative steps?
msg270827 - (view) Author: Antti Haapala (ztane) * Date: 2016-07-19 14:06
About shifts, according to C standard, right shifts >> of negative numbers are implementation-defined:

   "[in E1 >> E2], If E1 has a signed type and a negative value, the
   resulting value is implementation-defined."

In K&R this meant that the number can be either zero-extended or sign-extended. In any case it cannot lead to undefined behaviour, but the implementation must document what is happening there. Now, GCC says that >> is always arithmetic/sign-extended. This is the implementation-defined behaviour, now GCC has defined what its implementation will do, but some implementation can choose zero-extension. (unlikely)

As for the other part as it says "GCC does not use the latitude given in C99 only to treat certain aspects of signed ‘<<’ as undefined". But GCC6 will diagnose that now with -Wextra, and thus it changed already, as with -Werror -Wextra the code doesn't necessarily compile any longer, which is fine. Note that this "certain -- only" refers to that section where the C99 and C11 explicitly say that the behaviour is undefined and C89 doesn't say anything. It could as well be argued that in C89 it is undefined by omission.

Additionally all shifts that shift by more than or equal to the width *still* have undefined behaviour (as do shifts by negative amount). IIRC they work differently on ARM vs x86: in x86 the shift can be mod 32 on 386, and in ARM, mod 256.
msg270869 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-20 12:48
Serhiy: slice-step.patch seems to be fine with negative slice steps. The actual indexes are still positive, and “unsigned” arithmetic is really modular arithmetic, so when you add the negative “unsigned” step value, it decrements the index properly.

Antti: if you use the sanitizer, (almost?) all the shift errors are for left shifts, either of a positive signed overflow, or a negative value. There is a bit more discussion of bit shift errors in Issue 20932. Examples:

Modules/audioop.c:1527:43: runtime error: left shift of negative value -24
Objects/unicodeobject.c:5152:29: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'

I didn’t see any sanitizer reports about right shifts; perhaps it doesn’t report those (being implemenation-defined, rather than undefined, behaviour). And the only report about an excessive shift size is due to a known bug in ctypes, Issue 15119.
msg270977 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-22 08:12
@Martin, attach a patch to fix the overflow check you mentioned in tuple and list objects.
msg271057 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-23 04:35
Apart from the empty “if” statement style (see review), tuple_and_list.patch looks good to me.

I understand the patches from 2011 and earlier have all been committed (correct me if I missed something).

Here is another patch fixing a 64-bit overflow in _thread, detected by the test_timeout() method in test_threading:
Modules/_threadmodule.c:59:17: runtime error: signed integer overflow: 6236628528058 + 9223372036000000000 cannot be represented in type 'long int'
msg271058 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-23 05:09
Perhaps we should add a test for the __length_hint__() overflow to tuple_and_list.patch:

>>> a = [1,2,3,4]
>>> import sys
>>> class B:
...  def __iter__(s): return s
...  def __next__(s): raise StopIteration()
...  def __length_hint__(s): return sys.maxsize
... 
>>> a.extend(B())
Objects/listobject.c:844:8: runtime error: signed integer overflow: 4 + 2147483647 cannot be represented in type 'int'

array-size.patch fixes the ctypes array size overflow (including a test).
msg271084 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-23 15:06
Change tuple_and_list.patch with empty curly braces. I don't add the test for __length_hint__. According to the comment, when overflow happens, it is either ignored or a MemoryError will finally be raised. I am not willing to test a MemoryError in this case.

BTW, how do you get the error?
msg271118 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-24 00:10
The error message comes from Undefined Behaviour Sanitizer, which was added to newer versions of GCC and Clang. Currently I am compiling with

./configure --with-pydebug CC="gcc -fsanitize=undefined -fno-sanitize=alignment -fno-sanitize=shift"

https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fsanitize_003dundefined-962

I thought it is worth adding a test for the impossible __length_hint__() value. Since the test iterator returns no elements, there will not be a MemoryError, but if overflow detection is enabled (such as UB Sanitizer or -ftrapv), it is guaranteed to exercise the overflow path and would be detected.
msg271136 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-24 07:45
It's cool, but I get one problem when writing tests. support.captured_stderr cannot capture the runtime error message. So we have nothing to do with the assertion and the test will always succeed even when overflow does happen (the message will be output). To solve this problem, we have to do io redirect in file descriptor level, I wonder does this deserve that?
msg271143 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-24 12:08
Xiang: I don’t think we need to make the tests do anything special. Just make sure they exercise the code that handles overflows. I have been running the test suite without any -j0 option, and I can look over the output and see the error messages. Or if we get to a stage where all the errors are eliminated, you could run with UBSAN_OPTIONS=halt_on_error=1. E.g. in this patch, I add two simple tests to cover parts of the code that weren’t covered before (and if I hadn’t fixed the overflows, the tests would trigger extra UBSAN errors).

ctypes_v2.patch is an update of array-size.patch. I improved the test case, and added a new fix for overflows like the following:

>>> class S(ctypes.Structure):
...     _fields_ = (("field", ctypes.c_longlong, 64),)
... 
>>> s = S()
>>> s.field = 3
Modules/_ctypes/cfield.c:900:9: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long long int'
msg271144 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-24 12:12
unicode.patch avoids an overflow in PyUnicode_Join():

>>> size = int(sys.maxsize**0.5) + 1
>>> "".join(("A" * size,) * size)
Objects/unicodeobject.c:9927:12: runtime error: signed integer overflow: 46341 + 2147441940 cannot be represented in type 'int'
OverflowError: join() result is too long for a Python string
msg271150 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-24 13:33
It turns out you just want to see the output. That is easy. Patch v3 adding the test.
msg271223 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-07-25 03:44
New changeset db93af6080e7 by Martin Panter in branch 'default':
Issue #1621: Avoid signed overflow in list and tuple operations
https://hg.python.org/cpython/rev/db93af6080e7
msg271735 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-07-31 12:19
Martin, I upload a patch to fix another possible overflow in listextend.
msg271759 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-01 01:26
overflow_fix_in_listextend.patch: I doubt Python supports the kinds of platform where this overflow would be possible. It may require pointers smaller than 32 bits, or char objects larger than 8 bits. Perhaps we could just add a comment explaining we assume the overflow cannot happen.

It seems list objects will hold one pointer for each element, but the overflow involves the number of elements. Python defines PY_SSIZE_T_MAX as PY_SIZE_MAX // 2. For the overflow to happen we would need

m + n > PY_SSIZE_T_MAX

Assuming a “flat” address space that can allocate up to PY_SIZE_MAX bytes _in total_, the total number of elements cannot exceed

m + n == PY_SIZE_MAX // sizeof (PyObject *)

So in this scenario, the overflow cannot happen unless sizeof (PyObject *) == 1.

Considering things like the 16-bit segmented Intel “large” memory model (which I doubt Python is compatible with), each list could _independently_ be up to PY_SIZE_MAX bytes. Therefore the total number of elements may reach

m + n == PY_SIZE_MAX // sizeof (PyObject *) * 2

So even in this case, sizeof (PyObject *) == 4 (large model) is fine, but anything less (e.g. 16-bit char, or 1-byte segment + 2-byte offset) might overflow.
msg271763 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-01 02:51
Hmm, I don't tend to infer platform characteristics. IMHO, it's a simple problem: sum up two lists' length which may overflow in logic.

With your argument, does it means it seems somewhat meaningless to have a List a Py_ssize_t length since it can never reach it? Checks against PY_SSIZE_T_MAX have already existed (for example, in ins1).
msg271766 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-01 04:25
The check in ins1() was originally added in revision b9002da46f69. I presume it references the Python-dev thread “can this overflow (list insertion)?” <20000812145155.A7629@ActiveState.com>, <https://marc.info/?l=python-dev&m=107666472818169>. At that time, ob_size was an int, so overflow checking was definitely needed. Later, revision 7fdc639bc5b4 changed ob_size to Py_ssize_t, and then revision 606818c33e50 updated the overflow check from INT_MAX to PY_SSIZE_T_MAX.

BTW I made a small mistake in my previous message. The worst case would be extending a list with itself. But I think the conclusion is still the same.
msg271768 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-01 04:47
So these checks are superfluous? Do we need to remove them?

Hmm, I still doubt such checks should consider platform characteristics first. In theory List can be PY_SSIZE_T_MAX length. Do we have to put the PY_SIZE_MAX // sizeof(PyObject *) limit on it?
msg271769 - (view) Author: Antti Haapala (ztane) * Date: 2016-08-01 06:18
I don't believe Python would really ever work on a platform with non-8-bit-bytes, I believe there are way too much assumptions *everywhere*. You can program in C on such a platform, yes, but not that sure about Python.

And on 8-bit-byte platfomrs, there is no large model with 16-bit pointers anywhere. There just are not enough bits that you could have multiple 64k byte-addressable segments that are addressed with 16-bit pointers. 

It might be that some obscure platform in the past would have had 128k memory, with large pointers, 2 allocatable 64k segments, >16 bit char pointer and 16-bit object pointers pointing to even bytes, but I doubt you'd be really porting *Python 3* to such a platform, basically we're talking about something like Commodore 128 here.
msg271800 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-02 03:20
Looking over r60793, the overflow check at Modules/cjkcodecs/multibytecodec.c:836 looks vulnerable to being optimized away, because it can only detect the overflow if the line above has already overflowed. Perhaps change PY_SSIZE_T_MAX to MAXDECPENDING. I wonder if any of the GCC optimization and warning modes can detect this case?

Also, Python/ast.c:3988 checks using PY_SIZE_MAX, but then passes the value to PyBytes_FromStringAndSize(), which expects ssize_t and in the best case would raise SystemError.
msg271803 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-02 05:10
I agree. For multibytecode, how about switching the positions of the two checks? If npendings + ctx->pendingsize overflows, the result can be anything, larger, smaller than or equal to MAXDECPENDING. For ast, although a SystemError may be raised but the message seems not obvious to the reason.
msg272077 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-06 00:59
Xiang: regarding your overflow_fix_in_listextend.patch, what do you think about adding a comment or debugging assertion instead, something like:

/* It should not be possible to allocate a list large enough to cause an overflow on any relevant platform */
assert(m < PY_SSIZE_T_MAX - n);
msg272078 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-06 03:18
It's good Martin. Just commit it.
msg285466 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-14 07:25
New changeset dd2c7d497878 by Martin Panter in branch '3.5':
Issues #1621, #29145: Test for str.join() overflow
https://hg.python.org/cpython/rev/dd2c7d497878

New changeset eb6eafafdb44 by Martin Panter in branch 'default':
Issue #1621: Overflow should not be possible in listextend()
https://hg.python.org/cpython/rev/eb6eafafdb44
msg303330 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 13:32
Martin, do you mind to create a PR for your ctypes_v2.patch?
msg317087 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-05-19 01:33
Sorry I haven’t made a PR for ctypes_v2.patch, but I don’t mind if someone else takes over. I understand the HAVE_LONG_LONG check may no longer necessary for newer Python versions.
msg325091 - (view) Author: miss-islington (miss-islington) Date: 2018-09-11 23:18
New changeset 6c7d67ce83a62b5f0fe5c53a6df602827451bf7f by Miss Islington (bot) (Sergey Fedoseev) in branch 'master':
bpo-1621: Avoid signed integer overflow in set_table_resize(). (GH-9059)
https://github.com/python/cpython/commit/6c7d67ce83a62b5f0fe5c53a6df602827451bf7f
msg325105 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-12 00:26
> newsize <<= 1; // The largest possible value is PY_SSIZE_T_MAX + 1. 

Previously, there was a explicitly check for error raising  PyErr_NoMemory() on overflow. Now you rely on PyMem_Malloc() to detect the overflow. I'm not sure that it's a good idea.
msg325109 - (view) Author: Jeffrey Walton (Jeffrey.Walton) * Date: 2018-09-12 00:53
On Tue, Sep 11, 2018 at 8:26 PM, STINNER Victor <report@bugs.python.org> wrote:
>
> STINNER Victor <vstinner@redhat.com> added the comment:
>
>> newsize <<= 1; // The largest possible value is PY_SSIZE_T_MAX + 1.
>
> Previously, there was a explicitly check for error raising  PyErr_NoMemory() on overflow. Now you rely on PyMem_Malloc() to detect the overflow. I'm not sure that it's a good idea.

+1. It will probably work as expected on Solaris and other OSes that
don't oversubscribe memory. It will probably fail in unexpected ways
on Linux when the allocation succeeds but then the OOM killer hits a
random process.

Jeff
msg325113 - (view) Author: Sergey Fedoseev (sir-sigurd) * Date: 2018-09-12 02:18
> Now you rely on PyMem_Malloc() to detect the overflow.

Now overflow is not possible or am I missing something?
msg325128 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-12 07:52
I asked if there is an issue. In fact, all Python memory allocators start by checking if the size is larger than PY_SSIZE_T_MAX. Example:

void *
PyMem_RawMalloc(size_t size)
{
    /*
     * Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes.
     * Most python internals blindly use a signed Py_ssize_t to track
     * things without checking for overflows or negatives.
     * As size_t is unsigned, checking for size < 0 is not required.
     */
    if (size > (size_t)PY_SSIZE_T_MAX)
        return NULL;
    return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size);
}
msg325284 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-13 18:54
Benjamin: what do you think of adding an explicit check after the "new_size <<= 1;" loop?

    if (new_size > (size_t)PY_SSIZE_T_MAX) {
        PyErr_NoMemory();
        return -1;
    }

Technically, PyMem_Malloc() already implements the check, so it's not really needed. So I'm not sure if it's needed :-)
History
Date User Action Args
2018-09-13 18:54:21vstinnersetmessages: + msg325284
2018-09-13 16:48:01benjamin.petersonsetpull_requests: + pull_request8693
2018-09-13 16:48:00benjamin.petersonsetpull_requests: + pull_request8692
2018-09-12 07:52:57vstinnersetmessages: + msg325128
2018-09-12 02:18:31sir-sigurdsetnosy: + sir-sigurd
messages: + msg325113
2018-09-12 00:53:34Jeffrey.Waltonsetmessages: + msg325109
2018-09-12 00:26:22vstinnersetmessages: + msg325105
2018-09-11 23:18:25miss-islingtonsetpull_requests: + pull_request8634
2018-09-11 23:18:13miss-islingtonsetpull_requests: + pull_request8633
2018-09-11 23:18:04miss-islingtonsetnosy: + miss-islington
messages: + msg325091
2018-09-04 10:46:26sir-sigurdsetpull_requests: + pull_request8519
2018-08-25 00:15:33gregory.p.smithsetversions: + Python 3.7, Python 3.8, - Python 3.3
2018-05-19 01:33:22martin.pantersetmessages: + msg317087
2017-09-29 13:33:05serhiy.storchakalinkissue31637 superseder
2017-09-29 13:32:44serhiy.storchakasetmessages: + msg303330
2017-01-14 07:25:17python-devsetmessages: + msg285466
2017-01-09 04:24:18martin.pantersetdependencies: + failing overflow checks in replace_*
2016-08-06 03:18:08xiang.zhangsetmessages: + msg272078
2016-08-06 00:59:03martin.pantersetmessages: + msg272077
2016-08-02 10:35:06christian.heimessetnosy: - christian.heimes
2016-08-02 05:10:57xiang.zhangsetmessages: + msg271803
2016-08-02 03:20:51martin.pantersetmessages: + msg271800
2016-08-01 06:18:05ztanesetmessages: + msg271769
2016-08-01 04:47:54xiang.zhangsetmessages: + msg271768
2016-08-01 04:25:54martin.pantersetmessages: + msg271766
2016-08-01 03:22:57gregory.p.smithsetnosy: - gregory.p.smith
2016-08-01 02:51:50xiang.zhangsetmessages: + msg271763
2016-08-01 01:26:13martin.pantersetmessages: + msg271759
2016-07-31 12:19:43xiang.zhangsetfiles: + overflow_fix_in_listextend.patch

messages: + msg271735
2016-07-25 03:44:28python-devsetmessages: + msg271223
2016-07-24 13:33:30xiang.zhangsetfiles: + tuple_and_list_v3.patch

messages: + msg271150
2016-07-24 12:12:53martin.pantersetfiles: + unicode.patch

messages: + msg271144
2016-07-24 12:08:27martin.pantersetfiles: + ctypes_v2.patch

messages: + msg271143
2016-07-24 07:45:44xiang.zhangsetmessages: + msg271136
2016-07-24 00:10:29martin.pantersetmessages: + msg271118
2016-07-23 15:06:50xiang.zhangsetfiles: + tuple_and_list_v2.patch

messages: + msg271084
2016-07-23 05:17:08martin.pantersetfiles: + array-size.patch
2016-07-23 05:09:51martin.pantersetmessages: + msg271058
2016-07-23 04:35:47martin.pantersetfiles: + thread.patch

messages: + msg271057
2016-07-22 08:12:10xiang.zhangsetfiles: + tuple_and_list.patch
nosy: + xiang.zhang
messages: + msg270977

2016-07-20 15:34:09gvanrossumsetnosy: - gvanrossum
2016-07-20 12:48:43martin.pantersetmessages: + msg270869
2016-07-19 14:06:40ztanesetmessages: + msg270827
2016-07-19 12:57:32serhiy.storchakasetmessages: + msg270823
2016-07-19 03:44:38martin.pantersetfiles: + slice-step.patch

messages: + msg270809
versions: + Python 3.6
2016-07-19 03:09:15python-devsetmessages: + msg270807
2016-07-16 19:14:50ztanesetmessages: + msg270582
2016-07-16 19:03:35Jeffrey.Waltonsetmessages: + msg270580
2016-07-16 15:16:00martin.pantersetmessages: + msg270562
2016-07-15 03:48:13serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg270463
2016-07-15 02:57:54martin.pantersetfiles: + set-overflow.patch

messages: + msg270461
2016-07-15 02:53:36martin.pantersetfiles: + trapv.patch

nosy: + martin.panter
messages: + msg270460

dependencies: + test_time fails: strftime('%Y', y) for negative year
2016-07-10 13:05:56ztanesetnosy: + ztane
messages: + msg270084
2016-07-10 07:43:23martin.pantersetdependencies: + bytes_concat seems to check overflow using undefined behaviour
2014-10-14 17:16:16skrahsetnosy: - skrah
2014-10-10 13:10:52jwilksetnosy: + jwilk
2014-03-16 15:07:08Jeffrey.Waltonsetnosy: + Jeffrey.Walton
messages: + msg213729
2013-03-08 08:47:47fweimersetnosy: + fweimer
2011-11-19 17:10:39mark.dickinsonsetmessages: + msg147958
2011-09-25 14:35:02python-devsetmessages: + msg144524
2011-09-24 18:12:00python-devsetmessages: + msg144505
2011-09-24 17:19:43python-devsetnosy: + python-dev
messages: + msg144503
2011-09-24 15:57:05mark.dickinsonsetfiles: + issue1621_hashes_and_sets.patch

messages: + msg144501
versions: - Python 2.7, Python 3.2
2011-09-24 14:48:03loewissetpriority: high -> normal

messages: + msg144499
2011-09-24 08:27:18mark.dickinsonsetmessages: + msg144492
2011-09-24 08:24:54mark.dickinsonsetmessages: + msg144490
versions: + Python 3.3, - Python 2.6, Python 3.1
2011-09-14 04:17:55alexsetnosy: + alex
2011-09-14 01:29:07jceasetnosy: + jcea
2011-08-10 15:47:11deadshortsetnosy: + deadshort
messages: + msg141871
2011-06-01 06:15:27terry.reedysetversions: - Python 2.5
2011-02-10 09:29:33skrahsetnosy: + skrah
2011-02-10 09:28:35skrahlinkissue11167 superseder
2010-05-21 16:51:49dmalcolmsetnosy: + dmalcolm
2010-05-11 20:55:56terry.reedysetversions: + Python 2.5
2010-05-11 20:55:31terry.reedysetversions: + Python 2.7, Python 3.2, - Python 2.5, Python 3.0
2009-05-16 18:40:23loewissetmessages: + msg87908
2009-05-14 09:50:36mark.dickinsonsetmessages: + msg87731
2009-05-14 09:00:18mark.dickinsonsetmessages: + msg87730
2009-05-13 21:32:41loewissetmessages: + msg87712
2009-05-13 20:58:27mark.dickinsonsetmessages: + msg87708
2009-05-13 20:38:52loewissetmessages: + msg87707
2009-05-13 20:05:40mark.dickinsonsetmessages: + msg87704
2009-05-13 16:48:17donmezsetmessages: + msg87694
2009-05-13 16:45:41mark.dickinsonsetmessages: + msg87693
2009-05-13 16:01:08gregory.p.smithsetmessages: + msg87690
2009-05-13 15:53:06mark.dickinsonsetnosy: + mark.dickinson
messages: + msg87689
2009-05-12 13:29:05pitrousetnosy: + pitrou
2009-05-12 13:28:06ajaksu2setnosy: + vstinner
versions: + Python 3.1

stage: patch review
2008-03-10 17:25:27matejciksetnosy: + matejcik
2008-02-21 11:03:16donmezsetmessages: + msg62616
2008-01-28 18:09:31donmezsetmessages: + msg61792
2008-01-28 18:05:19donmezsetmessages: + msg61791
2008-01-28 17:57:10donmezsetmessages: + msg61788
2008-01-28 17:54:39donmezsetmessages: + msg61785
2008-01-28 17:54:09gvanrossumsetmessages: + msg61784
2008-01-28 08:14:20loewissetmessages: + msg61774
2008-01-28 03:32:53donmezsetmessages: + msg61770
2008-01-28 03:20:28gvanrossumsetmessages: + msg61768
2008-01-28 03:16:57donmezsetmessages: + msg61767
2008-01-28 03:10:02donmezsetmessages: + msg61766
2008-01-28 03:02:23donmezsetfiles: + csv.patch
messages: + msg61765
2008-01-28 02:54:17gvanrossumsetmessages: + msg61763
2008-01-28 02:51:06nnorwitzsetmessages: + msg61762
2008-01-28 02:45:11donmezsetmessages: + msg61761
2008-01-28 02:41:25gvanrossumsetmessages: + msg61760
2008-01-28 02:41:05nnorwitzsetmessages: + msg61759
2008-01-28 02:28:49donmezsetmessages: + msg61758
2008-01-28 02:17:36nnorwitzsetmessages: + msg61757
2008-01-25 22:08:46nnorwitzsetnosy: + nnorwitz
2008-01-20 17:57:55donmezsetmessages: + msg61320
2008-01-20 17:47:01loewissetmessages: + msg61319
2008-01-20 13:56:09christian.heimessetmessages: + msg61291
2008-01-20 13:01:06christian.heimessetmessages: + msg61286
2008-01-20 11:36:59donmezsetfiles: + fix-overflows-final.patch
messages: + msg61272
2008-01-20 03:29:12donmezsetfiles: + fix-overflows-try3.patch
messages: + msg60260
2008-01-20 01:48:15donmezsetfiles: + fix-overflows-try2.patch
messages: + msg60254
2008-01-19 23:15:57donmezsetmessages: + msg60246
2008-01-19 12:58:46christian.heimessetpriority: high
2008-01-19 11:01:47christian.heimessetmessages: + msg60146
2008-01-18 23:40:40loewissetmessages: + msg60126
2008-01-18 23:35:59donmezsetfiles: + fix-overflows-try1.patch
messages: + msg60125
2008-01-18 23:15:06donmezsetfiles: + overflow-error4.patch
2008-01-18 22:24:17christian.heimessetmessages: + msg60124
2008-01-18 21:16:53gvanrossumsetkeywords: + patch
messages: + msg60121
2008-01-18 21:13:36donmezsetfiles: + overflow-error3.patch
2008-01-18 21:11:25donmezsetfiles: + overflow-error2.patch
messages: + msg60120
2008-01-18 21:05:32gvanrossumsetmessages: + msg60118
2008-01-18 20:58:31donmezsetfiles: + overflow-error.patch
messages: + msg60116
2008-01-18 20:49:07gvanrossumsetmessages: + msg60115
2008-01-18 20:47:53gvanrossumsetmessages: + msg60114
2008-01-18 18:59:33donmezsetmessages: + msg60111
2008-01-18 18:56:44christian.heimessetmessages: + msg60110
2008-01-18 18:50:22donmezsetmessages: + msg60109
2008-01-18 18:27:52donmezsetmessages: + msg60107
2008-01-18 18:16:26christian.heimessetmessages: + msg60105
2008-01-18 17:53:41gvanrossumsetmessages: + msg60103
2008-01-18 16:48:33donmezsetmessages: + msg60102
2008-01-18 01:50:58donmezsetmessages: + msg60079
2008-01-18 01:22:43gvanrossumsetmessages: + msg60078
2008-01-11 09:47:51donmezsetmessages: + msg59699
2008-01-11 08:43:37loewissetmessages: + msg59696
2008-01-11 03:26:28donmezsetmessages: + msg59694
2008-01-11 03:24:37donmezsetmessages: + msg59693
2008-01-11 03:06:25alexandre.vassalottisetmessages: + msg59692
2008-01-09 18:59:17loewissetmessages: + msg59619
2008-01-09 18:12:52donmezsetnosy: + donmez
messages: + msg59616
2008-01-09 17:30:53gvanrossumsetmessages: + msg59612
2008-01-09 17:29:05gvanrossumsetnosy: + gvanrossum
messages: + msg59611
2007-12-17 23:49:28gregory.p.smithsetmessages: + msg58711
title: Python should compile with -Wstrict-overflow when using gcc -> Do not assume signed integer overflow behavior
2007-12-17 06:42:02alexandre.vassalottisetnosy: + alexandre.vassalotti
messages: + msg58684
2007-12-14 09:24:33lemburgsetnosy: - lemburg
2007-12-14 09:24:20lemburgsetnosy: + lemburg
messages: + msg58620
2007-12-14 07:08:23loewissetnosy: + loewis
messages: + msg58617
2007-12-14 03:23:51christian.heimessetfiles: + config.patch
messages: + msg58611
2007-12-14 02:45:55christian.heimessetmessages: + msg58610
2007-12-14 02:19:07christian.heimessetnosy: + christian.heimes
messages: + msg58609
2007-12-14 00:43:47gregory.p.smithcreate