Issue 1621: Do not assume signed integer overflow behavior


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/45962

classification

Title:	Do not assume signed integer overflow behavior
Type:	security	Stage:	resolved
Components:		Versions:	Python 3.8, Python 3.7, Python 3.6

process

Status:	closed	Resolution:	fixed
Dependencies:	13312 27473 29145	Superseder:
Assigned To:		Nosy List:	Jeffrey.Walton, alex, alexandre.vassalotti, deadshort, dmalcolm, donmez, fweimer, jcea, jwilk, loewis, mark.dickinson, martin.panter, matejcik, miss-islington, nnorwitz, pitrou, python-dev, serhiy.storchaka, sir-sigurd, vstinner, xiang.zhang, ztane
Priority:	normal	Keywords:	patch

Created on 2007-12-14 00:43 by gregory.p.smith, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
config.patch	christian.heimes, 2007-12-14 03:23
overflow-error.patch	donmez, 2008-01-18 20:58
overflow-error2.patch	donmez, 2008-01-18 21:11
overflow-error3.patch	donmez, 2008-01-18 21:13	Fix whitespace change and comment.
overflow-error4.patch	donmez, 2008-01-18 23:15	Fix -fwrapv check, thanks tiran
fix-overflows-try1.patch	donmez, 2008-01-18 23:35
fix-overflows-try2.patch	donmez, 2008-01-20 01:48	Better patch
fix-overflows-try3.patch	donmez, 2008-01-20 03:29
fix-overflows-final.patch	donmez, 2008-01-20 11:36
csv.patch	donmez, 2008-01-28 03:02
issue1621_hashes_and_sets.patch	mark.dickinson, 2011-09-24 15:57		review
trapv.patch	martin.panter, 2016-07-15 02:53	Committed & superseded	review
set-overflow.patch	martin.panter, 2016-07-15 02:57	Superseded
slice-step.patch	martin.panter, 2016-07-19 03:44	=> #36946; supersedes trapv.patch	review
tuple_and_list.patch	xiang.zhang, 2016-07-22 08:12		review
thread.patch	martin.panter, 2016-07-23 04:35	=> Issue 33632	review
array-size.patch	martin.panter, 2016-07-23 05:17	Superseded	review
tuple_and_list_v2.patch	xiang.zhang, 2016-07-23 15:06		review
ctypes_v2.patch	martin.panter, 2016-07-24 12:08	Supersedes array-size.patch	review
unicode.patch	martin.panter, 2016-07-24 12:12	Committed	review
tuple_and_list_v3.patch	xiang.zhang, 2016-07-24 13:33	Committed	review
overflow_fix_in_listextend.patch	xiang.zhang, 2016-07-31 12:19	Superseded	review

Pull Requests
URL	Status	Linked
PR 9059	merged	sir-sigurd, 2018-09-04 10:46
PR 9198	merged	miss-islington, 2018-09-11 23:18
PR 9199	merged	miss-islington, 2018-09-11 23:18
PR 9261	merged	benjamin.peterson, 2018-09-13 16:48

Messages (128)
msg58602 - (view)	Author: Gregory P. Smith (gregory.p.smith) *	Date: 2007-12-14 00:43
The resolution to http://bugs.python.org/issue1608 looks like it'll add a -fwrapv gcc flag when building python. This works around the issue nicely on one compiler (gcc) but doesn't fix our fundamentally broken code. We should fix all dependencies on integer overflow behavior, starting by making everything compile properly with gcc's -Wstrict-overflow and -Werror flags.
msg58609 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2007-12-14 02:19
My gcc 4.1 doesn't have the -Wstrict-overflow option. gcc-Version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
msg58610 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2007-12-14 02:45
Should we use -ansi (C90 aka C89 standard) option, too? Python core compiles fine with -ansi but together with -Werror it breaks several extensions: _bsddb _codecs_iso2022 _ctypes _socket _ssl linuxaudiodev
msg58611 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2007-12-14 03:23
Socket and SSL are using bluetooth.h which defines some functions as inline. Inline isn't part of C89. Linuxaudiodev depends on the 'linux' macro which is not defined in C89. The Python core can be compiled with -ansi but the extension modules require -std=gnu89.
msg58617 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2007-12-14 07:08
Using ansi is out of scope of this issue, and should not be mixed with it. -ansi is about disabling certain GCC extensions. This report is about C code in Python which has undefined behavior. I think there is disagreement on whether Python should stop relying on this particular undefined behavior (namely, whether the sum of two large positive numbers is negative). GvR (apparently) believes that the compiler should guarantee that the twos-complement semantic is available throughout the C language.
msg58620 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2007-12-14 09:24
Whatever you change regarding the compiler options for Python, please make sure that this doesn't affect the default settings used by distutils to compile external modules (it normally takes the options straight from the Makefile used for compiling Python). Otherwise, you're likely going to break lots and lots of extensions. Thanks.
msg58684 - (view)	Author: Alexandre Vassalotti (alexandre.vassalotti) *	Date: 2007-12-17 06:42
I compiled Python using gcc 4.3.0 with -Wstrict-overflow, and this is the only warning I got:
Objects/doubledigits.c: In function ‘_PyFloat_Digits’:
Objects/doubledigits.c:313: error: assuming signed overflow does not occur when assuming that (X + c) < X is always false
I am not sure yet how to interpret it, though. It says that the overflow check is in _PyFloat_Digits(), but line 313 is in the function add_big(). It probably means that add_big() gets inlined. I tried to set -finline-limit=0, but strangely the overflow warning disappears... I will try to investigate this further when I have a bit more time on my hands.
msg58711 - (view)	Author: Gregory P. Smith (gregory.p.smith) *	Date: 2007-12-17 23:49
Heh, if that's the only warning gcc -Wstrict-overflow gives then I've mistitled the bug. Fixed. It'll take some manual code review. Anyone know if the commercial analysis tools we've run the code base through in the past can find these for us?
msg59611 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-09 17:29
Alexandre, which Python version did you compile with -Wstrict-overflow? It would behoove us to check 2.5.2 thoroughly before it goes out the door. I will contact Coverity to ask if they check for this kind of thing. (They just upgraded us to "Rung 2", whatever that may mean. :-) MvL: I don't want 2s complement throughout the language, I just want the overflow checks to be reliable. Since I'd forgotten about the difference between unsigned and signed overflow, I have no idea how many overflow checks have been submitted that are relying on signed overflow; though apparently (if the -Wstrict-overflow results can be trusted) we're okay. FWIW, I've heard that some commercial compilers (e.g. XLC) assume that even unsigned overflow is undefined, violating the C standard. This would suggest that buffer overflow checks should be coded without relying on arithmetic overflow at all. This is possible, just a bit hairy.
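A check that does not rely on arithmetic overflow at all, as suggested above, can be sketched as follows. This is only an illustration under stated assumptions: `ptrdiff_t` and `PTRDIFF_MAX` stand in for `Py_ssize_t` and `PY_SSIZE_T_MAX`, and the helper names `checked_add_size` and `checked_mul_size` are hypothetical, not CPython functions. The idea is to compare against the limit before doing the arithmetic, so the overflowing operation is never executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add two non-negative sizes.
   Returns 0 and stores the sum in *out, or -1 if the sum would not fit. */
static int checked_add_size(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out)
{
    assert(a >= 0 && b >= 0);
    /* PTRDIFF_MAX - b cannot overflow because b >= 0. */
    if (a > PTRDIFF_MAX - b)
        return -1;
    *out = a + b;
    return 0;
}

/* Hypothetical helper: multiply two non-negative sizes.
   Returns 0 and stores the product in *out, or -1 if it would not fit. */
static int checked_mul_size(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out)
{
    assert(a >= 0 && b >= 0);
    /* Division is always defined here because b != 0 is tested first. */
    if (b != 0 && a > PTRDIFF_MAX / b)
        return -1;
    *out = a * b;
    return 0;
}
```

Because the comparison happens before the addition or multiplication, the result does not depend on -fwrapv or on how a given compiler treats overflow.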
msg59612 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-09 17:30
Marc-Andre: what do you mean by breaking lots and lots of extensions? Extensions also contain buffer overflow checks (at least I hope they do :-) and those should also be guaranteed safe by using -fwrapv or -fno-strict-overflow (GCC 4.2 and higher) until we're sure there aren't any.
msg59616 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-09 18:12
-Wstrict-overflow=3 with gcc 4.3 trunk here shows:
Modules/cPickle.c: In function 'Unpickler_noload':
Modules/cPickle.c:4232: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false
Modules/cPickle.c:194: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false
Modules/cPickle.c: In function 'load':
Modules/cPickle.c:4232: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false
But also note that -fno-strict-aliasing is also just another workaround and it's more serious than -fwrapv.
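The pattern behind the "(X - c) > X is always false" warning can be illustrated with a small sketch. The function names are hypothetical and the code is not taken from cPickle.c. The first function only gives a meaningful answer if subtraction wraps around, so a compiler that assumes signed overflow never happens may fold it to a constant 0. The second expresses the same question without ever computing an out-of-range value.

```c
#include <assert.h>
#include <limits.h>

/* Relies on wraparound: for c > 0 this is true only if x - c overflowed,
   which is undefined behaviour, so an optimizer may treat it as always 0. */
static int underflows_by_wrap(int x, int c)
{
    return x - c > x;
}

/* Well-defined rewrite: for c > 0, INT_MIN + c cannot overflow, and
   x - c would drop below INT_MIN exactly when x < INT_MIN + c. */
static int underflows_checked(int x, int c)
{
    assert(c > 0);
    return x < INT_MIN + c;
}
```

For inputs where no overflow occurs both functions return 0; only the second is reliable at the boundary.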
msg59619 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-01-09 18:59
> But also note that -fno-strict-aliasing is also just another workaround
> and it's more serious than -fwrapv.
Sure - however, that is fixed in Python 3 (and unrelated to this issue)
msg59692 - (view)	Author: Alexandre Vassalotti (alexandre.vassalotti) *	Date: 2008-01-11 03:06
Hm. I don't get any warning, related to the overflow issue, neither with -Wstrict-overflow=3, nor -Wstrict-overflow=5. Are the cPickle warnings already fixed?
msg59693 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-11 03:24
Make sure you use gcc 4.3 trunk and at least -O2 is enabled. I tested revision 59895 from release25-maint branch.
msg59694 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-11 03:26
FWIW gcc hacker Ian Lance Taylor has a nice article about signed overflow optimizations in gcc, see http://www.airs.com/blog/archives/120 . Reading that it might be better to use -fno-strict-overflow instead of -fwrapv. Regards, ismail
msg59696 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-01-11 08:43
> FWIW gcc hacker Ian Lance Taylor has a nice article about signed
> overflow optimizations in gcc, see http://www.airs.com/blog/archives/120 .
> Reading that it might be better to use -fno-strict-overflow instead of
> -fwrapv.
Please be specific. I read it, and I don't think it's better to use -fno-strict-overflow.
msg59699 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-11 09:47
Ian says -fno-strict-overflow still allows some optimizations, and his example code shows less assembly is produced with -fno-strict-overflow. But of course your opinion matters on this one, not mine. Regards, ismail
msg60078 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 01:22
I think the -Wstrict-overflow option may not be enough for the audit we need. The overflow issue in expandtabs() still exists (in 2.5 as well as in the trunk):
    if (p == '\n' || p == '\r') {
        i += j;
        old_j = j = 0;
        if (i < 0) {
            PyErr_SetString(PyExc_OverflowError,
                            "new string is too long");
            return NULL;
        }
    }
Here i and j are signed ints (Py_ssize_t) initially known to be >= 0; i can only become < 0 through overflow. This is the place where Ismail (cartman) found a crash because the test was optimized away by GCC 4.3 before we added -fwrapv. If we ever hope to clean up the code to the point where -fwrapv is no longer needed, the audit should find this spot! (Good thing we at least had a unittest for the overflow check. This should be standard practice for all overflow checks, as long as it doesn't require allocating a GB or more of memory.)
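One way the `i += j` step could be guarded without depending on the sign of an overflowed result is sketched below. This is an illustration, not the fix that was committed: `ptrdiff_t` and `PTRDIFF_MAX` stand in for `Py_ssize_t` and `PY_SSIZE_T_MAX`, and `accumulate_length` is a hypothetical helper. The test moves in front of the addition, so there is nothing for the optimizer to remove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "i += j; j = 0;" step in expandtabs().
   Returns 0 on success, or -1 if the total would exceed PTRDIFF_MAX
   (the caller would raise OverflowError at that point). */
static int accumulate_length(ptrdiff_t *i, ptrdiff_t *j)
{
    assert(*i >= 0 && *j >= 0);
    /* PTRDIFF_MAX - *i cannot overflow because *i >= 0. */
    if (*j > PTRDIFF_MAX - *i)
        return -1;
    *i += *j;
    *j = 0;
    return 0;
}
```

On failure the totals are left unchanged, so the caller can report the error without cleaning up a negative length.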
msg60079 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 01:50
FWIW I reported this to GCC bugzilla as a missing diagnostic @ http://gcc.gnu.org/PR34843
msg60102 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 16:48
Problem was that -Wall at the end was resetting -Wstrict-overflow, so here is the current results for signed overflow warnings (python 2.5 branch SVN), a lot of them : Parser/acceler.c: In function 'fixstate': Parser/acceler.c:90: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/node.c: In function 'PyNode_AddChild': Parser/node.c:90: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/node.c:90: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/firstsets.c: In function 'calcfirstset': Parser/firstsets.c:71: warning: assuming signed overflow does not occur when simplifying conditional to constant Parser/pgen.c: In function 'compile_item': Parser/pgen.c:268: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/pgen.c: In function '_Py_pgen': Parser/pgen.c:454: warning: assuming signed overflow does not occur when simplifying conditional to constant Parser/pgen.c:556: warning: assuming signed overflow does not occur when simplifying conditional to constant Parser/pgen.c:604: warning: assuming signed overflow does not occur when simplifying conditional to constant Parser/pgen.c:611: warning: assuming signed overflow does not occur when simplifying conditional to constant Parser/tokenizer.c: In function 'new_string': Parser/tokenizer.c:175: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c: In function 'tok_get': Parser/tokenizer.c:1163: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c: In function 'PyTokenizer_Get': Parser/tokenizer.c:1443: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c:1443: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 
to X cmp C1 +- C2 Parser/tokenizer.c: In function 'PyTokenizer_FromString': Parser/tokenizer.c:607: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/abstract.c: In function 'PyObject_CallMethodObjArgs': Objects/abstract.c:2038: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/abstract.c:2038: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/abstract.c: In function 'PyObject_CallFunctionObjArgs': Objects/abstract.c:2038: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/abstract.c:2038: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/intobject.c: In function 'PyInt_FromUnicode': Objects/intobject.c:394: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/listobject.c: In function 'merge_at': Objects/listobject.c:1595: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/listobject.c:1459: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/listobject.c:1459: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/listobject.c:1595: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/longobject.c: In function 'PyLong_FromUnicode': Objects/longobject.c:1701: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/longobject.c: In function '_PyLong_AsScaledDouble': Objects/longobject.c:703: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/longobject.c:703: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 
Objects/longobject.c: In function 'long_sub': Objects/longobject.c:1978: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/longobject.c: In function 'l_divmod': Objects/longobject.c:1802: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/longobject.c:1802: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/stringobject.c: In function 'string_expandtabs': Objects/stringobject.c:3331: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c:3339: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c: In function 'string_replace': Objects/stringobject.c:2509: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c:2509: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c:2509: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c:2509: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c:2672: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringobject.c: In function 'string_count': Objects/stringlib/count.h:24: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/stringlib/count.h:24: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/unicodeobject.c: In function 'unicode_expandtabs': Objects/unicodeobject.c:5719: warning: assuming signed overflow does not occur when simplifying conditional to constant Objects/unicodeobject.c:5727: warning: assuming signed overflow does not occur when simplifying conditional to constant 
Objects/unicodeobject.c: In function 'PyUnicodeUCS4_Compare': Objects/unicodeobject.c:5376: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/unicodeobject.c:5376: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Objects/unicodeobject.c: In function 'PyUnicodeUCS4_Join': Objects/unicodeobject.c:4659: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/ast.c: In function 'ast_for_genexp': Python/ast.c:1195: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/ast.c:1160: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/ast.c: In function 'ast_for_atom': Python/ast.c:1040: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/ast.c:1005: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/bltinmodule.c: In function 'builtin_map': Python/bltinmodule.c:907: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/bltinmodule.c:847: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/bltinmodule.c:847: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c:1163: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c: In function 'PyTokenizer_Get': Parser/tokenizer.c:1443: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Parser/tokenizer.c:1443: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/getargs.c:994: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/getargs.c:1040: warning: 
assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/getargs.c: In function 'seterror': Python/getargs.c:357: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Python/import.c: In function 'PyImport_ExtendInittab': Python/import.c:3129: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/modsupport.c: In function 'va_build_value': Python/modsupport.c:529: warning: assuming signed overflow does not occur when simplifying conditional to constant Python/sysmodule.c: In function 'sys_getframe': Python/sysmodule.c:650: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 Modules/gcmodule.c: In function 'collect': Modules/gcmodule.c:767: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 ./Modules/_sre.c: In function 'sre_match': ./Modules/_sre.c:1002: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1069: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1086: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1143: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1185: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1214: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1238: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1251: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1277: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1291: 
warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1308: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1395: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1408: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c: In function 'sre_umatch': ./Modules/_sre.c:1002: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1069: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1086: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1143: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1185: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1214: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1238: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1251: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1277: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1291: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1308: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1395: warning: assuming signed overflow does not occur when simplifying conditional to constant ./Modules/_sre.c:1408: warning: assuming signed overflow does not occur when simplifying conditional to constant /packages/python-2.5/Modules/stropmodule.c: In function 
'strop_replace': /packages/python-2.5/Modules/stropmodule.c:1102: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/_heapqmodule.c: In function 'heappop': /packages/python-2.5/Modules/_heapqmodule.c:146: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/_hotshot.c: In function 'pack_line_times': /packages/python-2.5/Modules/_hotshot.c:693: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/_hotshot.c: In function 'pack_frame_times': /packages/python-2.5/Modules/_hotshot.c:706: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/binascii.c: In function 'binascii_a2b_base64': /packages/python-2.5/Modules/binascii.c:320: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/binascii.c: In function 'binascii_b2a_uu': /packages/python-2.5/Modules/binascii.c:287: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/parsermodule.c: In function 'validate_subscript': /packages/python-2.5/Modules/parsermodule.c:2811: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/cPickle.c: In function 'Unpickler_noload': /packages/python-2.5/Modules/cPickle.c:193: warning: assuming signed overflow does not occur when simplifying conditional to constant /packages/python-2.5/Modules/cPickle.c:194: warning: assuming signed overflow does not occur when reducing constant in comparison /packages/python-2.5/Modules/cPickle.c:4232: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false /packages/python-2.5/Modules/cPickle.c:194: warning: assuming 
signed overflow does not occur when assuming that (X - c) > X is always false /packages/python-2.5/Modules/cPickle.c: In function 'load': /packages/python-2.5/Modules/cPickle.c:4232: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false /packages/python-2.5/Modules/audioop.c: In function 'audioop_ratecv': /packages/python-2.5/Modules/audioop.c:1150: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2 /packages/python-2.5/Modules/imageop.c: In function 'imageop_dither2grey2': /packages/python-2.5/Modules/imageop.c:430: warning: assuming signed overflow does not occur when simplifying conditional to constant /packages/python-2.5/Modules/_csv.c: In function 'join_append_data': /packages/python-2.5/Modules/_csv.c:969: warning: assuming signed overflow does not occur when simplifying conditional to constant /packages/python-2.5/Modules/expat/xmlparse.c: In function 'getAttributeId': /packages/python-2.5/Modules/expat/xmlparse.c:5337: warning: assuming signed overflow does not occur when simplifying conditional to constant /packages/python-2.5/Modules/dlmodule.c: In function 'dl_call': /packages/python-2.5/Modules/dlmodule.c:103: warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
msg60103 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 17:53
Thanks! Good catch about -Wall. I think I am now able to reproduce these results with gcc 4.2. These results, while much more disturbing regarding the state of our code base, at least restore my faith in GCC. :-)
msg60105 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-18 18:16
I still don't get additional error messages on the trunk. I've altered the configure.in file to include -Wstrict-overflow=5 after -Wall: gcc -pthread -c -fno-strict-aliasing -g -Wall -Wstrict-prototypes -Wstrict-overflow=5 -I. -IInclude -I./Include -DPy_BUILD_CORE -o Objects/listobject.o Objects/listobject.c Either all problems are already solved or I'm doing something wrong here. $ LC_ALL=C gcc-4.2 -v Using built-in specs. Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.2.1 (Ubuntu 4.2.1-5ubuntu4)
msg60107 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 18:27
-Wstrict-overflow=5 is not valid; afaik it's 1-3, with 3 the most verbose. Also, you need a recent gcc 4.3 snapshot for best results; check your distribution for a gcc-snapshot package. About the -Wall thing, it seems to be a gcc bug, but for now the workaround is easy :-) Regards, ismail
msg60109 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 18:50
Btw I think we need an unsigned version of Py_ssize_t to fix this problem cleanly. I am not sure if you would agree with me though.
msg60110 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-18 18:56
I don't think we can make Py_ssize_t unsigned. On several occasions Python uses -1 as error flag or default flag.
msg60111 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 18:59
No I mean we need a new unsigned variant. Else we will have to cast it to unsigned for many overflow cases which is ugly.
msg60114 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 20:47
The proper thing to do here is to add -Werror=strict-overflow to the CFLAGS (before -Wall -- we should fix the position of -Wall!); this will turn all those spots into errors, forcing us to fix them, and alerting users who might be using a newer compiler than we tested with. This should be done in favor of -fwrapv, but only if strict-overflow is supported (which we can find out in the same way as we found out whether -fwrapv is supported). I think in practice this means GCC 4.2 or newer. Can someone come up with a patch?
msg60115 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 20:49
An unsigned variant of Py_ssize_t would just be size_t -- that's a much older type than ssize_t. I don't think we need to invent a Py_ name for it.
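The size_t approach mentioned here can be sketched as follows. Unsigned arithmetic is defined to wrap, so the sum can be computed first and range-checked afterwards. The sketch assumes that size_t can hold twice PTRDIFF_MAX, which is true on common platforms but is not guaranteed by the C standard; `ptrdiff_t` stands in for `Py_ssize_t`, and `add_sizes_unsigned` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add two non-negative signed sizes via size_t.
   Assumes SIZE_MAX >= 2 * PTRDIFF_MAX, so the unsigned sum cannot wrap.
   Returns 0 and stores the sum in *out, or -1 if it exceeds PTRDIFF_MAX. */
static int add_sizes_unsigned(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out)
{
    size_t sum;

    assert(a >= 0 && b >= 0);
    /* Unsigned addition is well defined even when the result is large. */
    sum = (size_t)a + (size_t)b;
    if (sum > (size_t)PTRDIFF_MAX)
        return -1;
    *out = (ptrdiff_t)sum;
    return 0;
}
```

The casts are what the thread calls ugly; the trade-off is that the check is expressed entirely in arithmetic whose behaviour the standard defines.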
msg60116 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 20:58
Replace -fwrapv with -Wstrict-overflow=3 -Werror=strict-overflow when supported. Guido, does this do what you wanted? Regards, ismail
msg60118 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 21:05
Close, I'd like to keep the -fwrapv if -Wstrict-overflow isn't supported. Also, would checking this in mean we can't build with GCC 4.3 until those issues are all fixed?
msg60120 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 21:11
Yes it breaks compilation with gcc 4.3. Fixing these bugs is mostly s/int/unsigned int/. But some parts of the code need Python wisdom :/ New patch attached addressing your comment.
msg60121 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-18 21:16
Would you mind also adding patches for the places you think you can fix, and providing us with a list of places you need help with? I'm hoping that Greg or Christian can help with reviewing these and committing them. Thanks much for your help BTW!
msg60124 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-18 22:24
The -fwrapv doesn't look right. You aren't testing for -fwrapv at all ;)
msg60125 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-18 23:35
First stab at fixing overflows; it regresses the following tests: test_doctests.py test_locale.py test_long.py test_long_future.py test_optparse.py test_pickle.py test_str.py (crash) test_string.py (crash) test_unicode.py (crash) test_userstring.py (crash) test_xpickle.py Not great, but a start.
msg60126 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-01-18 23:40
> Btw I think we need an unsigned version of Py_ssize_t to fix this
> problem cleanly. I am not sure if you would agree with me though.
There is an unsigned version, it's called "size_t".
msg60146 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-19 11:01
Crashes ain't good ;) I suggest that you change only a small portion of files at a time, then make && ./python Lib/test/regrtest.py. Start with the Parser, then move over to AST and the rest of Python/. You may have to remove all pyc and pyo files if you change code related to the Parser, AST and byte code marshaling. I'm not sure if it's required but it's worth a shot. Bytecode mismatches can lead to strange errors.
msg60246 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-19 23:15
I created a git repo for my fixes at http://repo.or.cz/w/pytest.git?a=shortlog;h=overflow-fix . Now, as tiran suggested, I fix one file at a time and make sure nothing regressed. But! Feel free to beat me to it and fix this. I am all new to this and progress might be slow.
msg60254 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-20 01:48
With second patch now python builds without any overflow warnings, no new regressions. Please test and/or review. Only thing left is fixing Modules subdirectory. Thanks.
msg60260 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-20 03:29
Possibly last one before final patch, only Modules/_sre.c left to fix, I appreciate help on that. Please ignore tab problems, I think that can be fixed later on. Thanks.
msg61272 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-20 11:36
Final patch should be complete. Used a trick in _sre.c, instead of i < 0 , I used i + i < i to trick gcc.
msg61286 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-20 13:01
Ismail Donmez wrote:
> Final patch should be complete. Used a trick in _sre.c, instead of i < 0, I used
> i + i < i to trick gcc.

I'm going to review your patch later.

Christian
msg61291 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2008-01-20 13:56
Ismail Donmez wrote:
> Final patch should be complete. Used a trick in _sre.c, instead of i < 0, I used
> i + i < i to trick gcc.
>
> Added file: http://bugs.python.org/file9242/fix-overflows-final.patch

Does the C89 standard allow this code?

    int q = 1;
    int p = (unsigned)q;

I've never seen an unsigned cast without a type. Does the code compile with gcc -std=c89?

Christian
msg61319 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-01-20 17:47
> Does the C89 standard allow this code?
>
> int q = 1;
> int p = (unsigned)q;
>
> I've never seen an unsigned cast without a type.

Yes, that's fine; it's a different spelling of "unsigned int". In C99, 6.7.2p1 defines the following groups as equivalent:

- short, signed short, short int, or signed short int
- unsigned short, or unsigned short int
- int, signed, or signed int
- unsigned, or unsigned int
- long, signed long, long int, or signed long int
- unsigned long, or unsigned long int
- long long, signed long long, long long int, or signed long long int
- unsigned long long, or unsigned long long int

Specifiers may occur in any order, so you may also write "int short unsigned".
msg61320 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-20 17:57
Hi Christian, unsigned cast is actually suggested by GCC developers to force correct wrapping for signed types. And thanks to Martin, it makes sense :-)
msg61757 - (view)	Author: Neal Norwitz (nnorwitz) *	Date: 2008-01-28 02:17
I'm just starting to look at the patch. The first one changes i to unsigned in Modules/_csv.c. Hopefully most of them are like this. The code is fine as it is. There is no reliance on overflow AFAICT. It's just that the breaking condition from the loop is not in the for (...). I think this change is fine to avoid a warning. Just pointing out that in this one case, it's not a real problem.

Change to heapq doesn't seem needed. I looked at the warning generated from this and it's if (!n). This seems to indicate the compiler thinks that n could be negative. That should not be possible. It came from PyList_GET_SIZE. We had verified the object was already a list. So this value should be between 0 and PY_SSIZE_T_MAX. We check for 0, so it must be > 0. After decrementing n, it should be between 0 and PY_SSIZE_T_MAX-1. Of course, the compiler can't know the value can't be negative (or PY_SSIZE_T_MIN), which would cause an underflow.

Change to hotshot should avoid a cast, so it's slightly better with this approach. Although with the change to size_t, the cast in flush_data can be removed (just after the fwrite).

I don't see the need for the change in sre.c, but I'm pretty sure there are other overflows.

audioop definitely looks needed. cPickle looks necessary.

expat/xmlparse.c is interesting--not sure if it's really necessary. In practice this probably can't be reached.

gc can't really overflow given that NUM_GENERATIONS is 3 and not likely to grow much more. :-)

I stopped looking at this point. It looks like some of these are really needed. Others are not possible given other invariants (that the compiler can't know about). I like the idea of silencing compiler warnings. However, I fear this may generate a different problem, namely signed/unsigned mismatches. What happens if you add this warning: -Wsign-compare? I think we got rid of most of those before (probably not true as much for Modules/*.c). I think this introduces many more.

Would it be possible to come up with a patch that doesn't introduce more warnings w/-Wsign-compare? One potential issue with this patch is that while the additions might have guaranteed semantics, we might have other problems when doing:

    size_t value = PyXXX_Size();
    if (value < 0) ...

I'm hoping that we can use both -Wstrict-overflow and -Wsign-conversion and eliminate all warnings, resulting in better code. (You could also try building with g++. The core used to work without warnings. The modules still had a ways to go.)

Also what is the current state? What has been implemented and what else needs to be done? Perhaps we should make other bug report(s) to address other tangents that were discussed in this thread.
msg61758 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 02:28
Thanks for the thorough review! I will add -Wsign-compare and fix new warnings. Btw the current state is that with the patch -fwrapv is not needed and there are no regressions.
msg61759 - (view)	Author: Neal Norwitz (nnorwitz) *	Date: 2008-01-28 02:41
I just added -Wstrict-overflow to the code in sysmodule.c (sys_getframe) and tried with gcc 4.2.1. It doesn't warn. I wonder if 4.3 is more picky or warns when it shouldn't? Unless I changed the code so it doesn't work:

    typedef struct {int ref;} PyObject;
    typedef struct { PyObject* f_back; } PyFrameObject;
    int PyArg_ParseTuple(PyObject*, const char*, int*);

    PyObject* sys_getframe(PyFrameObject *f, PyObject *self, PyObject *args)
    {
        int depth = -1;
        if (!PyArg_ParseTuple(args, "|i:_getframe", &depth))
            return 0;
        while (depth > 0 && f != 0) {
            f = (PyFrameObject*)f->f_back;
            --depth;
        }
        return (PyObject*)f;
    }

Compiled with:

    gcc-4.2.1-glibc-2.3.2/x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu-gcc -Wstrict-overflow -c xx.c

produced no warnings. This is not a stock gcc 4.2.1, so that could also be an issue. Did I run it correctly? Is there anything else I need to do? If you run the code above with gcc 4.3, does it produce a warning?
msg61760 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-28 02:41
Unfortunately I have no time to work on this myself, but in order to make progress I recommend checking in things piecemeal so that the same changes don't get reviewed over and over again. I propose that each submit reference this bug ("Partial fix for issue #1621" I'd suggest) and that the submits are recorded here (e.g. "fixed <filename> in rXXX (2.5.2), rYYY (2.6)"). Then hopefully only a few hard cases will remain.

With Neal, I don't see what the warning in _csv is about. What condition is being turned into a constant? Is the compiler perhaps rearranging the code so as to insert "if (field[0] == '\0') goto XXX;" in front of the for-loop, where XXX jumps into the middle of the condition in the if-statement immediately following the for-loop, and skipping that if-block when breaking out of the loop later? That would be clever, and correct, and I'm not sure if making i unsigned is going to help -- in fact it might make the compiler decide it can't use that optimization...
msg61761 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 02:45
To Neal: can you try with -Wstrict-overflow=3? And yes, I am using gcc 4.3 trunk.

To Guido: I'll check the _csv.c issue closely. Shall I create the new bug reports, or will reviewers do so and CC me maybe?
msg61762 - (view)	Author: Neal Norwitz (nnorwitz) *	Date: 2008-01-28 02:51
On Jan 27, 2008 6:45 PM, Ismail Donmez <report@bugs.python.org> wrote:
>
> Can you try with -Wstrict-overflow=3 , but yes I am using gcc 4.3 trunk.

I just tried with =1, =2, =3, and no =. All the same result: no warning.

Ismail, thanks for going through all this effort. It's very helpful.
msg61763 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-28 02:54
Don't create new bug reports!
msg61765 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 03:02
Neal, I'll try to answer your questions one by one, starting with the _csv.c compiler issue:

Modules/_csv.c:969: warning: assuming signed overflow does not occur when simplifying conditional to constant

There is a check inside the loop like this:

    if (c == '\0')
        break;

If instead we do the check in the for statement:

    + for (i = 0; i < strlen(field) ; i++) {

and remove the if check, the compiler no longer issues a warning, and the csv test also passes. The attached patch implements this optimization.

Guido, you don't have to shout; you know no one pays me per Python bug report I create :)
msg61766 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 03:10
The _sre.c case is the most interesting one; the compiler says:

./Modules/_sre.c:1002: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c: In function 'sre_umatch':
./Modules/_sre.c:1002: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1069: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1086: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1143: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1185: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1214: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1238: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1251: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1277: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1291: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1308: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1395: warning: assuming signed overflow does not occur when simplifying conditional to constant
./Modules/_sre.c:1408: warning: assuming signed overflow does not occur when simplifying conditional to constant

All lines refer to:

    RETURN_ON_ERROR(ret);

My investigation: on line 1002 we have RETURN_ON_ERROR(ret); but ret is already 0 and has not been set to anything, so this check will always be false. Same for line 1069; ret is still zero there. Maybe I am missing something here?
msg61767 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 03:16
For xmlparse.c the compiler says:

Modules/expat/xmlparse.c:5337: warning: assuming signed overflow does not occur when simplifying conditional to constant

It's impossible for j to overflow here due to the name[i] check, but I am not sure what gcc is optimizing here.
msg61768 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-28 03:20
> for (i = 0; i < strlen(field) ; i++) {

This looks inefficient. Why not

    for (i = 0; field[i] != '\0'; i++) {

???
msg61770 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 03:32
Hah, strlen in a loop, a nice beginner mistake, but it's 5:30 AM here so please excuse me. Guido, your version is of course way better. But with that version the compiler again issues

Modules/_csv.c:969: warning: assuming signed overflow does not occur when simplifying conditional to constant

so strlen() just fooled that optimization, it seems. There should be another way to optimize this loop.
msg61774 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-01-28 08:14
> With Neal, I don't see what the warning in _csv is about. What condition
> is being turned into a constant? Is the compiler perhaps rearranging the
> code so as to insert "if (field[0] == '\0') goto XXX;" in front of the
> for-loop where XXX jumps into the middle of the condition in the
> if-statement immediately following the for-loop, and skipping that
> if-block when breaking of the loop later?

Indeed that's what happens. In the case of breaking the loop later, the compiler can skip the if-block only if signed ints never overflow, hence the warning.

Another way of silencing the warning is to test field[0]=='\0' in the if-statement. This might also somewhat pessimize the code, but allows the compiler to eliminate i altogether.
msg61784 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-01-28 17:54
I wonder if it would help making i a Py_ssize_t instead of an int?
msg61785 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 17:54
Neal, you could btw check http://repo.or.cz/w/pytest.git?a=shortlog;h=overflow-fix , which has each fix separate so that reviewing is easy. Just ignore the configure changes; that's for later.

Thanks,
ismail
msg61788 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 17:57
> Guido van Rossum added the comment:
>
> I wonder if it would help making i a Py_ssize_t instead of an int?

gcc still issues the same warning with that.
msg61791 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 18:05
gcc is optimizing the second if check; specifically, the i == 0 test seems to be redundant according to gcc.

    if (i == 0 && quote_empty) {
        if (dialect->quoting == QUOTE_NONE) {
            PyErr_Format(error_obj, "single empty field record must be quoted");
            return -1;
        }
        else
            *quoted = 1;
    }
msg61792 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-01-28 18:09
Moving the empty check before the loop will fix this and possibly optimize empty string handling.
msg62616 - (view)	Author: Ismail Donmez (donmez) *	Date: 2008-02-21 11:03
Any news on this? Also gcc 4.3 & gcc 4.2.3 fixed the -Wall clobbering -Wstrict-overflow problem, which is good news.
msg87689 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-13 15:53
A few comments:

(1) I agree that signed overflows should be avoided where possible.

(2) I think some of the changes in the latest patch (fix-overflows-final.patch) are unnecessary, and decrease readability a bit. An example is the following chunk for the x_divrem function in Objects/longobject.c.

    @@ -1799,7 +1799,7 @@ x_divrem(PyLongObject *v1, PyLongObject *w1, PyLongObject **prem)
         k = size_v - size_w;
         a = _PyLong_New(k + 1);
    -    for (j = size_v; a != NULL && k >= 0; --j, --k) {
    +    for (j = size_v; a != NULL && k >= 0; j = (unsigned)j - 1 , k = (unsigned)k - 1) {
             digit vj = (j >= size_v) ? 0 : v->ob_digit[j];
             twodigits q;
             stwodigits carry = 0;

In this case it's easy to see from the code that j and k will always be nonnegative, so replacing --j with j = (unsigned)j - 1 is unnecessary. (This chunk no longer applies for 2.7 and 3.1, btw, since x_divrem got rewritten recently.) Similar comments apply to the change:

    -    min_gallop -= min_gallop > 1;
    +    if (min_gallop > 1) min_gallop = (unsigned)min_gallop - 1;

in Objects/listobject.c. Here it's even clearer that the cast is unnecessary.

I assume these changes were made to silence warnings from -Wstrict-overflow, but I don't think that should be a goal: I'd suggest only making changes where there's a genuine possibility of overflow (even if it's a remote one), and leaving the code unchanged if it's reasonably easy to see that overflow is impossible.

(3) unicode_hash also needs fixing, as do the lookup algorithms for set and dict. Both use overflowing arithmetic on signed types as a matter of course. Probably a good few of the hash algorithms for the various object types in Objects/ are suspect.
msg87690 - (view)	Author: Gregory P. Smith (gregory.p.smith) *	Date: 2009-05-13 16:01
"""I assume these changes were made to silence warnings from -Wstrict-overflow, but I don't think that should be a goal: I'd suggest only making changes where there's a genuine possibility of overflow (even if it's a remote one), and leaving the code unchanged if it's reasonably easy to see that overflow is impossible."""

There is a lot of value in being able to compile with -Wstrict-overflow and know that every warning emitted is something to be looked at. I think it is advantageous to make all code pass this. Having any "expected" warnings during compilation tends to lead people to ignore all warnings.

That said, I agree those particular examples of unnecessary casts are ugly and should be written differently if they are actually done to prevent a warning.
msg87693 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-13 16:45
> There is a lot of value in being able to compile with -Wstrict-overflow
> and know that every warning emitted is something to be looked at.

I agree in principle; this certainly applies to -Wall. But -Wstrict-overflow doesn't do a particularly good job of finding signed overflow cases: there are a good few false positives, and it doesn't pick up the many cases of actual everyday signed overflow (e.g., in unicode_hash, byteshash, set_lookkey, etc.), so it doesn't seem a particularly good basis for code rewriting.
msg87694 - (view)	Author: Ismail Donmez (donmez) *	Date: 2009-05-13 16:48
You should be using gcc 4.4 to get the best warning behaviour.
msg87704 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-13 20:05
Thanks Ismail. I'm currently using gcc-4.4 with the -ftrapv (not -fwrapv!) option to see how much breaks. (Answer: quite a lot. :-[ )

I'm finding many overflow checks that look like:

    size = Py_SIZE(a) * n;
    if (n && size / n != Py_SIZE(a)) {
        PyErr_SetString(PyExc_OverflowError,
                        "repeated bytes are too long");
        return NULL;
    }

where size and n have type Py_ssize_t. That particular one comes from bytesobject.c (in py3k), but this style of check occurs frequently throughout the source.

Do people think that all these should be fixed?

The fix itself is reasonably straightforward: instead of multiplying and then checking for an overflow that's already happened (and hence has already invoked undefined behaviour according to the standards), get an upper bound for n first by dividing PY_SSIZE_T_MAX by Py_SIZE(a) and use that to do the overflow check before the multiplication. It shouldn't be less efficient: either way involves an integer division, a comparison, and a multiplication.

The hard part is finding all the places that need to be fixed.
msg87707 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2009-05-13 20:38
> I'm finding many overflow checks that look like:
>
> size = Py_SIZE(a) * n;
> if (n && size / n != Py_SIZE(a)) {
>     PyErr_SetString(PyExc_OverflowError,
>                     "repeated bytes are too long");
>     return NULL;
> }
>
> where size and n have type Py_ssize_t. That particular one comes
> from bytesobject.c (in py3k), but this style of check occurs
> frequently throughout the source.
>
> Do people think that all these should be fixed?

If this really invokes undefined behavior already (i.e. a compiler could set "size" to -1, and have the test fail, i.e. not give an exception, and still be conforming), then absolutely yes.

> The fix itself is reasonably straightforward: instead of multiplying
> and then checking for an overflow that's already happened (and hence
> has already invoked undefined behaviour according to the standards),
> get an upper bound for n first by dividing PY_SSIZE_T_MAX
> by Py_SIZE(a) and use that to do the overflow check before
> the multiplication. It shouldn't be less efficient: either way
> involves an integer division, a comparison, and a multiplication.

[and then perform the multiplication unsigned, to silence the warning - right?]

I think there is a second solution: perform the multiplication unsigned in the first place. For unsigned multiplication, IIUC, overflow behavior is guaranteed in standard C (i.e. it's modulo 2**N, where N is the number of value bits for the unsigned value). So the code would change to

    nbytes = (size_t)Py_SIZE(a)*n;
    if (n && (nbytes > PY_SSIZE_T_MAX || nbytes/n != Py_SIZE(a))...
    size = (Py_ssize_t)nbytes;
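A self-contained sketch of the unsigned-multiplication variant might look like this. It is not code from CPython: `ptrdiff_t` and `PTRDIFF_MAX` stand in for `Py_ssize_t` and `PY_SSIZE_T_MAX` so that it builds without Python.h, the function name is hypothetical, and it assumes both inputs are non-negative and that `size_t` is at least as wide as `ptrdiff_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute len * n for non-negative len and n.
 * Returns 0 and stores the product in *out, or returns -1 if the
 * product does not fit in ptrdiff_t (the stand-in for Py_ssize_t). */
static int mul_size_unsigned(ptrdiff_t len, ptrdiff_t n, ptrdiff_t *out)
{
    /* Unsigned multiplication wraps on overflow; it is never undefined. */
    size_t nbytes = (size_t)len * (size_t)n;

    /* If the product wrapped, dividing back by n cannot recover len. */
    if (n != 0 && (nbytes > (size_t)PTRDIFF_MAX ||
                   nbytes / (size_t)n != (size_t)len))
        return -1;

    *out = (ptrdiff_t)nbytes;   /* in range, so the conversion is exact */
    return 0;
}
```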
msg87708 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-13 20:58
> [and then perform the multiplication unsigned, to silence the
> warning - right?]

That wasn't actually what I was thinking: I was proposing to rewrite it as:

    if (Py_SIZE(a) > 0 && n > PY_SSIZE_T_MAX/Py_SIZE(a)) {
        PyErr_SetString(PyExc_OverflowError,
                        "repeated bytes are too long");
        return NULL;
    }
    size = Py_SIZE(a) * n;

The multiplication should be safe from overflow, and I don't get any warning at all either with this rewrite (using -O3 -Wall -Wextra -Wsigned-overflow=5) or from the original code, so there's nothing to silence.

> I think there is a second solution: perform the multiplication
> unsigned in the first place.

That would work too. I find the above code clearer, though. It's not immediately obvious to me that the current overflow condition actually works, even assuming wraparound on overflow; I find myself having to think about the mathematics every time.

In general, it seems to me that the set of places reported by -Wsigned-overflow is a poor match for the set of places that need to be fixed. -Wsigned-overflow only gives a warning when that particular version of gcc, with those particular flags, happens to make use of the no-overflow assumption for some particular optimization. Certainly each of the places reported by -Wsigned-overflow should be investigated, but I don't believe it's worth 'fixing' correct code just to get rid of warnings from this particular warning option.
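The check-before-multiply rewrite can be tried outside CPython with a sketch like the following. The function name is hypothetical, `ptrdiff_t` and `PTRDIFF_MAX` again stand in for `Py_ssize_t` and `PY_SSIZE_T_MAX`, and both inputs are assumed to be non-negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute len * n for non-negative len and n,
 * testing for overflow before any signed multiplication happens.
 * Returns 0 and stores the product in *out, or -1 on overflow. */
static int mul_size_checked(ptrdiff_t len, ptrdiff_t n, ptrdiff_t *out)
{
    /* n > MAX / len  is equivalent to  len * n > MAX  for len > 0. */
    if (len > 0 && n > PTRDIFF_MAX / len)
        return -1;

    *out = len * n;   /* cannot overflow after the check above */
    return 0;
}
```

Unlike a check performed after the multiplication, this one never evaluates an overflowing signed expression, so it stays valid under -ftrapv as well.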
msg87712 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2009-05-13 21:32
> size = Py_SIZE(a) * n;
>
> The multiplication should be safe from overflow, and I don't get
> any warning at all either with this rewrite (using -O3 -Wall -Wextra
> -Wsigned-overflow=5) or from the original code, so there's nothing to
> silence.

This is puzzling, isn't it? It could overflow, could it not?

>> I think there is a second solution: perform the multiplication
>> unsigned in the first place.
>
> That would work too. I find the above code clearer, though.

I agree in this case. In general, I'm not convinced that it is always possible to rewrite the code in that way conveniently.
msg87730 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-14 09:00
> This is puzzling, isn't it?

I don't see why. There's nothing in -Wall -Wextra -Wsigned-overflow that asks for warnings for code that might overflow. Indeed, I don't see how any compiler could reasonably provide such warnings without flagging (almost) every occurrence of arithmetic on signed integers as suspect.[*]

The -ftrapv option is useful for catching genuine signed-integer overflows at runtime, but it can still only catch those cases that actually get exercised (e.g., by the Python test suite).

[*] Even some operations on unsigned integers would have to be flagged: the C expression "(unsigned short)x * (unsigned short)y" also has the potential to invoke undefined behaviour, thanks to C's integer promotion rules.
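The promotion pitfall with unsigned short can be illustrated with a small sketch. It assumes the common case of a 16-bit short and a 32-bit int; the helper name is hypothetical.

```c
/* With a 16-bit unsigned short and a 32-bit int, both operands of
 *     x * y
 * are promoted to (signed) int, so 65535 * 65535 overflows int and is
 * undefined behaviour even though both operands are unsigned types.
 * Hypothetical helper: force the multiplication to happen in unsigned
 * int, where the result is well defined. */
static unsigned mul_u16(unsigned short x, unsigned short y)
{
    return (unsigned)x * (unsigned)y;
}
```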
msg87731 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-05-14 09:50
Aargh! s/Wsigned-overflow/Wstrict-overflow/g Sorry.
msg87908 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2009-05-16 18:40
> I don't see why. There's nothing in -Wall -Wextra -Wsigned-overflow > that asks for warnings for code that might overflow. Ah, right. I misunderstood (rather, didn't bother checking) what -Wsigned-overflow really does.
msg141871 - (view)	Author: deadshort (deadshort)	Date: 2011-08-10 15:47
Since this is still dribbling along I'll point out intobject.c:int_pow() and:

    prev = ix;        /* Save value for overflow check */
    if (iw & 1) {
        ix = ix*temp;
        if (temp == 0)
            break;    /* Avoid ix / 0 */
        if (ix / temp != prev) {
            return PyLong_Type.tp_as_number->nb_power(
                (PyObject *)v, (PyObject *)w, (PyObject *)z);
        }
    }

which I misclassified in http://bugs.python.org/issue12701
msg144490 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-09-24 08:24
Resetting versions.
msg144492 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-09-24 08:27
Clang has an -ftrapv option that seems to be less buggy and more complete than gcc's. Compiling and running the test suite with that option enabled looks like a good way to catch a lot of these signed overflows.
msg144499 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2011-09-24 14:48
Do we consider these overflows as bugs in Python, or do we declare that we expect compilers to "do the right thing" for the bug fix releases (i.e. care only about the default branch)? I'd personally vote for the latter - i.e. don't apply any of the resulting changes to the maintenance releases, and target the issue only for 3.3. Realistically, a compiler that invokes truly undefined behavior for signed overflow has no chance of getting 3.2 compiled correctly, and we have no chance of finding all these issues within the lifetime of 3.2. If that is agreed, I would start committing changes that fix the issues Mark already discussed in 2009 (unless he wants to commit them himself).
msg144501 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-09-24 15:57
> don't apply any of the resulting changes to the maintenance releases, > and target the issue only for 3.3. That sounds fine to me, though if we find more instances of signed overflow that actually trigger test failures in the maintenance branches (like the int_pow one) on mainstream compilers, we might want to fix those there too, on a case-by-case basis. To get started, here's a patch that fixes occurrences of signed overflow in the bytes, str and tuple hash methods, and also in set lookups. It also fixes a related and minor casting inconsistency in dictobject.c (which was using (size_t)hash & mask in some places, and just 'hash & mask' in others). These are the minimal changes required to get Python to build completely using Clang with '-ftrapv' turned on and --with-pydebug enabled.
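As a rough illustration of the kind of change involved in the hash methods, a multiplicative hash can keep its accumulator in an unsigned type so that the repeated multiplications wrap instead of overflowing a signed value. This is only a sketch of the general technique, not the code from the patch; the function name and the exact mixing steps are assumptions.

```c
#include <stddef.h>

/* Hypothetical sketch: multiplicative byte-string hash with the
 * accumulator held in size_t. Every intermediate result wraps modulo
 * 2^N by definition, so no signed overflow can occur. */
static size_t sketch_bytes_hash(const unsigned char *data, size_t len)
{
    size_t x = 0;
    for (size_t i = 0; i < len; i++)
        x = (1000003u * x) ^ data[i];   /* multiply, then mix in a byte */
    x ^= len;                           /* fold the length into the hash */
    return x;
}
```

A caller that needs a signed hash value would convert the result once at the end, rather than doing the arithmetic in the signed type.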
msg144503 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-09-24 17:19
New changeset 698fa089ce70 by Mark Dickinson in branch 'default': Issue #1621: Fix undefined behaviour in bytes.__hash__, str.__hash__, tuple.__hash__, frozenset.__hash__ and set indexing operations. http://hg.python.org/cpython/rev/698fa089ce70
msg144505 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-09-24 18:12
New changeset 5e456e1a9e8c by Mark Dickinson in branch 'default': Issue #1621: Fix undefined behaviour from signed overflow in get_integer (stringlib/formatter.h) http://hg.python.org/cpython/rev/5e456e1a9e8c
msg144524 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-09-25 14:35
New changeset 3fb9464f9b02 by Mark Dickinson in branch 'default': Issue #1621: Fix undefined behaviour from signed overflow in datetime module hashes, array and list iterations, and get_integer (stringlib/string_format.h) http://hg.python.org/cpython/rev/3fb9464f9b02
msg147958 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-11-19 17:10
See also issue #9530.
msg213729 - (view)	Author: Jeffrey Walton (Jeffrey.Walton) *	Date: 2014-03-16 15:07
Also see http://bugs.python.org/issue20944 for suggestions to identify the offending code.
msg270084 - (view)	Author: Antti Haapala (ztane) *	Date: 2016-07-10 13:05
One common case where signed integer overflow has been assumed has been the wraparound/overflow checks like in http://bugs.python.org/issue27473 I propose that such commonly erroneous tasks such as overflow checks be implemented as common macros in CPython as getting them right is not quite easy (http://c-faq.com/misc/sd26.html); it would also make the C code more self-documenting. Thus instead of writing if (va.len > PY_SSIZE_T_MAX - vb.len) { one would write something like if (PY_SSIZE_T_SUM_OVERFLOWS(va.len, vb.len)) { and the mere fact that such a macro wasn't used there would signal about possible problems with the comparison.
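A macro along the lines proposed above could be sketched as follows. The names are hypothetical and are not existing CPython macros; `PTRDIFF_MAX` and `PTRDIFF_MIN` stand in for `PY_SSIZE_T_MAX` and `PY_SSIZE_T_MIN` so the sketch builds without Python.h. This version handles operands of either sign, at the cost of evaluating its arguments more than once.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PY_SSIZE_T_MAX / PY_SSIZE_T_MIN (assumption). */
#define SKETCH_SSIZE_MAX PTRDIFF_MAX
#define SKETCH_SSIZE_MIN PTRDIFF_MIN

/* Hypothetical macro: true if a + b would not fit in the signed size
 * type. Only subtractions that cannot themselves overflow are used.
 * Note: a and b are evaluated more than once. */
#define SKETCH_SSIZE_SUM_OVERFLOWS(a, b)                    \
    (((b) > 0 && (a) > SKETCH_SSIZE_MAX - (b)) ||           \
     ((b) < 0 && (a) < SKETCH_SSIZE_MIN - (b)))
```

With such a macro the check reads `if (SKETCH_SSIZE_SUM_OVERFLOWS(va.len, vb.len))`, which documents the intent at the call site.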
msg270460 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-15 02:53
Inspired by Issue 27473, I did a quick and dirty scan for obvious places that expect overflow to wrap, and found the following, which I think should be fixed:

Modules/_ctypes/_ctypes.c:1388, in PyCArrayType_new()
Objects/listobject.c:492, in list_concat()
Objects/tupleobject.c:457, in tupleconcat()
Objects/listobject.c:845, in listextend()

Also I played with enabling GCC’s -ftrapv option. Attached is a patch with three changes:

1. configure --with-pydebug enables -ftrapv (experimental, not sure everyone would want this)

2. Easy fix for negation overflow in audioop (I am happy to apply this part)

3. Avoid dumb overflows at end of for loop in Element Tree code when handling slices with step=sys.maxsize. Technically the overflow is undefined behaviour, but the change is annoying, because ignoring the overflow at the end of the loop is much simpler than adding special cases. Not sure what I think about this part.
msg270461 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-15 02:57
I added Issue 13312 as a dependency; there is currently a test for a negative year that relies on overflow handling.

Here is a patch where I tried to fix overflow detection for huge set objects.
msg270463 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-07-15 03:48
The audioop part LGTM. If this case was found with the help of -ftrapv, I'm for adding this option in a debug build.
msg270562 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-16 15:16
I tried the newer -fsanitize=undefined mode, and it is better than -ftrapv. It adds instrumentation that by default nicely reports the errors and continues running.

My problem with the large slice step is not restricted to Element Tree; it affects list objects too:

    >>> "abcdef"[3::sys.maxsize]
    Objects/unicodeobject.c:13794:55: runtime error: signed integer overflow: 3 + 9223372036854775807 cannot be represented in type 'long int'
    'd'

Regarding Antti’s overflow macros, I noticed there is already a macro _PyTime_check_mul_overflow() in Python/pytime.c which does that kind of thing. Maybe it could help, though I am not sure. Has this sort of thing been done in other projects? We might need to be careful about the sign, e.g. clarify the macro is only for positive values, add an assertion, or handle both positive and negative.
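One way to walk a slice with a huge positive step without ever forming an out-of-range index is to compare the step against the remaining distance before adding it. The sketch below is only an illustration of that idea, not the approach taken in any attached patch; the function name is hypothetical, and it assumes 0 <= start, 0 <= stop and step > 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the indices start, start+step, ... that
 * are below stop. Assumes start >= 0, stop >= 0 and step > 0. */
static size_t count_slice_items(ptrdiff_t start, ptrdiff_t stop, ptrdiff_t step)
{
    size_t count = 0;
    ptrdiff_t cur = start;

    while (cur < stop) {
        count++;
        /* stop - cur is positive here and cannot overflow. If the step
         * is larger than the remaining distance, the next index would
         * pass stop (and cur + step might overflow), so stop now. */
        if (step > stop - cur)
            break;
        cur += step;
    }
    return count;
}
```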
msg270580 - (view)	Author: Jeffrey Walton (Jeffrey.Walton) *	Date: 2016-07-16 19:03
> Has this sort of thing been done in other projects?

Yes. If you are using C, you can use safe_iop. Android uses it for safer integer operations.

If you are using C++, you can use David LeBlanc's SafeInt class. Microsoft uses it for safer integer operations.

Jeff
msg270582 - (view)	Author: Antti Haapala (ztane) *	Date: 2016-07-16 19:14
Gnulib portability library has https://www.gnu.org/software/gnulib/manual/html_node/Integer-Range-Overflow.html and https://www.gnu.org/software/gnulib/manual/html_node/Integer-Type-Overflow.html and even macros for producing well-defined integer wraparound for signed integers: https://www.gnu.org/software/gnulib/manual/html_node/Wraparound-Arithmetic.html

That code is under GPL but I believe there is no problem if someone just looks into that for ideas on how to write similar macros.
msg270807 - (view)	Author: Roundup Robot (python-dev)	Date: 2016-07-19 03:09
New changeset d6a86018ab33 by Martin Panter in branch 'default': Issue #1621: Avoid signed int negation overflow in audioop https://hg.python.org/cpython/rev/d6a86018ab33
msg270809 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-19 03:44
I committed the fix for negation in audioop. slice-step.patch includes a better fix for the remaining part of trapv.patch, with Element Tree slicing. I think this fix is much less intrusive, and I have copied it to other places that handle slicing, and added corresponding test cases. The undefined behaviour sanitizer produces lots of errors about bit shifting signed integers in low-level modules like ctypes, struct, audioop. Typically this is for code converting signed integers to and from bytes, and big/little-endian conversions. This is technically undefined behaviour, but I think it may be less serious than the other overflows with traditional arithmetic like addition and multiplication. E.g. GCC explicitly documents <https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html> that this is handled as expected with twos-complement, so with GCC there should be no nasty surprises with optimizing out undefined behaviour. My set-overflow.patch would also be in this boat.
msg270823 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-07-19 12:57
Does this work with negative steps?
msg270827 - (view)	Author: Antti Haapala (ztane) *	Date: 2016-07-19 14:06
About shifts, according to C standard, right shifts >> of negative numbers are implementation-defined: "[in E1 >> E2], If E1 has a signed type and a negative value, the resulting value is implementation-defined." In K&R this meant that the number can be either zero-extended or sign-extended. In any case it cannot lead to undefined behaviour, but the implementation must document what is happening there. Now, GCC says that >> is always arithmetic/sign-extended. This is the implementation-defined behaviour, now GCC has defined what its implementation will do, but some implementation can choose zero-extension. (unlikely) As for the other part as it says "GCC does not use the latitude given in C99 only to treat certain aspects of signed ‘<<’ as undefined". But GCC6 will diagnose that now with -Wextra, and thus it changed already, as with -Werror -Wextra the code doesn't necessarily compile any longer, which is fine. Note that this "certain -- only" refers to that section where the C99 and C11 explicitly say that the behaviour is undefined and C89 doesn't say anything. It could as well be argued that in C89 it is undefined by omission. Additionally all shifts that shift by more than or equal to the width still have undefined behaviour (as do shifts by negative amount). IIRC they work differently on ARM vs x86: in x86 the shift can be mod 32 on 386, and in ARM, mod 256.
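As a rough illustration of how the undefined left-shift cases can be sidestepped, the sketch below does the shifting in an unsigned type. Both function names are hypothetical and are not taken from the ctypes, struct or audioop sources; the shift count still has to be smaller than the type width.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a big-endian 32-bit value. Casting each byte to uint32_t before
   shifting keeps the shift in unsigned arithmetic; without the cast the
   byte is promoted to a signed int, and 255 << 24 does not fit in it. */
static uint32_t
load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Shift a possibly negative value left by doing the shift in unsigned
   arithmetic. Converting an out-of-range result back to int32_t is
   implementation-defined rather than undefined. */
static int32_t
shl_i32(int32_t value, unsigned int count)
{
    assert(count < 32);
    return (int32_t)((uint32_t)value << count);
}
```

On two's-complement implementations such as GCC and Clang, the conversion back to int32_t yields the expected wrapped value.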
msg270869 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-20 12:48
Serhiy: slice-step.patch seems to be fine with negative slice steps. The actual indexes are still positive, and “unsigned” arithmetic is really modular arithmetic, so when you add the negative “unsigned” step value, it decrements the index properly.

Antti: if you use the sanitizer, (almost?) all the shift errors are for left shifts, either of a positive signed overflow, or a negative value. There is a bit more discussion of bit shift errors in Issue 20932. Examples:

Modules/audioop.c:1527:43: runtime error: left shift of negative value -24
Objects/unicodeobject.c:5152:29: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'

I didn’t see any sanitizer reports about right shifts; perhaps it doesn’t report those (being implementation-defined, rather than undefined, behaviour). And the only report about an excessive shift size is due to a known bug in ctypes, Issue 15119.
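The modular-arithmetic argument for negative steps can be sketched as follows. This is a hypothetical stand-in for a slicing loop, not the code from slice-step.patch; the function name and signature are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a slicing loop: copy `count` items from `src`,
   starting at index `start` and moving by `step`, which may be negative
   or as large as PTRDIFF_MAX. The caller guarantees that every index
   actually read is within bounds. */
static void
copy_slice(const int *src, ptrdiff_t start, ptrdiff_t step,
           size_t count, int *dst)
{
    /* The cursor is unsigned, so the final "cur += step" after the last
       item wraps around instead of overflowing a signed integer. Adding
       (size_t)step for a negative step decrements the index modulo
       SIZE_MAX + 1. */
    size_t cur = (size_t)start;
    for (size_t i = 0; i < count; i++, cur += (size_t)step) {
        dst[i] = src[cur];
    }
}
```

The wrapped cursor value is never used as an index, because the loop condition depends only on the item count.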
msg270977 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-07-22 08:12
@Martin, I attached a patch to fix the overflow checks you mentioned in the tuple and list objects.
msg271057 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-23 04:35
Apart from the empty “if” statement style (see review), tuple_and_list.patch looks good to me.

I understand the patches from 2011 and earlier have all been committed (correct me if I missed something).

Here is another patch fixing a 64-bit overflow in _thread, detected by the test_timeout() method in test_threading:

Modules/_threadmodule.c:59:17: runtime error: signed integer overflow: 6236628528058 + 9223372036000000000 cannot be represented in type 'long int'
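One way to avoid this kind of overflow when adding a timeout to a clock reading is to clamp the sum. The sketch below is an illustration only: the helper name is hypothetical, it is not necessarily what thread.patch does, and it assumes both values are non-negative tick counts held in int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute now + timeout, clamping at INT64_MAX
   instead of overflowing. Both arguments are assumed non-negative. */
static int64_t
deadline_after(int64_t now, int64_t timeout)
{
    assert(now >= 0 && timeout >= 0);
    if (timeout > INT64_MAX - now) {
        return INT64_MAX;   /* effectively "wait forever" */
    }
    return now + timeout;
}
```

The comparison is rearranged so that the subtraction INT64_MAX - now cannot overflow, and the addition is only performed once it is known to fit.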
msg271058 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-23 05:09
Perhaps we should add a test for the __length_hint__() overflow to tuple_and_list.patch:

>>> a = [1,2,3,4]
>>> import sys
>>> class B:
...     def __iter__(s): return s
...     def __next__(s): raise StopIteration()
...     def __length_hint__(s): return sys.maxsize
...
>>> a.extend(B())
Objects/listobject.c:844:8: runtime error: signed integer overflow: 4 + 2147483647 cannot be represented in type 'int'

array-size.patch fixes the ctypes array size overflow (including a test).
msg271084 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-07-23 15:06
I changed tuple_and_list.patch to use empty curly braces. I didn't add the test for __length_hint__. According to the comment, when overflow happens, it is either ignored or a MemoryError will finally be raised. I am not willing to test a MemoryError in this case.

BTW, how do you get the error?
msg271118 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-24 00:10
The error message comes from Undefined Behaviour Sanitizer, which was added to newer versions of GCC and Clang. Currently I am compiling with

./configure --with-pydebug CC="gcc -fsanitize=undefined -fno-sanitize=alignment -fno-sanitize=shift"

https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fsanitize_003dundefined-962

I thought it is worth adding a test for the impossible __length_hint__() value. Since the test iterator returns no elements, there will not be a MemoryError, but if overflow detection is enabled (such as UB Sanitizer or -ftrapv), it is guaranteed to exercise the overflow path and would be detected.
msg271136 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-07-24 07:45
It's cool, but I ran into one problem when writing tests. support.captured_stderr cannot capture the runtime error message. So there is nothing to assert on, and the test will always succeed even when overflow does happen (the message will still be output). To solve this problem, we would have to redirect I/O at the file descriptor level. I wonder whether it is worth that?
msg271143 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-24 12:08
Xiang: I don’t think we need to make the tests do anything special. Just make sure they exercise the code that handles overflows. I have been running the test suite without any -j0 option, and I can look over the output and see the error messages. Or if we get to a stage where all the errors are eliminated, you could run with UBSAN_OPTIONS=halt_on_error=1. E.g. in this patch, I add two simple tests to cover parts of the code that weren’t covered before (and if I hadn’t fixed the overflows, the tests would trigger extra UBSAN errors).

ctypes_v2.patch is an update of array-size.patch. I improved the test case, and added a new fix for overflows like the following:

>>> class S(ctypes.Structure):
...     _fields_ = (("field", ctypes.c_longlong, 64),)
...
>>> s = S()
>>> s.field = 3
Modules/_ctypes/cfield.c:900:9: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long long int'
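The reported "-9223372036854775808 - 1" is consistent with a full-width bit mask being built in the signed field type, though that reading is an assumption and the sketch below is not the cfield.c code. It shows a hypothetical helper that builds the same kind of mask in an unsigned type, where no step can overflow:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: a mask with the low `nbits` bits set, for nbits
   between 1 and the width of unsigned long long. Working in an unsigned
   type avoids the signed "minimum value minus one" step, and shifting
   down from all-ones avoids a shift by the full width when nbits equals
   the width. */
static unsigned long long
low_bits_mask(unsigned int nbits)
{
    const unsigned int width = sizeof(unsigned long long) * CHAR_BIT;
    assert(nbits >= 1 && nbits <= width);
    return ~0ULL >> (width - nbits);
}
```

On platforms where unsigned long long is 64 bits wide, low_bits_mask(64) gives the all-ones mask needed for a 64-bit field.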
msg271144 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-07-24 12:12
unicode.patch avoids an overflow in PyUnicode_Join():

>>> size = int(sys.maxsize**0.5) + 1
>>> "".join(("A" * size,) * size)
Objects/unicodeobject.c:9927:12: runtime error: signed integer overflow: 46341 + 2147441940 cannot be represented in type 'int'
OverflowError: join() result is too long for a Python string
msg271150 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-07-24 13:33
It turns out you just want to see the output. That is easy. Patch v3 adding the test.
msg271223 - (view)	Author: Roundup Robot (python-dev)	Date: 2016-07-25 03:44
New changeset db93af6080e7 by Martin Panter in branch 'default': Issue #1621: Avoid signed overflow in list and tuple operations https://hg.python.org/cpython/rev/db93af6080e7
msg271735 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-07-31 12:19
Martin, I upload a patch to fix another possible overflow in listextend.
msg271759 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-08-01 01:26
overflow_fix_in_listextend.patch: I doubt Python supports the kinds of platform where this overflow would be possible. It may require pointers smaller than 32 bits, or char objects larger than 8 bits. Perhaps we could just add a comment explaining we assume the overflow cannot happen.

It seems list objects will hold one pointer for each element, but the overflow involves the number of elements. Python defines PY_SSIZE_T_MAX as PY_SIZE_MAX // 2. For the overflow to happen we would need

m + n > PY_SSIZE_T_MAX

Assuming a “flat” address space that can allocate up to PY_SIZE_MAX bytes _in total_, the total number of elements cannot exceed

m + n == PY_SIZE_MAX // sizeof (PyObject *)

So in this scenario, the overflow cannot happen unless sizeof (PyObject *) == 1.

Considering things like the 16-bit segmented Intel “large” memory model (which I doubt Python is compatible with), each list could _independently_ be up to PY_SIZE_MAX bytes. Therefore the total number of elements may reach

m + n == PY_SIZE_MAX // sizeof (PyObject *) * 2

So even in this case, sizeof (PyObject *) == 4 (large model) is fine, but anything less (e.g. 16-bit char, or 1-byte segment + 2-byte offset) might overflow.
msg271763 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-08-01 02:51
Hmm, I don't tend to infer platform characteristics. IMHO, it's a simple problem: summing two lists' lengths may, logically, overflow. With your argument, does it mean it is somewhat meaningless for a list to have a Py_ssize_t length, since it can never reach it? Checks against PY_SSIZE_T_MAX already exist (for example, in ins1).
msg271766 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-08-01 04:25
The check in ins1() was originally added in revision b9002da46f69. I presume it references the Python-dev thread “can this overflow (list insertion)?” <20000812145155.A7629@ActiveState.com>, <https://marc.info/?l=python-dev&m=107666472818169>. At that time, ob_size was an int, so overflow checking was definitely needed. Later, revision 7fdc639bc5b4 changed ob_size to Py_ssize_t, and then revision 606818c33e50 updated the overflow check from INT_MAX to PY_SSIZE_T_MAX. BTW I made a small mistake in my previous message. The worst case would be extending a list with itself. But I think the conclusion is still the same.
msg271768 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-08-01 04:47
So these checks are superfluous? Do we need to remove them? Hmm, I still doubt such checks should consider platform characteristics first. In theory List can be PY_SSIZE_T_MAX length. Do we have to put the PY_SIZE_MAX // sizeof(PyObject *) limit on it?
msg271769 - (view)	Author: Antti Haapala (ztane) *	Date: 2016-08-01 06:18
I don't believe Python would really ever work on a platform with non-8-bit-bytes, I believe there are way too many assumptions everywhere. You can program in C on such a platform, yes, but not that sure about Python.

And on 8-bit-byte platforms, there is no large model with 16-bit pointers anywhere. There just are not enough bits that you could have multiple 64k byte-addressable segments that are addressed with 16-bit pointers.

It might be that some obscure platform in the past would have had 128k memory, with large pointers, 2 allocatable 64k segments, >16 bit char pointer and 16-bit object pointers pointing to even bytes, but I doubt you'd be really porting Python 3 to such a platform, basically we're talking about something like Commodore 128 here.
msg271800 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-08-02 03:20
Looking over r60793, the overflow check at Modules/cjkcodecs/multibytecodec.c:836 looks vulnerable to being optimized away, because it can only detect the overflow if the line above has already overflowed. Perhaps change PY_SSIZE_T_MAX to MAXDECPENDING. I wonder if any of the GCC optimization and warning modes can detect this case? Also, Python/ast.c:3988 checks using PY_SIZE_MAX, but then passes the value to PyBytes_FromStringAndSize(), which expects ssize_t and in the best case would raise SystemError.
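The general shape of a check that cannot be optimized away is to test the operands before adding them, rather than inspecting the sum afterwards. The sketch below is a hypothetical illustration and not the multibytecodec code; it assumes both operands and the limit are non-negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: store a + b in *sum and return 0 if the sum
   stays within `limit`; return -1 otherwise. Assumes a, b and limit are
   all non-negative. */
static int
add_within_limit(ptrdiff_t a, ptrdiff_t b, ptrdiff_t limit, ptrdiff_t *sum)
{
    assert(a >= 0 && b >= 0 && limit >= 0);
    /* Fragile form:  if (a + b > limit || a + b < a) ...
       The second test can only be true after the addition has already
       overflowed, so a compiler may assume it is always false. */
    if (a > limit || b > limit - a) {
        return -1;
    }
    *sum = a + b;
    return 0;
}
```

Because a <= limit is established first, the subtraction limit - a stays in range, and the addition runs only when the result is known to fit.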
msg271803 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-08-02 05:10
I agree. For multibytecodec, how about switching the positions of the two checks? If npendings + ctx->pendingsize overflows, the result can be anything: larger than, smaller than or equal to MAXDECPENDING. For ast, although a SystemError may be raised, the message does not make the reason obvious.
msg272077 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-08-06 00:59
Xiang: regarding your overflow_fix_in_listextend.patch, what do you think about adding a comment or debugging assertion instead, something like:

/* It should not be possible to allocate a list large enough to cause
   an overflow on any relevant platform */
assert(m < PY_SSIZE_T_MAX - n);
msg272078 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-08-06 03:18
It's good Martin. Just commit it.
msg285466 - (view)	Author: Roundup Robot (python-dev)	Date: 2017-01-14 07:25
New changeset dd2c7d497878 by Martin Panter in branch '3.5': Issues #1621, #29145: Test for str.join() overflow https://hg.python.org/cpython/rev/dd2c7d497878 New changeset eb6eafafdb44 by Martin Panter in branch 'default': Issue #1621: Overflow should not be possible in listextend() https://hg.python.org/cpython/rev/eb6eafafdb44
msg303330 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-09-29 13:32
Martin, do you mind to create a PR for your ctypes_v2.patch?
msg317087 - (view)	Author: Martin Panter (martin.panter) *	Date: 2018-05-19 01:33
Sorry I haven’t made a PR for ctypes_v2.patch, but I don’t mind if someone else takes over. I understand the HAVE_LONG_LONG check may no longer be necessary for newer Python versions.
msg325091 - (view)	Author: miss-islington (miss-islington)	Date: 2018-09-11 23:18
New changeset 6c7d67ce83a62b5f0fe5c53a6df602827451bf7f by Miss Islington (bot) (Sergey Fedoseev) in branch 'master': bpo-1621: Avoid signed integer overflow in set_table_resize(). (GH-9059) https://github.com/python/cpython/commit/6c7d67ce83a62b5f0fe5c53a6df602827451bf7f
msg325105 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-09-12 00:26
> newsize <<= 1; // The largest possible value is PY_SSIZE_T_MAX + 1.

Previously, there was an explicit check for the error, raising PyErr_NoMemory() on overflow. Now you rely on PyMem_Malloc() to detect the overflow. I'm not sure that it's a good idea.
msg325109 - (view)	Author: Jeffrey Walton (Jeffrey.Walton) *	Date: 2018-09-12 00:53
On Tue, Sep 11, 2018 at 8:26 PM, STINNER Victor <report@bugs.python.org> wrote:
>
> STINNER Victor <vstinner@redhat.com> added the comment:
>
>> newsize <<= 1; // The largest possible value is PY_SSIZE_T_MAX + 1.
>
> Previously, there was a explicitly check for error raising PyErr_NoMemory() on overflow. Now you rely on PyMem_Malloc() to detect the overflow. I'm not sure that it's a good idea.

+1. It will probably work as expected on Solaris and other OSes that don't oversubscribe memory. It will probably fail in unexpected ways on Linux when the allocation succeeds but then the OOM killer hits a random process.

Jeff
msg325113 - (view)	Author: Sergey Fedoseev (sir-sigurd) *	Date: 2018-09-12 02:18
> Now you rely on PyMem_Malloc() to detect the overflow. Now overflow is not possible or am I missing something?
msg325128 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-09-12 07:52
I asked if there is an issue. In fact, all Python memory allocators start by checking if the size is larger than PY_SSIZE_T_MAX. Example:

void *
PyMem_RawMalloc(size_t size)
{
    /*
     * Limit ourselves to PY_SSIZE_T_MAX bytes to prevent security holes.
     * Most python internals blindly use a signed Py_ssize_t to track
     * things without checking for overflows or negatives.
     * As size_t is unsigned, checking for size < 0 is not required.
     */
    if (size > (size_t)PY_SSIZE_T_MAX)
        return NULL;
    return _PyMem_Raw.malloc(_PyMem_Raw.ctx, size);
}
msg325284 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-09-13 18:54
Benjamin: what do you think of adding an explicit check after the "new_size <<= 1;" loop?

if (new_size > (size_t)PY_SSIZE_T_MAX) {
    PyErr_NoMemory();
    return -1;
}

Technically, PyMem_Malloc() already implements the check, so it's not really needed. So I'm not sure if it's needed :-)
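The doubling loop and the explicit check discussed here could look roughly like the following. This is a hypothetical sketch, not the set_table_resize() source: the function name and MIN_TABLE_SIZE are assumptions, PTRDIFF_MAX stands in for PY_SSIZE_T_MAX, and the error is reported by return value instead of PyErr_NoMemory().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_TABLE_SIZE 8   /* assumed starting size for this sketch */

/* Hypothetical sketch: find the smallest power of two (at least
   MIN_TABLE_SIZE) strictly greater than `minused`. Returns 0 on success,
   -1 if the size would exceed PTRDIFF_MAX. */
static int
next_table_size(ptrdiff_t minused, size_t *out)
{
    assert(minused >= 0);
    size_t newsize = MIN_TABLE_SIZE;
    /* newsize is unsigned and minused <= PTRDIFF_MAX, so the loop stops at
       PTRDIFF_MAX + 1 at the latest (assuming size_t can hold that value,
       as on common platforms); the shift never wraps. */
    while (newsize <= (size_t)minused) {
        newsize <<= 1;
    }
    if (newsize > (size_t)PTRDIFF_MAX) {
        return -1;   /* the explicit check suggested above */
    }
    *out = newsize;
    return 0;
}
```

With the explicit check, the caller gets a definite error before any allocation is attempted, instead of depending on the allocator to reject the oversized request.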
msg328068 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-10-19 22:48
New changeset a9274f7b3f69519f0746c50f85a68abd926ebe7b by Victor Stinner (Miss Islington (bot)) in branch '3.6': bpo-1621: Avoid signed integer overflow in set_table_resize(). (GH-9059) (GH-9199) https://github.com/python/cpython/commit/a9274f7b3f69519f0746c50f85a68abd926ebe7b
msg328069 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-10-19 22:50
New changeset 6665802549006eb50c1a68c3489ee3aaf81d0c8e by Victor Stinner (Miss Islington (bot)) in branch '3.7': bpo-1621: Avoid signed integer overflow in set_table_resize() (GH-9059) (GH-9198) https://github.com/python/cpython/commit/6665802549006eb50c1a68c3489ee3aaf81d0c8e
msg328070 - (view)	Author: STINNER Victor (vstinner) *	Date: 2018-10-19 22:55
Thank you very much to the task force who worked on these issues, which can be seen as boring and useless, but are very important nowadays with C compilers which are more and more aggressive to optimize everything (I'm looking at you clang!).

This bug has been open for 11 years and dozens and dozens of undefined behaviours have been fixed in the meanwhile.

This bug is a giant beast with many patches and many pull requests. I dislike such bugs, it's very hard to follow them.

I suggest opening new bugs for undefined behaviour on specific functions, rather than a very vague "let's open a single bug to track everything".

It's now time to close the issue.

History
Date	User	Action	Args
2022-04-11 14:56:28	admin	set	github: 45962
2018-10-19 22:55:35	vstinner	set	status: open -> closed resolution: fixed messages: + msg328070 stage: patch review -> resolved
2018-10-19 22:50:38	vstinner	set	messages: + msg328069
2018-10-19 22:48:50	vstinner	set	messages: + msg328068
2018-09-13 18:54:21	vstinner	set	messages: + msg325284
2018-09-13 16:48:01	benjamin.peterson	set	pull_requests: + pull_request8693
2018-09-13 16:48:00	benjamin.peterson	set	pull_requests: + pull_request8692
2018-09-12 07:52:57	vstinner	set	messages: + msg325128
2018-09-12 02:18:31	sir-sigurd	set	nosy: + sir-sigurd messages: + msg325113
2018-09-12 00:53:34	Jeffrey.Walton	set	messages: + msg325109
2018-09-12 00:26:22	vstinner	set	messages: + msg325105
2018-09-11 23:18:25	miss-islington	set	pull_requests: + pull_request8634
2018-09-11 23:18:13	miss-islington	set	pull_requests: + pull_request8633
2018-09-11 23:18:04	miss-islington	set	nosy: + miss-islington messages: + msg325091
2018-09-04 10:46:26	sir-sigurd	set	pull_requests: + pull_request8519
2018-08-25 00:15:33	gregory.p.smith	set	versions: + Python 3.7, Python 3.8, - Python 3.3
2018-05-19 01:33:22	martin.panter	set	messages: + msg317087
2017-09-29 13:33:05	serhiy.storchaka	link	issue31637 superseder
2017-09-29 13:32:44	serhiy.storchaka	set	messages: + msg303330
2017-01-14 07:25:17	python-dev	set	messages: + msg285466
2017-01-09 04:24:18	martin.panter	set	dependencies: + failing overflow checks in replace_*
2016-08-06 03:18:08	xiang.zhang	set	messages: + msg272078
2016-08-06 00:59:03	martin.panter	set	messages: + msg272077
2016-08-02 10:35:06	christian.heimes	set	nosy: - christian.heimes
2016-08-02 05:10:57	xiang.zhang	set	messages: + msg271803
2016-08-02 03:20:51	martin.panter	set	messages: + msg271800
2016-08-01 06:18:05	ztane	set	messages: + msg271769
2016-08-01 04:47:54	xiang.zhang	set	messages: + msg271768
2016-08-01 04:25:54	martin.panter	set	messages: + msg271766
2016-08-01 03:22:57	gregory.p.smith	set	nosy: - gregory.p.smith
2016-08-01 02:51:50	xiang.zhang	set	messages: + msg271763
2016-08-01 01:26:13	martin.panter	set	messages: + msg271759
2016-07-31 12:19:43	xiang.zhang	set	files: + overflow_fix_in_listextend.patch messages: + msg271735
2016-07-25 03:44:28	python-dev	set	messages: + msg271223
2016-07-24 13:33:30	xiang.zhang	set	files: + tuple_and_list_v3.patch messages: + msg271150
2016-07-24 12:12:53	martin.panter	set	files: + unicode.patch messages: + msg271144
2016-07-24 12:08:27	martin.panter	set	files: + ctypes_v2.patch messages: + msg271143
2016-07-24 07:45:44	xiang.zhang	set	messages: + msg271136
2016-07-24 00:10:29	martin.panter	set	messages: + msg271118
2016-07-23 15:06:50	xiang.zhang	set	files: + tuple_and_list_v2.patch messages: + msg271084
2016-07-23 05:17:08	martin.panter	set	files: + array-size.patch
2016-07-23 05:09:51	martin.panter	set	messages: + msg271058
2016-07-23 04:35:47	martin.panter	set	files: + thread.patch messages: + msg271057
2016-07-22 08:12:10	xiang.zhang	set	files: + tuple_and_list.patch nosy: + xiang.zhang messages: + msg270977
2016-07-20 15:34:09	gvanrossum	set	nosy: - gvanrossum
2016-07-20 12:48:43	martin.panter	set	messages: + msg270869
2016-07-19 14:06:40	ztane	set	messages: + msg270827
2016-07-19 12:57:32	serhiy.storchaka	set	messages: + msg270823
2016-07-19 03:44:38	martin.panter	set	files: + slice-step.patch messages: + msg270809 versions: + Python 3.6
2016-07-19 03:09:15	python-dev	set	messages: + msg270807
2016-07-16 19:14:50	ztane	set	messages: + msg270582
2016-07-16 19:03:35	Jeffrey.Walton	set	messages: + msg270580
2016-07-16 15:16:00	martin.panter	set	messages: + msg270562
2016-07-15 03:48:13	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg270463
2016-07-15 02:57:54	martin.panter	set	files: + set-overflow.patch messages: + msg270461
2016-07-15 02:53:36	martin.panter	set	files: + trapv.patch nosy: + martin.panter messages: + msg270460 dependencies: + test_time fails: strftime('%Y', y) for negative year
2016-07-10 13:05:56	ztane	set	nosy: + ztane messages: + msg270084
2016-07-10 07:43:23	martin.panter	set	dependencies: + bytes_concat seems to check overflow using undefined behaviour
2014-10-14 17:16:16	skrah	set	nosy: - skrah
2014-10-10 13:10:52	jwilk	set	nosy: + jwilk
2014-03-16 15:07:08	Jeffrey.Walton	set	nosy: + Jeffrey.Walton messages: + msg213729
2013-03-08 08:47:47	fweimer	set	nosy: + fweimer
2011-11-19 17:10:39	mark.dickinson	set	messages: + msg147958
2011-09-25 14:35:02	python-dev	set	messages: + msg144524
2011-09-24 18:12:00	python-dev	set	messages: + msg144505
2011-09-24 17:19:43	python-dev	set	nosy: + python-dev messages: + msg144503
2011-09-24 15:57:05	mark.dickinson	set	files: + issue1621_hashes_and_sets.patch messages: + msg144501 versions: - Python 2.7, Python 3.2
2011-09-24 14:48:03	loewis	set	priority: high -> normal messages: + msg144499
2011-09-24 08:27:18	mark.dickinson	set	messages: + msg144492
2011-09-24 08:24:54	mark.dickinson	set	messages: + msg144490 versions: + Python 3.3, - Python 2.6, Python 3.1
2011-09-14 04:17:55	alex	set	nosy: + alex
2011-09-14 01:29:07	jcea	set	nosy: + jcea
2011-08-10 15:47:11	deadshort	set	nosy: + deadshort messages: + msg141871
2011-06-01 06:15:27	terry.reedy	set	versions: - Python 2.5
2011-02-10 09:29:33	skrah	set	nosy: + skrah
2011-02-10 09:28:35	skrah	link	issue11167 superseder
2010-05-21 16:51:49	dmalcolm	set	nosy: + dmalcolm
2010-05-11 20:55:56	terry.reedy	set	versions: + Python 2.5
2010-05-11 20:55:31	terry.reedy	set	versions: + Python 2.7, Python 3.2, - Python 2.5, Python 3.0
2009-05-16 18:40:23	loewis	set	messages: + msg87908
2009-05-14 09:50:36	mark.dickinson	set	messages: + msg87731
2009-05-14 09:00:18	mark.dickinson	set	messages: + msg87730
2009-05-13 21:32:41	loewis	set	messages: + msg87712
2009-05-13 20:58:27	mark.dickinson	set	messages: + msg87708
2009-05-13 20:38:52	loewis	set	messages: + msg87707
2009-05-13 20:05:40	mark.dickinson	set	messages: + msg87704
2009-05-13 16:48:17	donmez	set	messages: + msg87694
2009-05-13 16:45:41	mark.dickinson	set	messages: + msg87693
2009-05-13 16:01:08	gregory.p.smith	set	messages: + msg87690
2009-05-13 15:53:06	mark.dickinson	set	nosy: + mark.dickinson messages: + msg87689
2009-05-12 13:29:05	pitrou	set	nosy: + pitrou
2009-05-12 13:28:06	ajaksu2	set	nosy: + vstinner versions: + Python 3.1 stage: patch review
2008-03-10 17:25:27	matejcik	set	nosy: + matejcik
2008-02-21 11:03:16	donmez	set	messages: + msg62616
2008-01-28 18:09:31	donmez	set	messages: + msg61792
2008-01-28 18:05:19	donmez	set	messages: + msg61791
2008-01-28 17:57:10	donmez	set	messages: + msg61788
2008-01-28 17:54:39	donmez	set	messages: + msg61785
2008-01-28 17:54:09	gvanrossum	set	messages: + msg61784
2008-01-28 08:14:20	loewis	set	messages: + msg61774
2008-01-28 03:32:53	donmez	set	messages: + msg61770
2008-01-28 03:20:28	gvanrossum	set	messages: + msg61768
2008-01-28 03:16:57	donmez	set	messages: + msg61767
2008-01-28 03:10:02	donmez	set	messages: + msg61766
2008-01-28 03:02:23	donmez	set	files: + csv.patch messages: + msg61765
2008-01-28 02:54:17	gvanrossum	set	messages: + msg61763
2008-01-28 02:51:06	nnorwitz	set	messages: + msg61762
2008-01-28 02:45:11	donmez	set	messages: + msg61761
2008-01-28 02:41:25	gvanrossum	set	messages: + msg61760
2008-01-28 02:41:05	nnorwitz	set	messages: + msg61759
2008-01-28 02:28:49	donmez	set	messages: + msg61758
2008-01-28 02:17:36	nnorwitz	set	messages: + msg61757
2008-01-25 22:08:46	nnorwitz	set	nosy: + nnorwitz
2008-01-20 17:57:55	donmez	set	messages: + msg61320
2008-01-20 17:47:01	loewis	set	messages: + msg61319
2008-01-20 13:56:09	christian.heimes	set	messages: + msg61291
2008-01-20 13:01:06	christian.heimes	set	messages: + msg61286
2008-01-20 11:36:59	donmez	set	files: + fix-overflows-final.patch messages: + msg61272
2008-01-20 03:29:12	donmez	set	files: + fix-overflows-try3.patch messages: + msg60260
2008-01-20 01:48:15	donmez	set	files: + fix-overflows-try2.patch messages: + msg60254
2008-01-19 23:15:57	donmez	set	messages: + msg60246
2008-01-19 12:58:46	christian.heimes	set	priority: high
2008-01-19 11:01:47	christian.heimes	set	messages: + msg60146
2008-01-18 23:40:40	loewis	set	messages: + msg60126
2008-01-18 23:35:59	donmez	set	files: + fix-overflows-try1.patch messages: + msg60125
2008-01-18 23:15:06	donmez	set	files: + overflow-error4.patch
2008-01-18 22:24:17	christian.heimes	set	messages: + msg60124
2008-01-18 21:16:53	gvanrossum	set	keywords: + patch messages: + msg60121
2008-01-18 21:13:36	donmez	set	files: + overflow-error3.patch
2008-01-18 21:11:25	donmez	set	files: + overflow-error2.patch messages: + msg60120
2008-01-18 21:05:32	gvanrossum	set	messages: + msg60118
2008-01-18 20:58:31	donmez	set	files: + overflow-error.patch messages: + msg60116
2008-01-18 20:49:07	gvanrossum	set	messages: + msg60115
2008-01-18 20:47:53	gvanrossum	set	messages: + msg60114
2008-01-18 18:59:33	donmez	set	messages: + msg60111
2008-01-18 18:56:44	christian.heimes	set	messages: + msg60110
2008-01-18 18:50:22	donmez	set	messages: + msg60109
2008-01-18 18:27:52	donmez	set	messages: + msg60107
2008-01-18 18:16:26	christian.heimes	set	messages: + msg60105
2008-01-18 17:53:41	gvanrossum	set	messages: + msg60103
2008-01-18 16:48:33	donmez	set	messages: + msg60102
2008-01-18 01:50:58	donmez	set	messages: + msg60079
2008-01-18 01:22:43	gvanrossum	set	messages: + msg60078
2008-01-11 09:47:51	donmez	set	messages: + msg59699
2008-01-11 08:43:37	loewis	set	messages: + msg59696
2008-01-11 03:26:28	donmez	set	messages: + msg59694
2008-01-11 03:24:37	donmez	set	messages: + msg59693
2008-01-11 03:06:25	alexandre.vassalotti	set	messages: + msg59692
2008-01-09 18:59:17	loewis	set	messages: + msg59619
2008-01-09 18:12:52	donmez	set	nosy: + donmez messages: + msg59616
2008-01-09 17:30:53	gvanrossum	set	messages: + msg59612
2008-01-09 17:29:05	gvanrossum	set	nosy: + gvanrossum messages: + msg59611
2007-12-17 23:49:28	gregory.p.smith	set	messages: + msg58711 title: Python should compile with -Wstrict-overflow when using gcc -> Do not assume signed integer overflow behavior
2007-12-17 06:42:02	alexandre.vassalotti	set	nosy: + alexandre.vassalotti messages: + msg58684
2007-12-14 09:24:33	lemburg	set	nosy: - lemburg
2007-12-14 09:24:20	lemburg	set	nosy: + lemburg messages: + msg58620
2007-12-14 07:08:23	loewis	set	nosy: + loewis messages: + msg58617
2007-12-14 03:23:51	christian.heimes	set	files: + config.patch messages: + msg58611
2007-12-14 02:45:55	christian.heimes	set	messages: + msg58610
2007-12-14 02:19:07	christian.heimes	set	nosy: + christian.heimes messages: + msg58609
2007-12-14 00:43:47	gregory.p.smith	create