Message 95803 - Python tracker

Author	mark.dickinson
Recipients	mark.dickinson
Date	2009-11-29.11:56:17
Message-id	<1259495783.96.0.0738564373709.issue7406@psf.upfronthosting.co.za>

Content

Much of the code in Objects/intobject.c assumes that an arithmetic 
operation on signed longs will wrap modulo 2**(bits_in_long) on 
overflow.  However, signed overflow causes undefined behaviour according 
to the C standards (e.g., C99 6.5, para. 5), and GCC is known to assume 
that signed overflow never occurs in correct code, and to make use of 
this assumption when optimizing.
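
For illustration, one way to avoid relying on wraparound is to test the
operands before adding, so that an overflowing sum is never evaluated.
The helper below is a minimal sketch; the name checked_add_long and its
return convention are assumptions made for this example, not anything
found in intobject.c.

```c
#include <limits.h>

/* Hypothetical helper (not part of CPython): stores a + b in *result and
   returns 0 if the sum fits in a long; returns -1 and leaves *result
   untouched otherwise. */
int
checked_add_long(long a, long b, long *result)
{
	/* LONG_MAX - b cannot overflow when b > 0, and LONG_MIN - b cannot
	   overflow when b < 0, so both comparisons are well defined. */
	if ((b > 0 && a > LONG_MAX - b) ||
	    (b < 0 && a < LONG_MIN - b))
		return -1;	/* a + b would overflow; it is never evaluated */
	*result = a + b;
	return 0;
}
```

Because the out-of-range case is rejected before the addition happens,
there is no signed overflow for the compiler to reason about.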

An obvious example is found in int_add, which looks like this:

static PyObject *
int_add(PyIntObject *v, PyIntObject *w)
{
	register long a, b, x;
	CONVERT_TO_LONG(v, a);
	CONVERT_TO_LONG(w, b);
	x = a + b;
	if ((x^a) >= 0 || (x^b) >= 0)
		return PyInt_FromLong(x);
	return PyLong_Type.tp_as_number->nb_add((PyObject *)v, (PyObject *)w);
}

Here Python is relying on the line 'x = a + b' wrapping on overflow.  
While this code doesn't seem to have caused any problems to date, it's 
quite conceivable that some future version of GCC will be clever enough 
to figure out that (with its assumption that correct code never includes 
signed overflow) the if condition is always true: without overflow, x 
always has the same sign as at least one of a and b.  The test could then 
be optimized away, and the fallback to long addition would never be 
reached.  At that point, a Python interpreter built with this version of 
GCC would produce incorrect results for int additions that overflow.
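
As a sketch of how the same sign test might be written without signed
overflow, the addition can be done on unsigned longs, where wraparound
is defined behaviour.  The function name add_long_no_ub and its
interface are hypothetical.  The sketch assumes a two's complement long
with no padding bits, and it is not necessarily the change that should
be made to int_add.

```c
#include <limits.h>

/* Hypothetical stand-in for the overflow test in int_add.  Returns 1 and
   stores the sum in *result if a + b fits in a long; returns 0 if the
   addition would overflow.  Assumes two's complement, no padding bits. */
int
add_long_no_ub(long a, long b, long *result)
{
	/* Unsigned arithmetic wraps modulo ULONG_MAX + 1 by definition. */
	unsigned long ua = (unsigned long)a;
	unsigned long ub = (unsigned long)b;
	unsigned long ux = ua + ub;
	unsigned long sign_bit = ~(ULONG_MAX >> 1);

	/* Overflow happened exactly when the sign bit of the wrapped sum
	   differs from the sign bits of both operands. */
	if (((ux ^ ua) & (ux ^ ub) & sign_bit) != 0)
		return 0;

	/* The true sum is representable, so the signed addition is safe. */
	*result = a + b;
	return 1;
}
```

The test mirrors '(x^a) >= 0 || (x^b) >= 0' from int_add, but the only
signed addition is performed after the check has shown it cannot
overflow.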


More generally, Python's source makes a number of assumptions about 
integer arithmetic that aren't guaranteed by the C standards.  Most of 
these assumptions are likely to be harmless on modern machines, but the 
assumptions should probably at least be documented somewhere, and 
ideally also checked somewhere in the configuration, so that attempts to 
port Python to machines that don't obey these assumptions complain 
loudly.  Namely, the source assumes at least that:

- C signed ints are represented in two's complement, not ones'
  complement or sign-and-magnitude.

- the bit pattern 1000....000 is not a trap representation (so
  e.g., INT_MIN = -INT_MAX-1, not -INT_MAX).

- conversion from an unsigned integer type to the corresponding signed
  type wraps modulo 2**(appropriate_number_of_bits).

(Relevant standard sections:  C99 6.2.6.2, C99 6.3.1.3p3.)
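
The three assumptions listed above could be checked by a small test
program run at configure time, or at interpreter startup, roughly as
follows.  This is only a sketch: the function name
check_integer_assumptions and its return codes are invented for
illustration, and the third check itself relies on the
implementation-defined conversion it is testing.

```c
#include <limits.h>

/* Hypothetical helper (not part of CPython): returns 0 if the platform
   satisfies all three assumptions, otherwise the number of the first
   assumption that fails. */
int
check_integer_assumptions(void)
{
	unsigned long all_ones = ULONG_MAX;

	/* 1. Two's complement: -1 has all value bits set, so (-1 & 3) is 3.
	   Ones' complement would give 2, sign-and-magnitude would give 1. */
	if ((-1 & 3) != 3)
		return 1;

	/* 2. The pattern 1000...000 is an ordinary value, so MIN is one
	   below -MAX.  MIN + MAX is either -1 or 0 and cannot overflow. */
	if (INT_MIN + INT_MAX != -1 || LONG_MIN + LONG_MAX != -1)
		return 2;

	/* 3. Unsigned-to-signed conversion wraps: ULONG_MAX becomes -1.
	   The result is implementation-defined (C99 6.3.1.3p3). */
	if ((long)all_ones != -1L)
		return 3;

	return 0;
}
```

A configure-time program could exit with this return value, so that a
port to a machine violating one of the assumptions fails loudly and
reports which assumption was broken.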


See also issue 1621.

History
Date	User	Action	Args
2009-11-29 11:56:24	mark.dickinson	set	recipients: + mark.dickinson
2009-11-29 11:56:23	mark.dickinson	set	messageid: <1259495783.96.0.0738564373709.issue7406@psf.upfronthosting.co.za>
2009-11-29 11:56:22	mark.dickinson	link	issue7406 messages
2009-11-29 11:56:17	mark.dickinson	create