Message 87707 - Python tracker


Author	loewis
Recipients	alexandre.vassalotti, christian.heimes, donmez, gregory.p.smith, gvanrossum, loewis, mark.dickinson, matejcik, nnorwitz, pitrou, vstinner
Date	2009-05-13.20:38:51
Message-id	<4A0B2FD9.9060504@v.loewis.de>
In-reply-to	<1242245141.48.0.021861166145.issue1621@psf.upfronthosting.co.za>

Content
> I'm finding many overflow checks that look like:
> 
> 	size = Py_SIZE(a) * n;
> 	if (n && size / n != Py_SIZE(a)) {
> 		PyErr_SetString(PyExc_OverflowError,
> 			"repeated bytes are too long");
> 		return NULL;
> 	}
> 
> where size and n have type Py_ssize_t.  That particular one comes
> from bytesobject.c (in py3k), but this style of check occurs
> frequently throughout the source.
> 
> Do people think that all these should be fixed?  

If this really invokes undefined behavior already (i.e. a compiler
could set "size" to -1, and have the test fail, i.e. not raise
an exception, and still be conforming), then absolutely yes.

> The fix itself is reasonably straightforward:  instead of multiplying
> and then checking for an overflow that's already happened (and hence
> has already invoked undefined behaviour according to the standards),
> get an upper bound for n *first* by dividing PY_SSIZE_T_MAX
> by Py_SIZE(a) and use that to do the overflow check *before*
> the multiplication.  It shouldn't be less efficient:  either way
> involves an integer division, a comparison, and a multiplication.

[and then perform the multiplication unsigned, to silence the
warning - right?]
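
A minimal sketch of the divide-first check described in the quoted text, so that it compiles without Python.h. The names ssize_sketch_t, SSIZE_SKETCH_MAX and checked_repeat_size are hypothetical stand-ins (for Py_ssize_t, PY_SSIZE_T_MAX and the surrounding function) and are not taken from the CPython source; the sketch also assumes both operands are non-negative, which a caller would have to guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Py_ssize_t / PY_SSIZE_T_MAX (assumption: ptrdiff_t is a
   reasonable model of Py_ssize_t for this illustration). */
typedef ptrdiff_t ssize_sketch_t;
#define SSIZE_SKETCH_MAX PTRDIFF_MAX

/* Hypothetical helper: store len * n in *out and return 0, or return -1
   if the product would not fit. Negative inputs are rejected. */
int
checked_repeat_size(ssize_sketch_t len, ssize_sketch_t n, ssize_sketch_t *out)
{
    if (len < 0 || n < 0)
        return -1;
    /* Divide first: the signed multiplication below is only reached
       when the product is known to fit, so it never overflows. */
    if (len != 0 && n > SSIZE_SKETCH_MAX / len)
        return -1;
    *out = len * n;
    return 0;
}
```

In the real code the -1 branch would correspond to setting OverflowError and returning NULL.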

I think there is a second solution: perform the multiplication
unsigned in the first place. For unsigned multiplication, IIUC,
overflow behavior is guaranteed in standard C (i.e. it's modulo
2**N, where N is the number of value bits for the unsigned value).

So the code would change to

nbytes = (size_t)Py_SIZE(a) * (size_t)n;
if (n && (nbytes > PY_SSIZE_T_MAX || nbytes / (size_t)n != (size_t)Py_SIZE(a))) {
	...
}
size = (Py_ssize_t)nbytes;
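
For illustration, the unsigned variant can be written as a standalone function. This is only a sketch under stated assumptions: ssize_sketch_t, SSIZE_SKETCH_MAX and repeat_size_unsigned are hypothetical names standing in for Py_ssize_t, PY_SSIZE_T_MAX and the surrounding function, size_t is assumed to be as wide as the signed type, and both operands are assumed non-negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Py_ssize_t / PY_SSIZE_T_MAX (assumption, see above). */
typedef ptrdiff_t ssize_sketch_t;
#define SSIZE_SKETCH_MAX PTRDIFF_MAX

/* Hypothetical helper: store len * n in *out and return 0, or return -1
   if the product does not fit in the signed type. */
int
repeat_size_unsigned(ssize_sketch_t len, ssize_sketch_t n, ssize_sketch_t *out)
{
    size_t nbytes;

    if (len < 0 || n < 0)
        return -1;
    /* Unsigned multiplication wraps modulo 2**N, which is well defined. */
    nbytes = (size_t)len * (size_t)n;
    /* First test: result too large for the signed type.
       Second test: the product wrapped around. */
    if (n != 0 && (nbytes > (size_t)SSIZE_SKETCH_MAX ||
                   nbytes / (size_t)n != (size_t)len))
        return -1;
    *out = (ssize_sketch_t)nbytes;
    return 0;
}
```

Both comparisons are needed: a product can wrap to a small value that passes the range test, and a product can fit in size_t without wrapping yet still exceed the signed maximum.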

History
Date	User	Action	Args
2009-05-13 20:38:54	loewis	set	recipients: + loewis, gvanrossum, nnorwitz, gregory.p.smith, mark.dickinson, pitrou, vstinner, christian.heimes, alexandre.vassalotti, donmez, matejcik
2009-05-13 20:38:52	loewis	link	issue1621 messages
2009-05-13 20:38:51	loewis	create