classification
Title: Overflow in parsing 'float' parameters in PyArg_ParseTuple*
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: mark.dickinson, serhiy.storchaka, skrah
Priority: normal Keywords: patch

Created on 2012-05-04 14:37 by serhiy.storchaka, last changed 2012-05-07 12:31 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
getargs_float_overflow.patch serhiy.storchaka, 2012-05-04 14:37 review
getargs_float_overflow_2.patch serhiy.storchaka, 2012-05-04 19:29 review
Messages (12)
msg159937 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-04 14:37
In function convertsimple() in Python/getargs.c possible converting out of float range value from double to float.

Fortunately, 'f' format character is not used in CPython source code. But it can be used in the extensions.

Tests will be later.
msg159939 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 14:50
I don't think this change should be made for 2.7 or 3.2, since it has potential to break existing code.

Though technically, conversion of an out-of-range double to a float gives undefined behaviour (C99 6.3.1.5p2), I'm willing to bet that most current compilers just happily return +-infinity in these cases, and there's probably code out there that would break if we changed this.

So for 2.7 or 3.2, we could just return +-inf here instead.  Though even that isn't quite right if you're picky about corner cases, since there are some values *just* outside [-FLOAT_MAX, FLOAT_MAX] that should still round to +-FLOAT_MAX under round-to-nearest.

I suggest leaving this alone for 2.7 and 3.2

For 3.3, it's not clear to me whether it's better to return +-inf or to raise here.
msg159940 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 15:04
I was just remembering that I was *sure* I'd seen the double->float avoiding undefined behaviour issue mentioned on a C mailing list not so long ago.  Turns out that there was a good reason for me remembering that...


https://groups.google.com/group/comp.lang.c/browse_thread/thread/5d93cc742025b298
msg159952 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-04 17:56
I also thought about ±∞. But PyLong_AsDouble() raises OverflowError for
out-of-range value. And it checks strictly out of ±DBL_MAX.

Because float(10**1000) returns no float('inf'), but raises an
exception, I think that returning ±∞ will be wrong.
msg159953 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 18:12
> And it checks strictly out of ±DBL_MAX.

Nope.  Values just larger than DBL_MAX won't raise OverflowError here.

> Because float(10**1000) returns no float('inf'), but raises an
> exception, I think that returning ±∞ will be wrong.

Possibly.  But there's also the fact that 3.2 already returns inf here;  we'd need a pretty good reason to break that.  Like I said, I'm not sure which the right way to go here is.
msg159963 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-04 19:29
Here is a patch with tests.

> Nope.  Values just larger than DBL_MAX won't raise OverflowError here.

Isn't that a little bit. But values that rounded to DBL_MAX can raise
OverflowError. In any case it's too difficult to achieve strict behavior
in this corner case.

> Possibly.  But there's also the fact that 3.2 already returns inf here;  we'd need a pretty good reason to break that.

In the end, we may add the environment variable
PYTHONDONTRAISEEXCEPTIONIFFLOATOVERFLOWS to control this behaviour.

> Like I said, I'm not sure which the right way to go here is.

Take a look at the tests and may be you'll see the system.
msg159966 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-04 19:39
> But values that rounded to DBL_MAX can raise
> OverflowError. In any case it's too difficult to achieve strict behavior
> in this corner case.

Well, PyLong_AsDouble *does* achieve strict behaviour in this corner case :-).  Integers less than 0.5 * (sys.float_info.max + 2**1024) in absolute value give finite results;  integers greater than or equal to that bound produce an OverflowError.

> Take a look at the tests and may be you'll see the system.

I don't see how looking at the tests helps with making a decision about breaking backwards compatibility or not. :-)
msg160010 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-05 17:41
No one integer produces infinity in 'double' parameter parsing.

But the 'float' parameter parsing can produce infinity, and it can raise
an exception. To be consistent, we need or produce infinity on double
overflow (in this case, we must explicitly produce infinity on float
overflow), or to raise an exception on float overflow.

There is also a third option -- deprecate the 'float' parameter parsing.
Leave the responsibility for the proper overflow handling on the user.
msg160039 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-05-05 20:25
The proposal makes sense at first glance, but I agree with Mark that it is
not clear what should be done. For example, all arrays in Python silently
convert to inf:

>>> from numpy import array
>>> x = array([1,2,3], 'f')
>>> x
array([ 1.,  2.,  3.], dtype=float32)
>>> x[0] = 10**100 
>>> x
array([ inf,   2.,   3.], dtype=float32)


Same for array.array and memoryview. I would not be surprised if users rely
on this behavior. Anyway, silently converting to infinity is exactly what
I'd expect (also for double BTW).

 
Regarding undefined behavior: I only know compilers that convert to infinity
without signaling overflow. The tests for the new memoryview implementation 
should include this case (I think!), and I ran the tests with all compilers 
that I've access to.
msg160129 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-05-07 10:15
Closing as "won't fix";  Python is sadly far from consistent about returning infinity versus raising OverflowError, in a wide variety of situations.  For example, compare:

* float(Decimal('1e310')) with float(Fraction('1e310')), or 

* struct.pack('f', 1e100) with struct.pack('<f', 1e100), or

* 1e160 * 1e160 with 1e160 ** 2.

Given this, I think it's far from clear what the right answer for the getargs 'f' code is, and in this situation I think we should just stick with the status quo.
msg160132 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-05-07 10:35
I agree. Fixing all this would probably require a PEP. It looks like the
original plan was to provide a facility to turn off the Overflow exception:

http://mail.python.org/pipermail/python-dev/2000-May/003990.html
msg160143 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-07 12:31
May be proposed tests (except for the overflow) would be helpful? Right now 'f' and 'd' parsing code is not covered by tests.
History
Date User Action Args
2012-05-07 12:31:15serhiy.storchakasetmessages: + msg160143
2012-05-07 10:35:26skrahsetmessages: + msg160132
2012-05-07 10:15:28mark.dickinsonsetstatus: open -> closed
resolution: wont fix
messages: + msg160129
2012-05-05 20:25:38skrahsetmessages: + msg160039
2012-05-05 17:41:06serhiy.storchakasetmessages: + msg160010
2012-05-04 19:39:33mark.dickinsonsetmessages: + msg159966
2012-05-04 19:31:07serhiy.storchakasettype: enhancement
versions: - Python 2.7, Python 3.2
2012-05-04 19:29:50serhiy.storchakasetfiles: + getargs_float_overflow_2.patch

messages: + msg159963
2012-05-04 18:12:08mark.dickinsonsetmessages: + msg159953
2012-05-04 17:56:08serhiy.storchakasetmessages: + msg159952
2012-05-04 15:09:40mark.dickinsonsetassignee: mark.dickinson
2012-05-04 15:04:10mark.dickinsonsetmessages: + msg159940
2012-05-04 14:50:17mark.dickinsonsetnosy: + skrah
2012-05-04 14:50:01mark.dickinsonsetnosy: + mark.dickinson
messages: + msg159939
2012-05-04 14:37:16serhiy.storchakacreate