classification
Title: Avoid raising OverflowError if possible
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: 28876 29816 29819 29834 29839 Superseder:
Assigned To: Nosy List: Oren Milman, gvanrossum, haypo, mark.dickinson, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-03-17 08:31 by serhiy.storchaka, last changed 2017-03-19 00:35 by rhettinger.

Messages (5)
msg289747 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2017-03-17 08:31
OverflowError is usually caused by platform limitations. It is raised on the boundary between Python and C when a Python integer is converted to a C integer type. On another platform the same input may be accepted, or may raise ValueError if the value is out of range.

I think we should avoid raising OverflowError where possible. If a function accepts only non-negative integers, it should raise the same exception (ValueError, IndexError or OverflowError) for -10**100 as for -1. If a function bounds an integer value to the range from 0 to 100, it should also do so for integers that don't fit in a C integer type. If a large argument means allocating an amount of memory that exceeds the address space, the function should raise MemoryError rather than OverflowError.

This principle is already followed in parts of the interpreter. For example:

>>> 'abc'[:10**100]
'abc'
>>> 'abc'[-10**100:]
'abc'
>>> bytes([10**100])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: bytes must be in range(0, 256)
>>> round(1.2, 10**100)
1.2
>>> round(1.2, -10**100)
0.0
>>> import math
>>> math.factorial(-10**100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: factorial() not defined for negative values
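
The difference between the two behaviours can be sketched in pure Python. This is only an illustration under stated assumptions: `to_ssize_t`, `repeat_count_naive`, `repeat_count_consistent` and `clamp_percent` are hypothetical helpers, not interpreter functions, and `sys.maxsize` stands in for the limit of a C `Py_ssize_t`.

```python
import sys

# Assumption: sys.maxsize stands in for the largest C Py_ssize_t value.
C_SSIZE_MAX = sys.maxsize


def to_ssize_t(value):
    """Hypothetical model of a naive Python-to-C integer conversion."""
    if not -C_SSIZE_MAX - 1 <= value <= C_SSIZE_MAX:
        raise OverflowError("Python int too large to convert to C ssize_t")
    return value


def repeat_count_naive(n):
    """Converts first, so -1 and -10**100 fail with different exceptions."""
    n = to_ssize_t(n)
    if n < 0:
        raise ValueError("count must be non-negative")
    return n


def repeat_count_consistent(n):
    """Checks the logical range first, so every negative value gives ValueError."""
    if n < 0:
        raise ValueError("count must be non-negative")
    return to_ssize_t(n)


def clamp_percent(n):
    """Bounds the value to 0..100 before any conversion can overflow."""
    return to_ssize_t(max(0, min(100, n)))


if __name__ == "__main__":
    for func in (repeat_count_naive, repeat_count_consistent):
        for arg in (-1, -10**100):
            try:
                func(arg)
            except (ValueError, OverflowError) as exc:
                print(func.__name__, arg < -C_SSIZE_MAX, type(exc).__name__)
    print(clamp_percent(10**100), clamp_percent(-10**100))
```

Only the order of the range check and the conversion differs between the two `repeat_count_*` variants. A positive value beyond the C limit would still raise OverflowError in both.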

This is a meta-issue. Concrete changes will be made in sub-issues.
msg289795 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2017-03-18 07:41
IIRC, there have been previous discussions about little inconsistencies in when objects raise OverflowError versus MemoryError, IndexError, or ValueError. I believe that Guido had opined that the choices were made somewhat arbitrarily (by different people at different times), but that this hadn't proved to be an actual problem in practice, and that changing exception types after an API has already been released is more disruptive to users (potentially breaking existing, tested code) than living with the minor inconsistencies.

Guido, do you want these exceptions changed, and do you agree with Serhiy's new principles?

My own thoughts are:
* As a starting point, it seems reasonable to want consistent errors across ranges of input values.  And being more predictable is a virtue as well.
* Changing existing APIs is disruptive, making it more difficult to maintain cross-version code, breaking existing code or tests that use the current exceptions, and creating unnecessary work for Jython, IronPython, and PyPy, which would have to follow our myriad little changes.
* Personally, I find OverflowError to be more self-explanatory of the cause of an exception than MemoryError, which is more jarring and seemingly indicative of inadequate memory.
* Likewise, OverflowError is more precise and helpful than ValueError, which is very generic, covering a wide variety of problems.
* A lot of third-party tools have evolved over time that mimic the behaviors of built-in types.  If we change those behaviors now, the ecosystem will likely never fully sync-up.
msg289821 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2017-03-18 17:07
If I had to do it over again I would have used OverflowError only for some very narrowly defined conditions and ValueError for "logical" range limitations. In particular OverflowError suggests that the abstraction is slightly broken (since we usually don't think much about how large an integer fits in a register) while ValueError suggests that the caller passed something of the right type but with an inappropriate value.

I'm not too worried about breaking APIs in this case (but that could change if someone finds data showing there are common idioms in actual use that would break).
msg289826 - Author: STINNER Victor (haypo) (Python committer) Date: 2017-03-18 19:24
I don't expect that any code relies on OverflowError. I don't remember any code explicitly catching this exception.

As with MemoryError, it's not common to catch this exception.

I expect Python to abstract away the hardware if the performance cost is acceptable.
msg289835 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2017-03-19 00:35
> I don't expect that any code relies on OverflowError.
> ...
> As with MemoryError, it's not common to catch this exception.

So why bother making any changes here at all?  It seems like an invented problem resulting in unnecessary churn, creating more work for downstream implementations and test suites.  It isn't even clear that Python will be any better after the change.

One thing I'm strongly against is changing the published C API for things like PyLong_AsUnsignedLong (https://docs.python.org/3/c-api/long.html#c.PyLong_AsUnsignedLong). Also, in the context of converting to and from fixed-width C types, an OverflowError seems like exactly the right error.
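
A fixed-width conversion of this kind can be observed from Python itself. The sketch below uses `int.to_bytes` as a stand-in for a C-level conversion; it does not call PyLong_AsUnsignedLong, and `to_fixed_width` is a hypothetical helper introduced only for illustration.

```python
def to_fixed_width(value, nbytes=8):
    """Encode value as an unsigned little-endian integer of nbytes bytes.

    int.to_bytes raises OverflowError when the value is negative or does
    not fit in the requested width.
    """
    return value.to_bytes(nbytes, "little", signed=False)


if __name__ == "__main__":
    for candidate in (255, 2**64, -1):
        try:
            print(candidate, "->", to_fixed_width(candidate).hex())
        except OverflowError as exc:
            print(candidate, "-> OverflowError:", exc)
```

Here the width is an explicit part of the call, which is the situation where OverflowError describes the failure most directly.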
History
Date                 User              Action  Args
2017-03-19 00:35:25  rhettinger        set     messages: + msg289835
2017-03-18 19:24:39  haypo             set     messages: + msg289826
2017-03-18 17:07:02  gvanrossum        set     messages: + msg289821
2017-03-18 07:41:08  rhettinger        set     nosy: + gvanrossum; messages: + msg289795
2017-03-17 19:58:01  serhiy.storchaka  set     dependencies: + Avoid raising OverflowError in len() when __len__() returns negative large value
2017-03-17 08:45:21  serhiy.storchaka  set     dependencies: + Raise ValueError rather of OverflowError in PyLong_AsUnsignedLong()
2017-03-17 08:36:16  serhiy.storchaka  set     dependencies: + bool of large range raises OverflowError, Get rid of C limitation for shift count in right shift, Avoid raising OverflowError in truncate() if possible
2017-03-17 08:31:01  serhiy.storchaka  create