classification
Title: Inconsistency in overflow error messages of integer argument
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: 15989 28298 29816 Superseder:
Assigned To: serhiy.storchaka Nosy List: Oren Milman, ezio.melotti, mark.dickinson, serhiy.storchaka, terry.reedy
Priority: low Keywords: patch

Created on 2012-09-20 18:51 by serhiy.storchaka, last changed 2017-03-19 14:08 by Oren Milman.

Files
File name Uploaded Description Edit
issue15988_ver1.diff Oren Milman, 2016-10-15 23:43 proposed patches diff file - ver1 review
testPatches.py Oren Milman, 2016-10-15 23:44 an ugly script that tests my patches
PyLong_As_and_PyArg_Parse_patchVer1.diff Oren Milman, 2017-03-18 14:19 review
formats_patchVer1.diff Oren Milman, 2017-03-18 22:48 review
array_patchVer1.diff Oren Milman, 2017-03-19 07:40
hashlib_lzma_pickle_patchVer1.diff Oren Milman, 2017-03-19 08:48 review
time_and_re_patchVer1.diff Oren Milman, 2017-03-19 11:28 review
curses_stat_callproc_abstract_patchVer1.diff Oren Milman, 2017-03-19 12:24
mmap_posix_socket_select_patchVer1.diff Oren Milman, 2017-03-19 14:01
Pull Requests
URL Status Linked Edit
PR 668 open Oren Milman, 2017-03-14 23:15
Messages (26)
msg170827 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-20 18:51
PyArg_ParseTuple raises inconsistent overflow error messages for small integer formats. For example:

>>> import _testcapi
>>> _testcapi.getargs_b(100)
100
>>> _testcapi.getargs_b(1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: unsigned byte integer is greater than maximum
>>> _testcapi.getargs_b(-1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: unsigned byte integer is less than minimum
>>> _testcapi.getargs_b(100000000000000000000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long
>>> _testcapi.getargs_b(-100000000000000000000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

On platforms with 32-bit int and 64-bit long there will be more such cases.
msg170950 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-09-22 01:40
The reason for the different messages is clear -- there are different internal tests (I presume in different functions), each of which raises its own error. What do you propose to change?

The only thing I can imagine is that when the bytes converter calls the Python int to c long function, it somehow intercepts the error return and rewrites the message for consistency. Is this your proposal?
msg170979 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-22 09:42
1) Use PyLong_AsLongAndOverflow instead PyLong_AsLong in PyArg_ParseTuple and in some other functions.
2) Unify an integer overflow error messages in different functions.
msg219609 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-06-02 17:27
I reconsidered this in the light of #21559. getargs_b requires an integer of type int in range(256). A non-int properly raises TypeError.

>>> from _testcapi import getargs_b as gb
>>> gb(1.0)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    gb(1.0)
TypeError: integer argument expected, got float
>>> import fractions
>>> gb(fractions.Fraction(1, 1))
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    gb(fractions.Fraction(1, 1))
TypeError: an integer is required (got type Fraction)

An out-of-range int should, it seems to me, just raise ValueError("int %d not in range(256)" % n). Verification of the range:

>>> gb(255)
255
>>> gb(256)
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    gb(256)
OverflowError: unsigned byte integer is greater than maximum
>>> gb(0)
0
>>> gb(-1)
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    gb(-1)
OverflowError: unsigned byte integer is less than minimum

The last message is wrong or contradictory. An unsigned (non-negative) int cannot be less than 0.
msg270474 - (view) Author: Oren Milman (Oren Milman) * Date: 2016-07-15 09:57
I would be happy to write a patch for this issue, if you don't mind.

Terry, as you were the one to commit the patch for #21559, I guess you are OK with keeping the OverflowError.
Also, I agree that the error message for '_testcapi.getargs_b(-1)' (and for similar cases) could be improved. IMHO, the wording in PyLong_AsLong is quite clear, and also fits here, e.g. 'Python int too small to convert to C unsigned byte' for '_testcapi.getargs_b(-1)'.

ISTM that implementing the patch by doing the following is quite straightforward:
- replacing PyLong_AsLong with PyLong_AsLongAndOverflow (as Serhiy suggested) in cases 'b', 'h' and 'i', in convertsimple in getargs.c
- changing some error messages in convertsimple
- changing PyLong_AsLong and _PyLong_AsInt so that they too would give more helpful error messages (according to the long-lived suggestions of '/* XXX: could be cute and give a different message for overflow == -1 */')

Serhiy, what did you mean by 'Unify an integer overflow error messages in different functions.'?
msg270478 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-15 11:31
You plan looks good, Oren.

I meant what you are planning in the last two items. There are also specific functions for converting to platform-depending integer types (size_t, time_t, uid_t, etc). Two most common patterns: "Python int too large to convert to C long" and "signed short integer is less/greater than maximum". Grep all code for raising OverflowError.
msg270479 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-07-15 12:00
At this point in time, I have nothing further to add.  So follow Serhiy's advice.
msg278738 - (view) Author: Oren Milman (Oren Milman) * Date: 2016-10-15 23:43
Sorry for taking so long to come up with a patch..

------------ proposed changes ------------
    1. in Python/getargs.c in seterror, the following patches:
        - add a parameter 'p_format_code' - a pointer to the format code
          whose conversion failed.
        - replace the error message of the OverflowError with
          '{fname}() argument out of range' in case all of the following are
          true:
            * PyErr_Occurred() and the exception is an OverflowError.
            * The PyArg_* function received a format ending with a ':<fname>'
              (i.e. seterror's fname argument is not NULL).
            * The error occurred during a conversion to a C integer type
              which might overflow, i.e. one of the format codes 'bhilLn'.

      With these patches, inconsistent messages could often be fixed by merely
      appending ':<fname>' to the format argument in a call to a PyArg_*
      function.

      Furthermore, Some inconsistent messages are actually fixed by these
      patches alone.
      That is because there are already many calls to PyArg_* functions with a
      format ending with ':<fname>'. Some of them are followed by more
      checks of the parsed arguments, which might result in raising an
      OverflowError/ValueError with an inconsistent error message.
      (e.g. in Modules/itertoolsmodule.c, product_new already calls
      PyArg_ParseTupleAndKeywords with the format "|n:product", so with these
      patches, in case we do 'itertools.product('a', repeat=1 << 1000)', the
      error message would be 'product() argument out of range').

      Also, ISTM this patch is helpful, regardless of this issue (#15988).
      In case a PyArg_* function raises an OverflowError (because a conversion
      to some C integer type failed), knowing which function called the
      PyArg_* function would probably prove more helpful to a user (than
      knowing which C type Python failed to convert to, and whether the
      conversion failed because the number to convert was too large or too
      small).

      I decided to put the patch inside seterror, because (aside from its
      name,) it is always called after a failure of convertitem (which is the
      only caller of convertsimple).

    2. in various files:
        - change some OverflowError/ValueError messages for more clarity, e.g.
          replace 'unsigned byte integer is greater than maximum' with
          'Python int too large to convert to C unsigned char'.
        - add code to "catch" OverflowError/ValueError errors, to prevent
          raising inconsistent error messages
        - append ':<fname>' to formats passed to PyArg_* functions, to utilize
          the first proposed change (to automatically "catch"
          OverflowError/ValueError errors).

    3. in Lib/tests - I was already writing tests for my patches, so I guessed
      I should add some of them (the basic boundary checks) to Lib/tests (in
      case they didn't already exist). I ran into some issues here:
        - test_pickle - I didn't manage to create an int large enough to test
          my patch for save_long in Modules/_pickle.c (and so I didn't add any
          test to test_pickle).
        - test_stat - I didn't find a way to determine the size of mode_t on
          the current platform, except for Windows (so the test I added is
          incomprehensive).

    4. in Objects/longobject.c, make some messages more helpful (as mentioned
      in the last message I posted here).


------------ diff ------------
The proposed patches diff file is attached.


------------ tests ------------
I wrote an ugly script to test my patches, and ran it on my Windows 10 and
Ubuntu 16.04 64-bit VM. The script is attached, but it might fail on another
platform.
  - Note that the script I wrote has (among others) a test for
    'select.devpoll'. I couldn't run it, as I don't have a Solaris machine,
    but the test is identical to that of 'select.poll', so it should run
    smoothly on a Solaris machine. Theoretically.
(Each test in my script verifies the exception's error message, and not only
which exception was raised, so ISTM these kind of tests don't fit in
Lib/tests. Please let me know if you think otherwise.)

In addition, I ran 'python_d.exe -m test -j0' (on my 64-bit Windows 10 and
Ubuntu VM) with and without the patches, and got quite the same output. (That
also means my new tests in various files in Lib/tests have passed.)


##############################################################################

Note that the inconsistent messages my patches fix are only those I found
while going over each 'PyExc_OverflowError' in the codebase (in *.h and *.c
files).
However, I would probably find another bunch of inconsistent messages when I
go over each 'PyExc_ValueError' in the codebase.

I would be happy to do that, but I am afraid I won't have time to (at least)
until next June.
msg278742 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-10-16 06:08
Thank you Oren. I'll make a detailed review in next few days.
msg289337 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-10 06:24
The patch no longer applied cleanly (in particular to arraymodule.c). Oren, could you please update it?
msg289355 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-10 11:33
of course :)

but some time has passed, so i would re-check stuff, and run tests
etc., so it would probably take me some time.
msg289653 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 09:07
I just have started reviewing the PR and it looks great!
msg289679 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 16:00
I think that raising OverflowError exceptions should be avoided if this is possible. If some function requires non-negative integer and raises ValueError for small negative integers and OverflowError for large negative integers, it is desirable to make it raising ValueError for all negative integers. If some function truncates integer arguments to specified limit it shouldn't raise OverflowError for large positive integers, but truncate them to that limit.
msg289684 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-15 18:03
now that you opened #29816 and #29819, should I remove from the PR changes in the following functions, and related tests?
- _io_BytesIO_truncate_impl
- _io_StringIO_truncate_impl
- long_rshift
- long_lshift
msg289691 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-15 20:27
also, IMHO, we should open an issue named 'raise ValueError instead of OverflowError when a negative int is invalid', and leave for that
new issue some the changes (which you commented about in the PR).

I didn't want to open it without your approval.
here is what I thought should be the contents of the issue (feel
free to edit my draft as you wish and open the issue yourself,
if you think we should open it):


In various C functions, a negative int argument is invalid. in most of these
functions, ValueError is raised for a negative int, or at least for a small
negative int (i.e. a negative int that fits in some signed integer C
type).

as Serhiy mentioned in http://bugs.python.org/issue15988#msg289679 and in
code review comments in https://github.com/python/cpython/pull/668, it might
be that a ValueError should always be raised in such functions, upon receiving
a negative int (even upon receiving a big negative int, such as (-1 << 1000)).

the following functions were identified as relevant only while working on
#15988, so there are probably (a lot?) more relevant functions.

functions where OverflowError is raised only for big negative ints:
    - Objects/bytesobject.c - bytes_new
    - Objects/bytearrayobject.c - bytearray_init
    - functions that would be fixed as part of #29819?
        * Modules/_io/stringio.c - _io_StringIO_truncate_impl
        * Modules/_io/bytesio.c - _io_BytesIO_truncate_impl

functions where OverflowError is raised for any negative int (so a change would
probably break some users code):
    - Objects/longobject.c:
        * PyLong_AsUnsignedLong
        * PyLong_AsSize_t
    - Modules/_ctypes/_ctypes.c - PyCArrayType_new


Note: while patching these functions, one should also make sure that the
patched functions produce consistent error messages, as explained in #15988
(in which it was decided to leave these functions (which currently produce
inconsistent error messages) to this issue.)
msg289693 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 20:39
I agree. It would be better to remove from the PR changes that can be covered by issue29816, issue29819 and other changes that changes visible behavior and open one or several new issues for them. Maybe one meta-issue and few concrete issues?

Such methods as read() also shouldn't raise OverflowError for negative argument.
msg289695 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-15 20:45
I am sorry, but my 2nd term starts on Monday, so I am short on time,
and feel like I would be lucky to even finish working on PR 668.

because of that, and because you obviously have a much better
understanding of this issue than me, I would be happy if you could open
the new issues.
msg289799 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-18 09:11
it is hard to discuss such large patch on GitHub since diffs and comments to all files are shown on one page. Rietveld is more appropriate since it shows only one file at the time. I suggest you Oren to provide the next version of your patch as a patch for reviewing on Rietveld and make PRs only for ready patches.

Even after withdrawing some changes or extracting them as separate issues the patch can be too large. I suggest to commit first the part that relates to PyArg_Parse* ang PyLong_As*.
msg289814 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-18 14:19
Here is a patch including only changes related to PyLong_As* and PyArg_Parse*
functions.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289833 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-18 22:48
Here is a patch including only changes related to formats using
the 'c' specifier.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289842 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 07:40
Here is a patch including only changes related to array.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289843 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 08:48
Here is a patch including only changes related to hashlib, lzma and pickle.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289846 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 11:28
Here is a patch including only changes related to time and re.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289850 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 12:24
Here is a patch including only changes related to curses, stat, callproc (ctypes) and sequence_repeat (abstract).

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289852 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 14:01
Here is a patch including only changes related to mmap, posix, socket and select.

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)
msg289853 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-19 14:08
ISTM that what's left is (except for my 7 sub-patches):
    - making the error messages of long_rshift and long_lshift consistent (waits for #29816)
    - deciding whether to open an issue for changing the type of errors PyLong_AsSize_t raises
History
Date User Action Args
2017-03-19 14:08:28Oren Milmansetmessages: + msg289853
2017-03-19 14:01:03Oren Milmansetfiles: + mmap_posix_socket_select_patchVer1.diff

messages: + msg289852
2017-03-19 12:24:51Oren Milmansetfiles: + curses_stat_callproc_abstract_patchVer1.diff

messages: + msg289850
2017-03-19 11:28:03Oren Milmansetfiles: + time_and_re_patchVer1.diff

messages: + msg289846
2017-03-19 08:48:35Oren Milmansetfiles: + hashlib_lzma_pickle_patchVer1.diff

messages: + msg289843
2017-03-19 07:40:56Oren Milmansetfiles: + array_patchVer1.diff

messages: + msg289842
2017-03-18 22:48:26Oren Milmansetfiles: + formats_patchVer1.diff

messages: + msg289833
2017-03-18 14:19:41Oren Milmansetfiles: + PyLong_As_and_PyArg_Parse_patchVer1.diff

messages: + msg289814
2017-03-18 09:11:28serhiy.storchakasetmessages: + msg289799
2017-03-15 20:45:07Oren Milmansetmessages: + msg289695
2017-03-15 20:39:32serhiy.storchakasetmessages: + msg289693
2017-03-15 20:27:21Oren Milmansetmessages: + msg289691
2017-03-15 18:03:34Oren Milmansetmessages: + msg289684
2017-03-15 16:00:20serhiy.storchakasetmessages: + msg289679
2017-03-15 09:07:17serhiy.storchakasetmessages: + msg289653
2017-03-15 09:06:15serhiy.storchakasetdependencies: + can't set big int-like objects to items in array 'Q', 'L' and 'I', Get rid of C limitation for shift count in right shift
2017-03-14 23:15:35Oren Milmansetpull_requests: + pull_request551
2017-03-10 11:33:20Oren Milmansetmessages: + msg289355
2017-03-10 06:24:42serhiy.storchakasetmessages: + msg289337
2016-10-16 06:08:34serhiy.storchakasetstage: needs patch -> patch review
messages: + msg278742
versions: + Python 3.7, - Python 3.6
2016-10-15 23:44:37Oren Milmansetfiles: + testPatches.py
2016-10-15 23:43:46Oren Milmansetfiles: + issue15988_ver1.diff
keywords: + patch
messages: + msg278738
2016-07-15 12:00:56terry.reedysetmessages: + msg270479
2016-07-15 11:31:55serhiy.storchakasetmessages: + msg270478
versions: + Python 3.6, - Python 3.4
2016-07-15 09:57:48Oren Milmansetnosy: + Oren Milman
messages: + msg270474
2014-06-02 17:27:47terry.reedysetmessages: + msg219609
2013-01-08 07:49:14ezio.melottisetnosy: + ezio.melotti

stage: needs patch
2013-01-07 17:40:51serhiy.storchakasetassignee: serhiy.storchaka
2012-10-16 10:59:36mark.dickinsonsetnosy: + mark.dickinson
2012-10-16 08:38:57serhiy.storchakasetdependencies: + Possible integer overflow of PyLong_AsLong() results
2012-09-22 09:42:58serhiy.storchakasetmessages: + msg170979
2012-09-22 01:40:53terry.reedysetnosy: + terry.reedy
messages: + msg170950
2012-09-20 18:51:09serhiy.storchakacreate