Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency in overflow error messages of integer argument #60192

Open
serhiy-storchaka opened this issue Sep 20, 2012 · 29 comments
Open

Inconsistency in overflow error messages of integer argument #60192

serhiy-storchaka opened this issue Sep 20, 2012 · 29 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 15988
Nosy @terryjreedy, @mdickinson, @ezio-melotti, @serhiy-storchaka, @orenmn
PRs
  • bpo-15988: make various overflow messages more consistent and helpful #668
  • Dependencies
  • bpo-15989: Possible integer overflow of PyLong_AsLong() results
  • bpo-28298: can't set big int-like objects to items in array 'Q', 'L' and 'I'
  • bpo-29816: Get rid of C limitation for shift count in right shift
  • Files
  • issue15988_ver1.diff: proposed patches diff file - ver1
  • testPatches.py: an ugly script that tests my patches
  • PyLong_As_and_PyArg_Parse_patchVer1.diff
  • formats_patchVer1.diff
  • array_patchVer1.diff
  • hashlib_lzma_pickle_patchVer1.diff
  • time_and_re_patchVer1.diff
  • curses_stat_callproc_abstract_patchVer1.diff
  • mmap_posix_socket_select_patchVer1.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = None
    created_at = <Date 2012-09-20.18:51:09.840>
    labels = ['interpreter-core', 'type-feature', '3.7']
    title = 'Inconsistency in overflow error messages of integer argument'
    updated_at = <Date 2017-09-28.06:45:38.532>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2017-09-28.06:45:38.532>
    actor = 'Oren Milman'
    assignee = 'serhiy.storchaka'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2012-09-20.18:51:09.840>
    creator = 'serhiy.storchaka'
    dependencies = ['15989', '28298', '29816']
    files = ['45107', '45108', '46733', '46736', '46738', '46739', '46740', '46741', '46742']
    hgrepos = []
    issue_num = 15988
    keywords = ['patch']
    message_count = 29.0
    messages = ['170827', '170950', '170979', '219609', '270474', '270478', '270479', '278738', '278742', '289337', '289355', '289653', '289679', '289684', '289691', '289693', '289695', '289799', '289814', '289833', '289842', '289843', '289846', '289850', '289852', '289853', '299735', '300218', '303192']
    nosy_count = 5.0
    nosy_names = ['terry.reedy', 'mark.dickinson', 'ezio.melotti', 'serhiy.storchaka', 'Oren Milman']
    pr_nums = ['668']
    priority = 'low'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue15988'
    versions = ['Python 3.7']

    @serhiy-storchaka
    Copy link
    Member Author

    PyArg_ParseTuple raises inconsistent overflow error messages for small integer formats. For example:

    >>> import _testcapi
    >>> _testcapi.getargs_b(100)
    100
    >>> _testcapi.getargs_b(1000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: unsigned byte integer is greater than maximum
    >>> _testcapi.getargs_b(-1000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: unsigned byte integer is less than minimum
    >>> _testcapi.getargs_b(100000000000000000000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: Python int too large to convert to C long
    >>> _testcapi.getargs_b(-100000000000000000000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: Python int too large to convert to C long

    On platforms with 32-bit int and 64-bit long there will be more such cases.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Sep 20, 2012
    @terryjreedy
    Copy link
    Member

    The reason for the different messages is clear -- there are different internal tests (I presume in different functions), each of which raises its own error. What do you propose to change?

    The only thing I can imagine is that when the bytes converter calls the Python int to c long function, it somehow intercepts the error return and rewrites the message for consistency. Is this your proposal?

    @serhiy-storchaka
    Copy link
    Member Author

    1. Use PyLong_AsLongAndOverflow instead PyLong_AsLong in PyArg_ParseTuple and in some other functions.
    2. Unify an integer overflow error messages in different functions.

    @serhiy-storchaka serhiy-storchaka self-assigned this Jan 7, 2013
    @terryjreedy
    Copy link
    Member

    I reconsidered this in the light of bpo-21559. getargs_b requires an integer of type int in range(256). A non-int properly raises TypeError.

    >>> from _testcapi import getargs_b as gb
    >>> gb(1.0)
    Traceback (most recent call last):
      File "<pyshell#7>", line 1, in <module>
        gb(1.0)
    TypeError: integer argument expected, got float
    >>> import fractions
    >>> gb(fractions.Fraction(1, 1))
    Traceback (most recent call last):
      File "<pyshell#12>", line 1, in <module>
        gb(fractions.Fraction(1, 1))
    TypeError: an integer is required (got type Fraction)

    An out-of-range int should, it seems to me, just raise ValueError("int %d not in range(256)" % n). Verification of the range:

    >>> gb(255)
    255
    >>> gb(256)
    Traceback (most recent call last):
      File "<pyshell#4>", line 1, in <module>
        gb(256)
    OverflowError: unsigned byte integer is greater than maximum
    >>> gb(0)
    0
    >>> gb(-1)
    Traceback (most recent call last):
      File "<pyshell#6>", line 1, in <module>
        gb(-1)
    OverflowError: unsigned byte integer is less than minimum

    The last message is wrong or contradictory. An unsigned (non-negative) int cannot be less than 0.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Jul 15, 2016

    I would be happy to write a patch for this issue, if you don't mind.

    Terry, as you were the one to commit the patch for bpo-21559, I guess you are OK with keeping the OverflowError.
    Also, I agree that the error message for '_testcapi.getargs_b(-1)' (and for similar cases) could be improved. IMHO, the wording in PyLong_AsLong is quite clear, and also fits here, e.g. 'Python int too small to convert to C unsigned byte' for '_testcapi.getargs_b(-1)'.

    ISTM that implementing the patch by doing the following is quite straightforward:

    • replacing PyLong_AsLong with PyLong_AsLongAndOverflow (as Serhiy suggested) in cases 'b', 'h' and 'i', in convertsimple in getargs.c
    • changing some error messages in convertsimple
    • changing PyLong_AsLong and _PyLong_AsInt so that they too would give more helpful error messages (according to the long-lived suggestions of '/* XXX: could be cute and give a different message for overflow == -1 */')

    Serhiy, what did you mean by 'Unify an integer overflow error messages in different functions.'?

    @serhiy-storchaka
    Copy link
    Member Author

    You plan looks good, Oren.

    I meant what you are planning in the last two items. There are also specific functions for converting to platform-depending integer types (size_t, time_t, uid_t, etc). Two most common patterns: "Python int too large to convert to C long" and "signed short integer is less/greater than maximum". Grep all code for raising OverflowError.

    @terryjreedy
    Copy link
    Member

    At this point in time, I have nothing further to add. So follow Serhiy's advice.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Oct 15, 2016

    Sorry for taking so long to come up with a patch..

    ------------ proposed changes ------------
    1. in Python/getargs.c in seterror, the following patches:
    - add a parameter 'p_format_code' - a pointer to the format code
    whose conversion failed.
    - replace the error message of the OverflowError with
    '{fname}() argument out of range' in case all of the following are
    true:
    * PyErr_Occurred() and the exception is an OverflowError.
    * The PyArg_* function received a format ending with a ':'
    (i.e. seterror's fname argument is not NULL).
    * The error occurred during a conversion to a C integer type
    which might overflow, i.e. one of the format codes 'bhilLn'.

      With these patches, inconsistent messages could often be fixed by merely
      appending ':<fname>' to the format argument in a call to a PyArg_*
      function.
    
      Furthermore, Some inconsistent messages are actually fixed by these
      patches alone.
      That is because there are already many calls to PyArg_* functions with a
      format ending with ':<fname>'. Some of them are followed by more
      checks of the parsed arguments, which might result in raising an
      OverflowError/ValueError with an inconsistent error message.
      (e.g. in [Modules/itertoolsmodule.c](https://github.com/python/cpython/blob/main/Modules/itertoolsmodule.c), product_new already calls
      PyArg_ParseTupleAndKeywords with the format "|n:product", so with these
      patches, in case we do 'itertools.product('a', repeat=1 << 1000)', the
      error message would be 'product() argument out of range').
    
      Also, ISTM this patch is helpful, regardless of this issue (bpo-15988).
      In case a PyArg_* function raises an OverflowError (because a conversion
      to some C integer type failed), knowing which function called the
      PyArg_* function would probably prove more helpful to a user (than
      knowing which C type Python failed to convert to, and whether the
      conversion failed because the number to convert was too large or too
      small).
    
      I decided to put the patch inside seterror, because (aside from its
      name,) it is always called after a failure of convertitem (which is the
      only caller of convertsimple).
    
    2. in various files:
        - change some OverflowError/ValueError messages for more clarity, e.g.
          replace 'unsigned byte integer is greater than maximum' with
          'Python int too large to convert to C unsigned char'.
        - add code to "catch" OverflowError/ValueError errors, to prevent
          raising inconsistent error messages
        - append ':<fname>' to formats passed to PyArg_* functions, to utilize
          the first proposed change (to automatically "catch"
          OverflowError/ValueError errors).
    
    3. in [Lib/tests](https://github.com/python/cpython/blob/main/Lib/tests) - I was already writing tests for my patches, so I guessed
      I should add some of them (the basic boundary checks) to [Lib/tests](https://github.com/python/cpython/blob/main/Lib/tests) (in
      case they didn't already exist). I ran into some issues here:
        - test_pickle - I didn't manage to create an int large enough to test
          my patch for save_long in [Modules/_pickle.c](https://github.com/python/cpython/blob/main/Modules/_pickle.c) (and so I didn't add any
          test to test_pickle).
        - test_stat - I didn't find a way to determine the size of mode_t on
          the current platform, except for Windows (so the test I added is
          incomprehensive).
    
    4. in [Objects/longobject.c](https://github.com/python/cpython/blob/main/Objects/longobject.c), make some messages more helpful (as mentioned
      in the last message I posted here).
    

    ------------ diff ------------
    The proposed patches diff file is attached.

    ------------ tests ------------
    I wrote an ugly script to test my patches, and ran it on my Windows 10 and
    Ubuntu 16.04 64-bit VM. The script is attached, but it might fail on another
    platform.

    • Note that the script I wrote has (among others) a test for
      'select.devpoll'. I couldn't run it, as I don't have a Solaris machine,
      but the test is identical to that of 'select.poll', so it should run
      smoothly on a Solaris machine. Theoretically.
      (Each test in my script verifies the exception's error message, and not only
      which exception was raised, so ISTM these kind of tests don't fit in
      Lib/tests. Please let me know if you think otherwise.)

    In addition, I ran 'python_d.exe -m test -j0' (on my 64-bit Windows 10 and
    Ubuntu VM) with and without the patches, and got quite the same output. (That
    also means my new tests in various files in Lib/tests have passed.)

    ##############################################################################

    Note that the inconsistent messages my patches fix are only those I found
    while going over each 'PyExc_OverflowError' in the codebase (in *.h and *.c
    files).
    However, I would probably find another bunch of inconsistent messages when I
    go over each 'PyExc_ValueError' in the codebase.

    I would be happy to do that, but I am afraid I won't have time to (at least)
    until next June.

    @serhiy-storchaka
    Copy link
    Member Author

    Thank you Oren. I'll make a detailed review in next few days.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Oct 16, 2016
    @serhiy-storchaka
    Copy link
    Member Author

    The patch no longer applied cleanly (in particular to arraymodule.c). Oren, could you please update it?

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 10, 2017

    of course :)

    but some time has passed, so i would re-check stuff, and run tests
    etc., so it would probably take me some time.

    @serhiy-storchaka
    Copy link
    Member Author

    I just have started reviewing the PR and it looks great!

    @serhiy-storchaka
    Copy link
    Member Author

    I think that raising OverflowError exceptions should be avoided if this is possible. If some function requires non-negative integer and raises ValueError for small negative integers and OverflowError for large negative integers, it is desirable to make it raising ValueError for all negative integers. If some function truncates integer arguments to specified limit it shouldn't raise OverflowError for large positive integers, but truncate them to that limit.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 15, 2017

    now that you opened bpo-29816 and bpo-29819, should I remove from the PR changes in the following functions, and related tests?

    • _io_BytesIO_truncate_impl
    • _io_StringIO_truncate_impl
    • long_rshift
    • long_lshift

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 15, 2017

    also, IMHO, we should open an issue named 'raise ValueError instead of OverflowError when a negative int is invalid', and leave for that
    new issue some the changes (which you commented about in the PR).

    I didn't want to open it without your approval.
    here is what I thought should be the contents of the issue (feel
    free to edit my draft as you wish and open the issue yourself,
    if you think we should open it):

    In various C functions, a negative int argument is invalid. in most of these
    functions, ValueError is raised for a negative int, or at least for a small
    negative int (i.e. a negative int that fits in some signed integer C
    type).

    as Serhiy mentioned in http://bugs.python.org/issue15988#msg289679 and in
    code review comments in #668, it might
    be that a ValueError should always be raised in such functions, upon receiving
    a negative int (even upon receiving a big negative int, such as (-1 << 1000)).

    the following functions were identified as relevant only while working on
    bpo-15988, so there are probably (a lot?) more relevant functions.

    functions where OverflowError is raised only for big negative ints:
    - Objects/bytesobject.c - bytes_new
    - Objects/bytearrayobject.c - bytearray_init
    - functions that would be fixed as part of bpo-29819?
    * Modules/_io/stringio.c - _io_StringIO_truncate_impl
    * Modules/_io/bytesio.c - _io_BytesIO_truncate_impl

    functions where OverflowError is raised for any negative int (so a change would
    probably break some users code):
    - Objects/longobject.c:
    * PyLong_AsUnsignedLong
    * PyLong_AsSize_t
    - Modules/_ctypes/_ctypes.c - PyCArrayType_new

    Note: while patching these functions, one should also make sure that the
    patched functions produce consistent error messages, as explained in bpo-15988
    (in which it was decided to leave these functions (which currently produce
    inconsistent error messages) to this issue.)

    @serhiy-storchaka
    Copy link
    Member Author

    I agree. It would be better to remove from the PR changes that can be covered by bpo-29816, bpo-29819 and other changes that changes visible behavior and open one or several new issues for them. Maybe one meta-issue and few concrete issues?

    Such methods as read() also shouldn't raise OverflowError for negative argument.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 15, 2017

    I am sorry, but my 2nd term starts on Monday, so I am short on time,
    and feel like I would be lucky to even finish working on PR 668.

    because of that, and because you obviously have a much better
    understanding of this issue than me, I would be happy if you could open
    the new issues.

    @serhiy-storchaka
    Copy link
    Member Author

    it is hard to discuss such large patch on GitHub since diffs and comments to all files are shown on one page. Rietveld is more appropriate since it shows only one file at the time. I suggest you Oren to provide the next version of your patch as a patch for reviewing on Rietveld and make PRs only for ready patches.

    Even after withdrawing some changes or extracting them as separate issues the patch can be too large. I suggest to commit first the part that relates to PyArg_Parse* ang PyLong_As*.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 18, 2017

    Here is a patch including only changes related to PyLong_As* and PyArg_Parse*
    functions.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 18, 2017

    Here is a patch including only changes related to formats using
    the 'c' specifier.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    Here is a patch including only changes related to array.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    Here is a patch including only changes related to hashlib, lzma and pickle.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    Here is a patch including only changes related to time and re.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    Here is a patch including only changes related to curses, stat, callproc (ctypes) and sequence_repeat (abstract).

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    Here is a patch including only changes related to mmap, posix, socket and select.

    (I ran the test module, and on my Windows 10, the same tests failed with
    and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
    failed.)

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Mar 19, 2017

    ISTM that what's left is (except for my 7 sub-patches):
    - making the error messages of long_rshift and long_lshift consistent (waits for bpo-29816)
    - deciding whether to open an issue for changing the type of errors PyLong_AsSize_t raises

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Aug 4, 2017

    How do we proceed?
    should I update (if needed) each of the patches I uploaded in March
    to eliminate commit conflicts? or can someone review them as they are
    now?

    @serhiy-storchaka
    Copy link
    Member Author

    Glad to see you again Oren! Maybe first finish bpo-28261? It looks easier.

    @orenmn
    Copy link
    Mannequin

    orenmn mannequin commented Sep 28, 2017

    Serhiy, you suggested in https://bugs.python.org/issue15988#msg289799 that uploading
    diff files here is more convenient than in a github PR, so I uploaded my fixes here,
    and so #668 is now outdated, and merging it
    isn't really relevant (while in its current state).

    Should I open some smaller PRs? or should I combine all my patches and update
    #668 ?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants