Author Oren Milman
Recipients Oren Milman
Date 2016-09-28.10:09:25
Message-id <1475057375.91.0.35024252276.issue28298@psf.upfronthosting.co.za>
In-reply-to
Content
------------ current state ------------
On my Windows 10, with a 32-bit Python, the following runs fine:
    import array
    array.array('L').append(2 ** 32 - 1)

However, in the following, an OverflowError('Python int too large to convert to C long') is raised on the last line:
    import array
    class LikeInt:
        def __init__(self, intVal):
            self.intVal = intVal
        def __int__(self):
            return self.intVal
    array.array('L').append(LikeInt(2 ** 32 - 1))

The reason for this behavior is the implementation of the function LL_setitem (in Modules/arraymodule.c) (edited brutally for brevity):
    LL_setitem(arrayobject *ap, Py_ssize_t i, PyObject *v)
    {
        unsigned long x;
        if (PyLong_Check(v)) {
            x = PyLong_AsUnsignedLong(v);
        }
        else {
            long y;
            PyArg_Parse(v, "l;array item must be integer", &y);
            x = (unsigned long)y;
        }
        (ap->ob_item)[i] = x;
    }
The problem is that, for an object that is not a Python int, PyArg_Parse with the 'l' format converts the value to a C long. So PyArg_Parse fails when the resulting integer is in range(LONG_MAX + 1, ULONG_MAX + 1), even though such an integer can be stored in a C unsigned long.
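
To make the affected range concrete, here is a small sketch that computes it for the running platform. It assumes that the itemsize of array('L') equals sizeof(unsigned long), and that long and unsigned long have the same width; the names BITS and affected are only illustrative.

```python
import array

# Assumption: the itemsize of array('L') is sizeof(unsigned long) on this platform
# (4 bytes on Windows, typically 8 bytes on 64-bit Linux).
BITS = array.array('L').itemsize * 8
LONG_MAX = 2 ** (BITS - 1) - 1
ULONG_MAX = 2 ** BITS - 1

# Integers that fit in an unsigned long but not in a (signed) long.
affected = range(LONG_MAX + 1, ULONG_MAX + 1)

print("C long is", BITS, "bits wide here")
print("first affected value:", affected.start)
print("last affected value: ", affected.stop - 1)
```

With a 32-bit long (as on Windows), the value 2 ** 32 - 1 from the example above is the last value in this range.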

The same applies to array('I') and array('Q') (i.e. to II_setitem and QQ_setitem, respectively).

With regard to relevant changes made in the past, PyArg_Parse has always been used (in the '!PyLong_Check(v)' case), ever since the original versions of the following were added:
    * II_setitem and LL_setitem (back then, they were called I_setitem and L_setitem, respectively), in changeset 4875 (https://hg.python.org/cpython/rev/911040e1bb11)
    * QQ_setitem, in changeset 72430 (https://hg.python.org/cpython/rev/15659e0e2b2e)


------------ proposed changes ------------
    1. In Modules/arraymodule.c, change the implementation of LL_setitem (and likewise, of II_setitem and QQ_setitem) roughly to the following:
        LL_setitem(arrayobject *ap, Py_ssize_t i, PyObject *v)
        {
            unsigned long x;
            if (!PyLong_Check(v)) {
                v = _PyLong_FromNbInt(v);
            }
            x = PyLong_AsUnsignedLong(v);
            (ap->ob_item)[i] = x;
        }

    2. In Lib/test/test_array.py, add tests:
        * to verify the bug is fixed, i.e. test the bounds when assigning int-like objects to items of an integer array
        * to verify float objects can't be assigned to items of an integer array (this is already the behavior, but there aren't any tests for it)

    3. While we are in Lib/test/test_array.py, remove the checks for whether 'long long' is available, since it is required to be available as of changeset 103105 (https://hg.python.org/cpython/rev/9206a86f7321).
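
For illustration, the difference between the abridged current code and change 1 can be modelled in pure Python. This is only a sketch: it assumes a 32-bit C long (as on Windows), the names current_convert, proposed_convert and as_unsigned_long are hypothetical, the generic error messages are made up, and it covers only Python ints and objects defining __int__ (float handling is left out).

```python
BITS = 32  # assumption: C long is 32 bits wide, as on Windows
LONG_MIN, LONG_MAX = -2 ** (BITS - 1), 2 ** (BITS - 1) - 1
ULONG_MAX = 2 ** BITS - 1


class LikeInt:
    def __init__(self, int_val):
        self.int_val = int_val

    def __int__(self):
        return self.int_val


def as_unsigned_long(n):
    """Stand-in for PyLong_AsUnsignedLong: accept only 0..ULONG_MAX."""
    if not 0 <= n <= ULONG_MAX:
        raise OverflowError("value does not fit in a C unsigned long")
    return n


def current_convert(v):
    """Model of the abridged current LL_setitem."""
    if isinstance(v, int):
        return as_unsigned_long(v)
    # PyArg_Parse(v, "l", &y): the value must pass through a signed C long.
    y = v.__int__()
    if not LONG_MIN <= y <= LONG_MAX:
        raise OverflowError("Python int too large to convert to C long")
    return y % (ULONG_MAX + 1)  # x = (unsigned long)y


def proposed_convert(v):
    """Model of the proposed LL_setitem."""
    if not isinstance(v, int):
        v = v.__int__()  # stand-in for _PyLong_FromNbInt(v)
    return as_unsigned_long(v)


big = 2 ** 32 - 1
print(current_convert(big))             # a plain int is accepted
print(proposed_convert(LikeInt(big)))   # the int-like object is accepted too
try:
    current_convert(LikeInt(big))
except OverflowError as exc:
    print("current code:", exc)
```

In this model both kinds of object end up going through the same unsigned conversion, so the bounds are the same for ints and int-like objects.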

Note that issue #12974 (opened in 2011) proposes deprecating the ability to assign int-like objects to items of an integer array. (I am not sure how that affects my patch, but I guess it should be noted.)


------------ diff ------------
The diff file of the proposed patches is attached.

(Note that I didn't propose changing the perplexing error message 'unsigned int is greater than maximum' in II_setitem, as there are many such messages in the codebase, and I am working on a patch for them as part of issue #15988.)


------------ tests ------------
I ran 'python_d.exe -m test -j3' (on my 64-bit Windows 10) with and without the patches, and got essentially the same output. (That also means my new tests in test_array passed.)
The outputs of both runs are attached.