This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Objects/listobject.c: gallop functions rely on signed integer overflow
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: tim.peters Nosy List: berker.peksag, izbyshev, miss-islington, pitrou, serhiy.storchaka, tim.peters
Priority: normal Keywords: patch

Created on 2018-10-28 13:40 by izbyshev, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10175 closed izbyshev, 2018-10-28 13:42
PR 10184 closed izbyshev, 2018-10-28 17:09
PR 10202 merged izbyshev, 2018-10-28 21:49
PR 13514 merged miss-islington, 2019-05-23 00:01
Messages (6)
msg328688 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-10-28 13:40
gallop_left() and gallop_right() functions explicitly rely on overflowing behavior of Py_ssize_t (https://github.com/python/cpython/blob/6015cc50bc38b9e920ce4986ee10658eaa14f561/Objects/listobject.c#L1361):

    ofs = (ofs << 1) + 1;
    if (ofs <= 0)                   /* int overflow */
        ofs = maxofs;

Signed integer overflow is undefined in C, and the above is guaranteed to work only if compiler-specific workarounds are applied, such as GCC's -fwrapv (that is what CPython does). Without such workarounds the compiler would be free to remove the if statement.
msg328754 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2018-10-28 21:13
This doesn't actually matter - the code can never trigger.  It would be fine to replace it with an assert to that effect (see below for a specific suggestion).

The reason:  The indices in this code are into vectors of PyObject*.  These vectors can't contain more than

    floor(PY_SSIZE_T_MAX / sizeof(PyObject*))

pointers (see listobject.c & Python's heap allocation routines).  So the largest legit index this code can ever see is 1 less than that.  Since pointers are at least 4 bytes on all machines Python runs on, that implies (with room to spare) that

    assert(ofs <= (PY_SSIZE_T_MAX - 1) / 2);

can't fail.  Which in turn implies that, mathematically,

    2*ofs + 1 <= PY_SSIZE_T_MAX

So

       if (ofs <= 0)                   /* int overflow */

can't happen, regardless of how the platform C treats signed overflow (signed overflow can't happen to begin with).  The existing `while (ofs < maxofs)` check already ensures that `ofs` is a legit index, and _any_ legit index into a PyObject* vector can be doubled and incremented without overflowing Py_ssize_t.

In fact, that would remain so even if listobject.c allowed its PyObject* vectors to contain twice as many pointers as they actually can contain now.
msg328761 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-10-28 21:50
> This doesn't actually matter - the code can never trigger.

Yes, I considered this, and wondered why assert wasn't used in the first place, but the explicit check with a comment suggested that possibility of overflow was deemed real.

I've submitted a third PR that simply removes the checks and adds asserts instead.
msg328772 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2018-10-29 00:13
I left the code in because it was harmless (a 100%-predictable branch), and it was easier to show that overflow was _considered_ than to explain in full why it was impossible.  In the context of CPython.  For example, the Java port of this code couldn't rely on the far-removed-from-this-code details of Python's C heap management (the largest Java signed integer is a legit Java array index), and signed integer overflow _is_ wholly defined in Java.  Which happens to be the same way it worked under virtually all C compilers at the time the code was written.  The idea that C compilers should be as aggressive as Fortran compilers instead of just supplying a portable assembly language is a modern conceit ;-)

The code is useless, but it's not "a bug", so I'm removing Python 2 from the list of targets.
msg343256 - (view) Author: miss-islington (miss-islington) Date: 2019-05-23 00:01
New changeset 6bc5917903b722bdd0e5d3020949f26fec5dfe9a by Miss Islington (bot) (Alexey Izbyshev) in branch 'master':
bpo-35091: Objects/listobject.c: Replace overflow checks in gallop fu… (GH-10202)
https://github.com/python/cpython/commit/6bc5917903b722bdd0e5d3020949f26fec5dfe9a
msg343257 - (view) Author: miss-islington (miss-islington) Date: 2019-05-23 00:18
New changeset 367fe5757a707c4e3602dee807a9315199ed0b5c by Miss Islington (bot) in branch '3.7':
bpo-35091: Objects/listobject.c: Replace overflow checks in gallop fu… (GH-10202)
https://github.com/python/cpython/commit/367fe5757a707c4e3602dee807a9315199ed0b5c
History
Date User Action Args
2022-04-11 14:59:07adminsetgithub: 79272
2019-05-23 00:28:57cheryl.sabellasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-05-23 00:18:59miss-islingtonsetmessages: + msg343257
2019-05-23 00:01:20miss-islingtonsetpull_requests: + pull_request13431
2019-05-23 00:01:12miss-islingtonsetnosy: + miss-islington
messages: + msg343256
2018-10-29 00:13:44tim.peterssetmessages: + msg328772
versions: - Python 2.7
2018-10-28 21:50:11izbyshevsetmessages: + msg328761
2018-10-28 21:49:24izbyshevsetpull_requests: + pull_request9520
2018-10-28 21:13:04tim.peterssetmessages: + msg328754
2018-10-28 19:56:11rhettingersetassignee: tim.peters

nosy: + tim.peters
2018-10-28 17:09:29izbyshevsetpull_requests: + pull_request9507
2018-10-28 13:42:48izbyshevsetkeywords: + patch
stage: patch review
pull_requests: + pull_request9498
2018-10-28 13:40:23izbyshevcreate