classification
Title: remove redundant overflow checks in tuple and list implementations
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Windson Yang, mark.dickinson, serhiy.storchaka, sir-sigurd, tim.peters
Priority: normal Keywords: patch

Created on 2018-08-13 19:38 by sir-sigurd, last changed 2019-01-25 07:43 by Windson Yang.

Pull Requests
URL Status Linked
PR 8757 open sir-sigurd, 2018-08-13 19:39
Messages (6)
msg323491 - Author: Sergey Fedoseev (sir-sigurd) * Date: 2018-08-13 19:38
The maximum size of a list or tuple is limited to PY_SSIZE_T_MAX / sizeof(PyObject*), so the sum of any two list/tuple sizes is always <= PY_SSIZE_T_MAX whenever sizeof(PyObject*) > 1, which holds on all supported (existing?) platforms.
This means the overflow checks in app1, ins1, list_concat and tupleconcat are redundant and can be safely removed.
msg323540 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2018-08-14 21:27
I agree there's pointless code now, but don't understand why the patch replaces it with mysterious asserts.  For example, what's the point of this?

assert(Py_SIZE(a) <= PY_SSIZE_T_MAX / sizeof(PyObject*));
assert(Py_SIZE(b) <= PY_SSIZE_T_MAX / sizeof(PyObject*));

The invariant that needs to be maintained is that the number of elements is no larger than PY_SSIZE_T_MAX, so the _relevant_ thing to assert is:

assert(Py_SIZE(a) + Py_SIZE(b) <= PY_SSIZE_T_MAX);

That in turn cries out for an explanation of why it must be true.  The asserts actually in the patch are part of that explanation, but on their own appear to be coming from Mars.  They're missing the conclusion, and rely on further unstated subtleties.

Suggest adding a comment where the structs are defined, like:

"""
Note:  Python's memory allocators return at most PY_SSIZE_T_MAX bytes.  Therefore a vector of PyObject* can contain no more than PY_SSIZE_T_MAX / sizeof(PyObject*) elements.  In particular, a PyObject* consumes at least 2 bytes, so a vector can contain no more than PY_SSIZE_T_MAX / 2 elements.  Code freely relies on that.

#if SIZEOF_VOID_P < 2
#   error "size of pointer less than 2 bytes?!"
#endif
"""
msg323541 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2018-08-14 21:37
Bah - the relevant thing to assert is really

assert((size_t)Py_SIZE(a) + (size_t)Py_SIZE(b) <= (size_t)PY_SSIZE_T_MAX);

C sucks ;-)
msg326066 - Author: Windson Yang (Windson Yang) * Date: 2018-09-22 02:58
Hello, Tim Peters. I wonder why we need the size_t casts here:

    assert((size_t)Py_SIZE(a) + (size_t)Py_SIZE(b) <= (size_t)PY_SSIZE_T_MAX);

AFAIK, PY_SSIZE_T_MAX = ((Py_ssize_t)(((size_t)-1)>>1)), which is signed, and both Py_SIZE(a) and Py_SIZE(b) are non-negative. Why do we need to cast them to size_t? Thank you.
msg326068 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2018-09-22 03:57
Because the behavior of signed integer overflow isn't defined in C.  Picture a 3-bit integer type, where the maximum value of the signed integer type is 3.  3+3 has no defined result.  Cast them to the unsigned flavor of the integer type, though, and the result is defined to be 6.
msg334346 - Author: Windson Yang (Windson Yang) * Date: 2019-01-25 07:43
I reviewed the patch months ago; maybe we need a core developer to review this patch?
History
Date User Action Args
2019-01-25 07:43:35 Windson Yang set messages: + msg334346
2018-09-22 03:57:36 tim.peters set messages: + msg326068
2018-09-22 02:58:00 Windson Yang set nosy: + Windson Yang
messages: + msg326066
2018-08-14 21:37:03 tim.peters set messages: + msg323541
2018-08-14 21:27:40 tim.peters set messages: + msg323540
2018-08-14 00:55:28 rhettinger set nosy: + tim.peters, mark.dickinson, serhiy.storchaka
2018-08-13 19:39:51 sir-sigurd set keywords: + patch
stage: patch review
pull_requests: + pull_request8233
2018-08-13 19:38:54 sir-sigurd create