This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Update overflow checks in resize_buffer
Type: Stage: resolved
Components: IO Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Windson Yang, gregory.p.smith, scoder, vstinner
Priority: normal Keywords:

Created on 2018-10-06 05:31 by Windson Yang, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg327223 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-10-06 05:31
In [resize_buffer](https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Modules/_io/stringio.c#L85)

    /* For simplicity, stay in the range of the signed type. Anyway, Python
       doesn't allow strings to be longer than this. */
    if (size > PY_SSIZE_T_MAX)
        goto overflow;
        ...

IMO, we should check the overflow with

    if (size > PY_SSIZE_T_MAX/sizeof(Py_UCS4))

Or we can just delete this code because we will check later at [alloc_check](https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Modules/_io/stringio.c#L107)

BTW, I found we only use PY_SIZE_MAX here in CPython, I wonder why we do not use PY_SSIZE_T_MAX instead?
msg327705 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2018-10-14 13:13
If I understand the code right, "PY_SSIZE_T_MAX/sizeof(Py_UCS4)" would not be correct since it would unnecessarily limit the length of ASCII-only unicode strings.

I think the initial check avoids the risk of integer overflow in the calculations below, so it's not entirely redundant.
msg328059 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2018-10-19 18:27
correct, i don't see an obvious problem in the existing code.
msg328273 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-22 20:55
The current code LGTM.
msg329014 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-10-31 20:22
Sorry, Stefan Behnel, I still don't get it. alloc will always bigger than size after the if else case:
 
    if (size < alloc / 2) {
        /* Major downsize; resize down to exact size. */
        alloc = size + 1;
    }
    else if (size < alloc) {
        /* Within allocated size; quick exit */
        return 0;
    }
    else if (size <= alloc * 1.125) {
        /* Moderate upsize; overallocate similar to list_resize() */
        alloc = size + (size >> 3) + (size < 9 ? 3 : 6);
    }
    else {
        /* Major upsize; resize up to exact size */
        alloc = size + 1;
    }

Since we limit the alloc at:

    if (alloc > PY_SIZE_MAX / sizeof(Py_UCS4))
        goto overflow;

whenever size > PY_SIZE_MAX / sizeof(Py_UCS4) at first will cause alloc overflow. So why not limit size to PY_SIZE_MAX / sizeof(Py_UCS4) at the beginning?
History
Date User Action Args
2022-04-11 14:59:06adminsetgithub: 79093
2018-10-31 20:22:15Windson Yangsetmessages: + msg329014
2018-10-22 20:55:02vstinnersetstatus: open -> closed

nosy: + vstinner
messages: + msg328273

resolution: fixed
stage: resolved
2018-10-19 18:27:12gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg328059
2018-10-14 13:13:24scodersetnosy: + scoder
messages: + msg327705
2018-10-12 16:18:14terry.reedysetversions: - Python 3.4, Python 3.5
2018-10-06 05:31:56Windson Yangcreate