classification
Title: failing overflow checks in replace_*
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, martin.panter, matejcik, python-dev, serhiy.storchaka, xiang.zhang
Priority: normal Keywords: patch

Created on 2017-01-03 18:36 by matejcik, last changed 2017-01-14 07:25 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
replace-overflow-check.patch xiang.zhang, 2017-01-04 16:26 review
replace-overflow-check-v2.patch xiang.zhang, 2017-01-05 06:06 review
unicode-overflow.patch matejcik, 2017-01-05 10:31
replace-overflow-check-v3.patch xiang.zhang, 2017-01-05 11:14 review
overflow-checks-3x.patch xiang.zhang, 2017-01-09 03:33 review
overflow-checks-3x-2.patch xiang.zhang, 2017-01-09 05:36 review
Messages (22)
msg284588 - (view) Author: jan matejek (matejcik) * Date: 2017-01-03 18:36
Related to http://bugs.python.org/issue1621 and http://bugs.python.org/issue27473
GCC 6 optimizes away broken overflow checks. This leads to segfaults on test_replace_overflow, at least for strings and bytearrays.
msg284607 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2017-01-04 03:51
Does that mean it no longer respects -fwrapv?
msg284638 - (view) Author: jan matejek (matejcik) * Date: 2017-01-04 14:13
It does, but "-fwrapv" is not automatically added when you specify custom OPT flags. I should have clarified that in the original report.
msg284639 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-04 14:31
> It does, but "-fwrapv" is not automatically added when you specify custom OPT flags.

Indeed I think this is why the changes in #27473 and #1621 make sense.

> GCC 6 optimizes away broken overflow checks.

I am sorry but I don't understand this. What do you mean by broken overflow checks? Is `if (size > PY_SSIZE_T_MAX - vo.len)` broken?
msg284640 - (view) Author: jan matejek (matejcik) * Date: 2017-01-04 14:37
No, your changes from issue 27473 are OK. However functions like replace_interleave and replace_single_character etc. still use the broken code:

/* use the difference between current and new, hence the "-1" */
/*   result_len = self_len + count * (to_len-1)  */
product = count * (to_len-1);
if (product / (to_len-1) != count) {
    PyErr_SetString(PyExc_OverflowError, "replace bytes is too long");
    return NULL;
}
msg284648 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-04 16:26
Ohh sorry I misunderstand your intention. :-( Attach a patch. :-)
msg284658 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-04 17:17
It would be better to write the code in the same form as in 3.x. This could help backporting other patches if needed.
msg284708 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-05 06:06
> It would be better to write the code in the same form as in 3.x. This could help backporting other patches if needed.

Yes, it is. :-) I only had a fast glance at at 3.x codes and thought they didn't share much. :-( But apparently I was wrong. The new patch backports 3.x code.
msg284721 - (view) Author: jan matejek (matejcik) * Date: 2017-01-05 10:31
some instances are present in unicodeobject.c too
msg284728 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-05 11:10
replace-overflow-check-v2.patch LGTM.
msg284731 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-05 11:14
> some instances are present in unicodeobject.c too

Thanks for your notification.

v3 adds the some changes needed I could find in unicodeobject.c. Nothing in v2 is changed.
msg284734 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-05 11:21
It seems to me that unicodeobject.c should be changed in 3.x too.
msg284735 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-05 11:30
> It seems to me that unicodeobject.c should be changed in 3.x too.

I don't understand. Would you mind tell me which part? The only suspicious part to me is `PY_SSIZE_T_MAX >> (rkind - 1)`. It seems should be `PY_SSIZE_T_MAX / rkind`.

And I missed jan's patch so wrote mine. :-( But they are almost the same. :-)
msg284897 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-07 07:28
The code in PyUnicode_Join() checks on overflow after making an addition.

        sz += PyUnicode_GET_LENGTH(item);
        item_maxchar = PyUnicode_MAX_CHAR_VALUE(item);
        maxchar = Py_MAX(maxchar, item_maxchar);
        if (i != 0)
            sz += seplen;
        if (sz < old_sz || sz > PY_SSIZE_T_MAX) {
            PyErr_SetString(PyExc_OverflowError,
                            "join() result is too long for a Python string");
            goto onError;
        }

Maybe there are other cases.
msg285017 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-09 03:16
New changeset 29e505526bdd by Xiang Zhang in branch '2.7':
Issue #29145: Fix overflow checks in string, bytearray and unicode.
https://hg.python.org/cpython/rev/29e505526bdd
msg285018 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-09 03:33
I committed the patch to 2.7. The new patch is for 3.x.
msg285020 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-01-09 04:38
FTR I thought the consensus was not to backport these fixes unless there was a demonstrated problem: <https://bugs.python.org/issue1621#msg144499>, though personally, I would be in favour of backporting in many cases.

Regarding str.join() in unicode.c, see also my unicode.patch in that bug. I included a test case that works on 32-bit platforms, which you may use. (From memory, you probably need to disable -fwrapv and maybe use -ftrapv or the undefined behaviour sanitizer to make the test fail.)
msg285022 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-09 05:36
> FTR I thought the consensus was not to backport these fixes unless there was a demonstrated problem: <https://bugs.python.org/issue1621#msg144499>, though personally, I would be in favour of backporting in many cases.

Sorry not know that. :-( Another benefit of the change is it actually backports the codes from 3.x so it's easy for later maintainence and it doesn't add additional work.

> I included a test case that works on 32-bit platforms, which you may use.

Thanks. But I don't get such an environment to test so I prefer not to add it now. I adjust my patch according to yours. :-)
msg285078 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-01-09 22:34
Both fixes (join and replace) look good to me. However I don’t think it is necessary to change the exception message in 3.5 or 3.6.
msg285086 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-10 03:05
New changeset 09b1cdac74de by Xiang Zhang in branch '3.5':
Issue #29145: Fix overflow checks in str.replace() and str.join().
https://hg.python.org/cpython/rev/09b1cdac74de

New changeset d966ccda9f17 by Xiang Zhang in branch '3.6':
Issue #29145: Merge 3.5.
https://hg.python.org/cpython/rev/d966ccda9f17

New changeset f61a0e8ec022 by Xiang Zhang in branch 'default':
Issue #29145: Merge 3.6.
https://hg.python.org/cpython/rev/f61a0e8ec022
msg285087 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-10 03:08
Thanks you all. :-)
msg285465 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-14 07:25
New changeset dd2c7d497878 by Martin Panter in branch '3.5':
Issues #1621, #29145: Test for str.join() overflow
https://hg.python.org/cpython/rev/dd2c7d497878

New changeset 0c6ea411af17 by Martin Panter in branch 'default':
Issue #29145: Merge test from 3.6
https://hg.python.org/cpython/rev/0c6ea411af17
History
Date User Action Args
2017-01-14 07:25:16python-devsetmessages: + msg285465
2017-01-10 03:08:46xiang.zhangsetstatus: open -> closed
resolution: fixed
messages: + msg285087

stage: commit review -> resolved
2017-01-10 03:05:14python-devsetmessages: + msg285086
2017-01-09 22:34:30martin.pantersetmessages: + msg285078
2017-01-09 05:36:34xiang.zhangsetfiles: + overflow-checks-3x-2.patch

messages: + msg285022
2017-01-09 04:38:25martin.pantersetnosy: + martin.panter
messages: + msg285020
2017-01-09 04:24:18martin.panterlinkissue1621 dependencies
2017-01-09 03:33:34xiang.zhangsetfiles: + overflow-checks-3x.patch

stage: patch review -> commit review
messages: + msg285018
versions: + Python 3.5, Python 3.6, Python 3.7
2017-01-09 03:16:03python-devsetnosy: + python-dev
messages: + msg285017
2017-01-07 07:28:29serhiy.storchakasetmessages: + msg284897
2017-01-05 11:30:57xiang.zhangsetmessages: + msg284735
2017-01-05 11:21:29serhiy.storchakasetmessages: + msg284734
2017-01-05 11:14:51xiang.zhangsetfiles: + replace-overflow-check-v3.patch

messages: + msg284731
2017-01-05 11:10:18serhiy.storchakasetmessages: + msg284728
2017-01-05 10:31:55matejciksetfiles: + unicode-overflow.patch

messages: + msg284721
2017-01-05 06:06:51xiang.zhangsetfiles: + replace-overflow-check-v2.patch

messages: + msg284708
2017-01-04 17:17:05serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg284658
2017-01-04 16:26:10xiang.zhangsetfiles: + replace-overflow-check.patch
keywords: + patch
messages: + msg284648

stage: patch review
2017-01-04 14:37:55matejciksetmessages: + msg284640
2017-01-04 14:31:02xiang.zhangsetnosy: + xiang.zhang
messages: + msg284639
2017-01-04 14:13:18matejciksetmessages: + msg284638
2017-01-04 03:51:41benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg284607
2017-01-03 18:36:42matejcikcreate