msg229455 - (view) |
Author: paul (pkt) |
Date: 2014-10-15 14:50 |
Crashes python 3.4.1.
# Objects\unicodeobject.c
#
# static PyObject *
# case_operation(PyObject *self,
# Py_ssize_t (*perform)(int, void *, Py_ssize_t, Py_UCS4 *, Py_UCS4 *))
# {
# PyObject *res = NULL;
# Py_ssize_t length, newlength = 0;
# int kind, outkind;
# (...)
# 1 length = PyUnicode_GET_LENGTH(self);
# 2 tmp = PyMem_MALLOC(sizeof(Py_UCS4) * 3 * length);
# (...)
# 3 newlength = perform(kind, data, length, tmp, &maxchar);
#
# 1. there are no safety checks
# 2. 12*length overflows
# 3. perform() writes to tmp buffer, which is too small to hold the result
|
msg229467 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-10-15 15:49 |
New changeset 449b1f427cc7 by Benjamin Peterson in branch '3.3':
fix integer overflow in unicode case operations (closes #22643)
https://hg.python.org/cpython/rev/449b1f427cc7
New changeset 570e70252d5d by Benjamin Peterson in branch '3.4':
merge 3.3 (#22643)
https://hg.python.org/cpython/rev/570e70252d5d
New changeset c2980ec10a4c by Benjamin Peterson in branch 'default':
merge 3.4 (#22643)
https://hg.python.org/cpython/rev/c2980ec10a4c
|
msg229470 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-10-15 16:10 |
Benjamin, could you please first propose a patch for review instead of commiting directly your change? Especially for security related changes.
+ if (length > PY_SSIZE_T_MAX / 3 ||
+ length > PY_SIZE_MAX / (3 * sizeof(Py_UCS4))) {
+ PyErr_SetString(PyExc_OverflowError, "string is too long");
+ return NULL;
+ }
tmp = PyMem_MALLOC(sizeof(Py_UCS4) * 3 * length);
PyMem_MALLOC() returns NULL if the length is larger than PY_SSIZE_T_MAX, so the overflow check doesn't look correct. The overflow check can be replaced with:
if ((size_t)length > PY_SSIZE_T_MAX / (3 * sizeof(Py_UCS4))) ...
|
msg229471 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-10-15 16:11 |
Other changesets related to this issue:
changeset: 93071:6a91e616485a
branch: 3.3
parent: 93068:449b1f427cc7
user: Benjamin Peterson <benjamin@python.org>
date: Wed Oct 15 11:51:05 2014 -0400
files: Objects/unicodeobject.c
description:
make sure length is unsigned
(followed by merged into 3.4 and 3.5)
|
msg229472 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2014-10-15 16:15 |
Cool, I forgot about that.
On Wed, Oct 15, 2014, at 12:11, STINNER Victor wrote:
>
> STINNER Victor added the comment:
>
> Other changesets related to this issue:
>
> changeset: 93071:6a91e616485a
> branch: 3.3
> parent: 93068:449b1f427cc7
> user: Benjamin Peterson <benjamin@python.org>
> date: Wed Oct 15 11:51:05 2014 -0400
> files: Objects/unicodeobject.c
> description:
> make sure length is unsigned
>
> (followed by merged into 3.4 and 3.5)
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22643>
> _______________________________________
|
msg229473 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-10-15 16:21 |
New changeset f963cc1f96cf by Benjamin Peterson in branch '3.3':
it suffices to check for PY_SSIZE_T_MAX overflow (#22643)
https://hg.python.org/cpython/rev/f963cc1f96cf
New changeset 8195d48a5c43 by Benjamin Peterson in branch 'default':
merge 3.4 (closes #22643)
https://hg.python.org/cpython/rev/8195d48a5c43
|
msg229476 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-10-15 17:19 |
I think this is a place where _PyUnicodeWriter would be appropriate (of course this is different issue).
|
msg229478 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-10-15 17:29 |
The test should be decorated with the bigmemtest decorator. And I think that condition sys.maxsize < 2**32 would be more robust.
|
msg229479 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2014-10-15 17:36 |
It's only 341 MB.
On Wed, Oct 15, 2014, at 13:29, Serhiy Storchaka wrote:
>
> Serhiy Storchaka added the comment:
>
> The test should be decorated with the bigmemtest decorator. And I think
> that condition sys.maxsize < 2**32 would be more robust.
>
> ----------
> status: closed -> open
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22643>
> _______________________________________
|
msg229480 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-10-15 17:37 |
And this test is CPython only. It relies on specific implementation detail. After changing current implementation (which inefficiently uses memory) this test will be dropped.
|
msg229481 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-10-15 17:41 |
New changeset 33290d0dd946 by Benjamin Peterson in branch '3.4':
merge 3.3 (#22643)
https://hg.python.org/cpython/rev/33290d0dd946
New changeset ffabb674140c by Benjamin Peterson in branch 'default':
merge 3.4 (#22643)
https://hg.python.org/cpython/rev/ffabb674140c
|
msg229482 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2014-10-15 17:43 |
Since the memory requirement is less than 500MB, I don't think it needs a bigmemtest decorator.
|
msg229483 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-10-15 17:46 |
> It's only 341 MB.
It's 2**32//12*2 ~ 683 MiB for original string (but I'm not sure that non-
ASCII string is needed for this test) + 2**32 = 4GiB if the test will fail for
some reason. Some buildbots AFAIK have memory less than 683 MiB.
|
msg229484 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2014-10-15 17:48 |
It's Latin 1, so the chars only use one byte:
>>> sys.getsizeof("üü") - sys.getsizeof("ü")
1
On Wed, Oct 15, 2014, at 13:46, Serhiy Storchaka wrote:
>
> Serhiy Storchaka added the comment:
>
> > It's only 341 MB.
>
> It's 2**32//12*2 ~ 683 MiB for original string (but I'm not sure that
> non-
> ASCII string is needed for this test) + 2**32 = 4GiB if the test will
> fail for
> some reason. Some buildbots AFAIK have memory less than 683 MiB.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22643>
> _______________________________________
|
msg229486 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2014-10-15 17:51 |
More explicitly:
>>> sys.getsizeof("ü"*(2**32//12 + 1))//1024//1024
341
|
msg229551 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) *  |
Date: 2014-10-16 20:28 |
Summary of commits:
3.3:
449b1f427cc7
6a91e616485a
f963cc1f96cf
cda907a02a80
b4a6be12a4bc
3.4:
570e70252d5d
a6d0b1275d4b
2a25076c50ad
33290d0dd946
2f90ea9e60ef
3.5:
c2980ec10a4c
d0f8f242531b
8195d48a5c43
ffabb674140c
030fda7b1de8
|
msg239956 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-03 06:19 |
http://buildbot.python.org/all/builders/PPC64%20AIX%203.x/builds/3466/steps/test/logs/stdio
======================================================================
ERROR: test_case_operation_overflow (test.test_unicode.UnicodeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_unicode.py", line 852, in test_case_operation_overflow
self.assertRaises(OverflowError, ("�"*(2**32//12 + 1)).upper)
MemoryError
----------------------------------------------------------------------
|
msg239959 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2015-04-03 08:53 |
> self.assertRaises(OverflowError, ("�"*(2**32//12 + 1)).upper)
> MemoryError
Hum, even with the PEP 393, this string is still large: 682 MB.
$ python3
Python 3.4.1 (default, Nov 3 2014, 14:38:10)
>>> sys.getsizeof("�"*(2**32//12 + 1)) / 1024.**2
682.6667385101318
I guess that Serhiy suggests to put a @bigmemtest decorator on test_case_operation_overflow().
The test expects that large_string.upper() raises immediatly an OverflowError, so I understand that a new string is not created.
Hum, it's strange to have an OverflowError here. Integer overflow on memory allocation usually raises MemoryError, not OverflowError. It looks like unicodeobject.c is not consistent here.
|
msg239960 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-03 09:06 |
It's strange to have "�" here. Should be "ü".
|
msg239970 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2015-04-03 11:37 |
> It's strange to have "�" here. Should be "ü".
Oh right, I saw "�" in the bug tracker and in the buildbot output. But
it's a web browser issue, if you download the raw buildbout output,
you will see "ü". So the string takes less memory (1 byte per
character, not 2).
|
msg239985 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-03 12:46 |
Perhaps adding bigmemtest is not worth if memory requirements are less than 1 GiB. But I would separate the creating of long tested string and the calling of case converting method, so we will sure what operation is failed. Proposed patch skips the test if can't create needed string and clean locals, so failing test_case_operation_overflow no longer causes multiple MemoryErrors in following tests.
|
msg239986 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2015-04-03 12:53 |
> Perhaps adding bigmemtest is not worth if memory requirements are less than 1 GiB.
What is the current limit if you don't pass -M option to regrtest? In @bigimemtest, I see "maxsize = 5147". Is it a number of bytes? 5 kB of memory?
IMO it's interested to use @bigmemtest if a test uses 100 MB of memory or more, but regrtest should use tests using less than 500 MB by default (or maybe less than 1 GB by default).
|
msg239987 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-03 13:02 |
bigimem tests are ran with dummy size = 5147 if dry_run = True (default) and skipped if dry_run = False. Almost always you need dry_run = False.
The -M option doesn't work with less than 1 GiB.
In this case likely the buildbot had enough memory at the start, but then some memory leaked in tests (perhaps mainly in tracebacks of failed tests).
|
msg254272 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2015-11-07 14:56 |
New changeset 5fae49ef94fd by Serhiy Storchaka in branch '3.4':
Issue #22643: Skip test_case_operation_overflow on computers with low memory.
https://hg.python.org/cpython/rev/5fae49ef94fd
New changeset 6b00bee218ff by Serhiy Storchaka in branch '3.5':
Issue #22643: Skip test_case_operation_overflow on computers with low memory.
https://hg.python.org/cpython/rev/6b00bee218ff
New changeset b1c5949a3af4 by Serhiy Storchaka in branch 'default':
Issue #22643: Skip test_case_operation_overflow on computers with low memory.
https://hg.python.org/cpython/rev/b1c5949a3af4
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:09 | admin | set | github: 66833 |
2015-11-09 10:20:48 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2015-11-07 14:56:03 | python-dev | set | messages:
+ msg254272 |
2015-04-03 13:02:42 | serhiy.storchaka | set | messages:
+ msg239987 |
2015-04-03 12:53:33 | vstinner | set | messages:
+ msg239986 |
2015-04-03 12:46:44 | serhiy.storchaka | set | files:
+ test_case_operation_overflow_memory_error.patch keywords:
+ patch stage: resolved -> patch review versions:
- Python 3.3 |
2015-04-03 12:46:07 | serhiy.storchaka | set | messages:
+ msg239985 |
2015-04-03 11:37:10 | vstinner | set | messages:
+ msg239970 |
2015-04-03 09:06:09 | serhiy.storchaka | set | messages:
+ msg239960 |
2015-04-03 08:53:54 | vstinner | set | status: closed -> open resolution: fixed -> (no value) messages:
+ msg239959
|
2015-04-03 06:19:26 | serhiy.storchaka | set | messages:
+ msg239956 |
2014-10-16 20:28:47 | Arfrever | set | nosy:
+ Arfrever
messages:
+ msg229551 versions:
+ Python 3.3, Python 3.5 |
2014-10-15 17:51:10 | benjamin.peterson | set | messages:
+ msg229486 |
2014-10-15 17:48:59 | benjamin.peterson | set | messages:
+ msg229484 |
2014-10-15 17:46:55 | serhiy.storchaka | set | messages:
+ msg229483 |
2014-10-15 17:43:45 | benjamin.peterson | set | status: open -> closed
messages:
+ msg229482 |
2014-10-15 17:41:10 | python-dev | set | messages:
+ msg229481 |
2014-10-15 17:37:30 | serhiy.storchaka | set | messages:
+ msg229480 |
2014-10-15 17:36:17 | benjamin.peterson | set | messages:
+ msg229479 |
2014-10-15 17:29:39 | serhiy.storchaka | set | status: closed -> open
messages:
+ msg229478 |
2014-10-15 17:19:20 | serhiy.storchaka | set | messages:
+ msg229476 |
2014-10-15 16:21:27 | python-dev | set | status: open -> closed resolution: fixed messages:
+ msg229473
|
2014-10-15 16:15:59 | benjamin.peterson | set | messages:
+ msg229472 |
2014-10-15 16:11:15 | vstinner | set | messages:
+ msg229471 |
2014-10-15 16:10:29 | vstinner | set | status: closed -> open resolution: fixed -> (no value) messages:
+ msg229470
|
2014-10-15 15:49:51 | python-dev | set | status: open -> closed
nosy:
+ python-dev messages:
+ msg229467
resolution: fixed stage: resolved |
2014-10-15 15:17:15 | ezio.melotti | set | nosy:
+ benjamin.peterson, serhiy.storchaka type: security -> crash
|
2014-10-15 15:15:26 | vstinner | set | nosy:
+ ezio.melotti components:
+ Unicode
|
2014-10-15 15:15:21 | vstinner | set | nosy:
+ vstinner
|
2014-10-15 14:50:30 | pkt | create | |