Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer overflow in case_operation #66833

Closed
pkt mannequin opened this issue Oct 15, 2014 · 24 comments
Closed

Integer overflow in case_operation #66833

pkt mannequin opened this issue Oct 15, 2014 · 24 comments
Labels
topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@pkt
Copy link
Mannequin

pkt mannequin commented Oct 15, 2014

BPO 22643
Nosy @vstinner, @benjaminp, @ezio-melotti, @serhiy-storchaka
Files
  • poc_case_op.py: poc
  • test_case_operation_overflow_memory_error.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-11-09.10:20:48.290>
    created_at = <Date 2014-10-15.14:50:30.144>
    labels = ['expert-unicode', 'type-crash']
    title = 'Integer overflow in case_operation'
    updated_at = <Date 2015-11-09.10:20:48.289>
    user = 'https://bugs.python.org/pkt'

    bugs.python.org fields:

    activity = <Date 2015-11-09.10:20:48.289>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-11-09.10:20:48.290>
    closer = 'serhiy.storchaka'
    components = ['Unicode']
    creation = <Date 2014-10-15.14:50:30.144>
    creator = 'pkt'
    dependencies = []
    files = ['36941', '38818']
    hgrepos = []
    issue_num = 22643
    keywords = ['patch']
    message_count = 24.0
    messages = ['229455', '229467', '229470', '229471', '229472', '229473', '229476', '229478', '229479', '229480', '229481', '229482', '229483', '229484', '229486', '229551', '239956', '239959', '239960', '239970', '239985', '239986', '239987', '254272']
    nosy_count = 7.0
    nosy_names = ['vstinner', 'benjamin.peterson', 'ezio.melotti', 'Arfrever', 'python-dev', 'serhiy.storchaka', 'pkt']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue22643'
    versions = ['Python 3.4', 'Python 3.5']

    @pkt
    Copy link
    Mannequin Author

    pkt mannequin commented Oct 15, 2014

    Crashes python 3.4.1.

    # Objects\unicodeobject.c
    #
    # static PyObject *
    # case_operation(PyObject *self,
    # Py_ssize_t (*perform)(int, void *, Py_ssize_t, Py_UCS4 *, Py_UCS4 *))
    # {
    # PyObject *res = NULL;
    # Py_ssize_t length, newlength = 0;
    # int kind, outkind;
    # (...)
    # 1 length = PyUnicode_GET_LENGTH(self);
    # 2 tmp = PyMem_MALLOC(sizeof(Py_UCS4) * 3 * length);
    # (...)
    # 3 newlength = perform(kind, data, length, tmp, &maxchar);
    #
    # 1. there are no safety checks
    # 2. 12*length overflows
    # 3. perform() writes to tmp buffer, which is too small to hold the result

    @pkt pkt mannequin added the type-security A security issue label Oct 15, 2014
    @ezio-melotti ezio-melotti added type-crash A hard crash of the interpreter, possibly with a core dump and removed type-security A security issue labels Oct 15, 2014
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 15, 2014

    New changeset 449b1f427cc7 by Benjamin Peterson in branch '3.3':
    fix integer overflow in unicode case operations (closes bpo-22643)
    https://hg.python.org/cpython/rev/449b1f427cc7

    New changeset 570e70252d5d by Benjamin Peterson in branch '3.4':
    merge 3.3 (bpo-22643)
    https://hg.python.org/cpython/rev/570e70252d5d

    New changeset c2980ec10a4c by Benjamin Peterson in branch 'default':
    merge 3.4 (bpo-22643)
    https://hg.python.org/cpython/rev/c2980ec10a4c

    @python-dev python-dev mannequin closed this as completed Oct 15, 2014
    @vstinner
    Copy link
    Member

    Benjamin, could you please first propose a patch for review instead of commiting directly your change? Especially for security related changes.

    + if (length > PY_SSIZE_T_MAX / 3 ||
    + length > PY_SIZE_MAX / (3 * sizeof(Py_UCS4))) {
    + PyErr_SetString(PyExc_OverflowError, "string is too long");
    + return NULL;
    + }
    tmp = PyMem_MALLOC(sizeof(Py_UCS4) * 3 * length);

    PyMem_MALLOC() returns NULL if the length is larger than PY_SSIZE_T_MAX, so the overflow check doesn't look correct. The overflow check can be replaced with:

    if ((size_t)length > PY_SSIZE_T_MAX / (3 * sizeof(Py_UCS4))) ...
    

    @vstinner vstinner reopened this Oct 15, 2014
    @vstinner
    Copy link
    Member

    Other changesets related to this issue:

    changeset: 93071:6a91e616485a
    branch: 3.3
    parent: 93068:449b1f427cc7
    user: Benjamin Peterson <benjamin@python.org>
    date: Wed Oct 15 11:51:05 2014 -0400
    files: Objects/unicodeobject.c
    description:
    make sure length is unsigned

    (followed by merged into 3.4 and 3.5)

    @benjaminp
    Copy link
    Contributor

    Cool, I forgot about that.

    On Wed, Oct 15, 2014, at 12:11, STINNER Victor wrote:

    STINNER Victor added the comment:

    Other changesets related to this issue:

    changeset: 93071:6a91e616485a
    branch: 3.3
    parent: 93068:449b1f427cc7
    user: Benjamin Peterson <benjamin@python.org>
    date: Wed Oct 15 11:51:05 2014 -0400
    files: Objects/unicodeobject.c
    description:
    make sure length is unsigned

    (followed by merged into 3.4 and 3.5)

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue22643\>


    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 15, 2014

    New changeset f963cc1f96cf by Benjamin Peterson in branch '3.3':
    it suffices to check for PY_SSIZE_T_MAX overflow (bpo-22643)
    https://hg.python.org/cpython/rev/f963cc1f96cf

    New changeset 8195d48a5c43 by Benjamin Peterson in branch 'default':
    merge 3.4 (closes bpo-22643)
    https://hg.python.org/cpython/rev/8195d48a5c43

    @python-dev python-dev mannequin closed this as completed Oct 15, 2014
    @serhiy-storchaka
    Copy link
    Member

    I think this is a place where _PyUnicodeWriter would be appropriate (of course this is different issue).

    @serhiy-storchaka
    Copy link
    Member

    The test should be decorated with the bigmemtest decorator. And I think that condition sys.maxsize < 2**32 would be more robust.

    @benjaminp
    Copy link
    Contributor

    It's only 341 MB.

    On Wed, Oct 15, 2014, at 13:29, Serhiy Storchaka wrote:

    Serhiy Storchaka added the comment:

    The test should be decorated with the bigmemtest decorator. And I think
    that condition sys.maxsize < 2**32 would be more robust.

    ----------
    status: closed -> open


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue22643\>


    @serhiy-storchaka
    Copy link
    Member

    And this test is CPython only. It relies on specific implementation detail. After changing current implementation (which inefficiently uses memory) this test will be dropped.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 15, 2014

    New changeset 33290d0dd946 by Benjamin Peterson in branch '3.4':
    merge 3.3 (bpo-22643)
    https://hg.python.org/cpython/rev/33290d0dd946

    New changeset ffabb674140c by Benjamin Peterson in branch 'default':
    merge 3.4 (bpo-22643)
    https://hg.python.org/cpython/rev/ffabb674140c

    @benjaminp
    Copy link
    Contributor

    Since the memory requirement is less than 500MB, I don't think it needs a bigmemtest decorator.

    @serhiy-storchaka
    Copy link
    Member

    It's only 341 MB.

    It's 2**32//12*2 ~ 683 MiB for original string (but I'm not sure that non-
    ASCII string is needed for this test) + 2**32 = 4GiB if the test will fail for
    some reason. Some buildbots AFAIK have memory less than 683 MiB.

    @benjaminp
    Copy link
    Contributor

    It's Latin 1, so the chars only use one byte:

    >>> sys.getsizeof("üü") - sys.getsizeof("ü")
    1

    On Wed, Oct 15, 2014, at 13:46, Serhiy Storchaka wrote:

    Serhiy Storchaka added the comment:

    > It's only 341 MB.

    It's 2**32//12*2 ~ 683 MiB for original string (but I'm not sure that
    non-
    ASCII string is needed for this test) + 2**32 = 4GiB if the test will
    fail for
    some reason. Some buildbots AFAIK have memory less than 683 MiB.

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue22643\>


    @benjaminp
    Copy link
    Contributor

    More explicitly:

    >>> sys.getsizeof("ü"*(2**32//12 + 1))//1024//1024
    341

    @Arfrever
    Copy link
    Mannequin

    Arfrever mannequin commented Oct 16, 2014

    Summary of commits:
    3.3:
    449b1f427cc7
    6a91e616485a
    f963cc1f96cf
    cda907a02a80
    b4a6be12a4bc
    3.4:
    570e70252d5d
    a6d0b1275d4b
    2a25076c50ad
    33290d0dd946
    2f90ea9e60ef
    3.5:
    c2980ec10a4c
    d0f8f242531b
    8195d48a5c43
    ffabb674140c
    030fda7b1de8

    @serhiy-storchaka
    Copy link
    Member

    http://buildbot.python.org/all/builders/PPC64%20AIX%203.x/builds/3466/steps/test/logs/stdio
    ======================================================================
    ERROR: test_case_operation_overflow (test.test_unicode.UnicodeTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_unicode.py", line 852, in test_case_operation_overflow
        self.assertRaises(OverflowError, (""*(2**32//12 + 1)).upper)
    MemoryError

    @vstinner
    Copy link
    Member

    vstinner commented Apr 3, 2015

    self.assertRaises(OverflowError, ("�"\*(2\*\*32//12 + 1)).upper)
    

    MemoryError

    Hum, even with the PEP-393, this string is still large: 682 MB.

    $ python3
    Python 3.4.1 (default, Nov  3 2014, 14:38:10) 
    >>> sys.getsizeof(""*(2**32//12 + 1)) / 1024.**2
    682.6667385101318

    I guess that Serhiy suggests to put a @bigmemtest decorator on test_case_operation_overflow().

    The test expects that large_string.upper() raises immediatly an OverflowError, so I understand that a new string is not created.

    Hum, it's strange to have an OverflowError here. Integer overflow on memory allocation usually raises MemoryError, not OverflowError. It looks like unicodeobject.c is not consistent here.

    @vstinner vstinner reopened this Apr 3, 2015
    @serhiy-storchaka
    Copy link
    Member

    It's strange to have "�" here. Should be "ü".

    @vstinner
    Copy link
    Member

    vstinner commented Apr 3, 2015

    It's strange to have "�" here. Should be "ü".

    Oh right, I saw "�" in the bug tracker and in the buildbot output. But
    it's a web browser issue, if you download the raw buildbout output,
    you will see "ü". So the string takes less memory (1 byte per
    character, not 2).

    @serhiy-storchaka
    Copy link
    Member

    Perhaps adding bigmemtest is not worth if memory requirements are less than 1 GiB. But I would separate the creating of long tested string and the calling of case converting method, so we will sure what operation is failed. Proposed patch skips the test if can't create needed string and clean locals, so failing test_case_operation_overflow no longer causes multiple MemoryErrors in following tests.

    @vstinner
    Copy link
    Member

    vstinner commented Apr 3, 2015

    Perhaps adding bigmemtest is not worth if memory requirements are less than 1 GiB.

    What is the current limit if you don't pass -M option to regrtest? In @bigimemtest, I see "maxsize = 5147". Is it a number of bytes? 5 kB of memory?

    IMO it's interested to use @bigmemtest if a test uses 100 MB of memory or more, but regrtest should use tests using less than 500 MB by default (or maybe less than 1 GB by default).

    @serhiy-storchaka
    Copy link
    Member

    bigimem tests are ran with dummy size = 5147 if dry_run = True (default) and skipped if dry_run = False. Almost always you need dry_run = False.

    The -M option doesn't work with less than 1 GiB.

    In this case likely the buildbot had enough memory at the start, but then some memory leaked in tests (perhaps mainly in tracebacks of failed tests).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 7, 2015

    New changeset 5fae49ef94fd by Serhiy Storchaka in branch '3.4':
    Issue bpo-22643: Skip test_case_operation_overflow on computers with low memory.
    https://hg.python.org/cpython/rev/5fae49ef94fd

    New changeset 6b00bee218ff by Serhiy Storchaka in branch '3.5':
    Issue bpo-22643: Skip test_case_operation_overflow on computers with low memory.
    https://hg.python.org/cpython/rev/6b00bee218ff

    New changeset b1c5949a3af4 by Serhiy Storchaka in branch 'default':
    Issue bpo-22643: Skip test_case_operation_overflow on computers with low memory.
    https://hg.python.org/cpython/rev/b1c5949a3af4

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants