Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer overflow in classic string formatting #58905

Closed
serhiy-storchaka opened this issue Apr 30, 2012 · 28 comments
Closed

Integer overflow in classic string formatting #58905

serhiy-storchaka opened this issue Apr 30, 2012 · 28 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 14700
Nosy @mdickinson, @vstinner, @ericvsmith, @bitdancer, @serhiy-storchaka
Files
  • pyunicode_format_integer_overflow.patch
  • formatting-overflow-2.7.patch
  • formatting-overflow-3.2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/mdickinson'
    closed_at = <Date 2012-10-28.10:27:58.277>
    created_at = <Date 2012-04-30.16:55:26.013>
    labels = ['interpreter-core', 'type-bug']
    title = 'Integer overflow in classic string formatting'
    updated_at = <Date 2012-10-28.10:27:58.276>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2012-10-28.10:27:58.276>
    actor = 'mark.dickinson'
    assignee = 'mark.dickinson'
    closed = True
    closed_date = <Date 2012-10-28.10:27:58.277>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2012-04-30.16:55:26.013>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['25426', '27471', '27473']
    hgrepos = []
    issue_num = 14700
    keywords = ['patch']
    message_count = 28.0
    messages = ['159707', '159708', '159709', '159710', '159712', '159713', '159714', '159715', '159716', '159718', '159726', '159729', '159731', '159732', '160130', '160141', '160142', '160155', '172284', '172293', '172294', '172342', '172344', '172346', '173660', '174021', '174023', '174024']
    nosy_count = 6.0
    nosy_names = ['mark.dickinson', 'vstinner', 'eric.smith', 'r.david.murray', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14700'
    versions = ['Python 2.7', 'Python 3.2']

    @serhiy-storchaka
    Copy link
    Member Author

    Check for integer overflow for width and precision is buggy.

    Just a few examples (on platform with 32-bit int):

    >>> '%.21d' % 123
    '000000000000000000123'
    >>> '%.2147483648d' % 123
    '123'
    >>> '%.2147483650d' % 123
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: prec too big
    
    >>> '%.21f' % (1./7)
    '0.142857142857142849213'
    >>> '%.2147483648f' % (1./7)
    '0.142857'
    >>> '%.2147483650f' % (1./7)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: prec too big

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Apr 30, 2012
    @bitdancer
    Copy link
    Member

    Serhiy: FYI we use the versions field to indicate which versions the fix will be made in, not which versions the bug occurs in. Since only 2.7, 3.2, and 3.3 get bug fixes, I've changed the versions field to be just those three. (3.1 and 2.6 are still in the list because they get *security* fixes, but those are rare.)

    @mdickinson
    Copy link
    Member

    Indeed, Objects/unicodeobject.c (default branch) has this, at around line 13839:

                            if ((prec*10) / 10 != prec) {
                                PyErr_SetString(PyExc_ValueError,
                                                "prec too big");
                                goto onError;
                            }

    ... which since 'prec' has type int, will invoke undefined behaviour. There are probably many other cases like this one.

    Serhiy, what platform are you on? And are you applying any special compile-time flags? For gcc, we should be using -fwrapv, which in this case should make the above code work as intended.

    @mdickinson
    Copy link
    Member

    See get_integer in Objects/stringlib/unicode_format.h for a better way to do this sort of thing.

    @serhiy-storchaka
    Copy link
    Member Author

    Serhiy: FYI we use the versions field to indicate which versions the fix will be made in, not which versions the bug occurs in. Since only 2.7, 3.2, and 3.3 get bug fixes, I've changed the versions field to be just those three. (3.1 and 2.6 are still in the list because they get *security* fixes, but those are rare.)

    Well, David, I understand. This ridiculous bug is unlikely security
    issue.

    Here is a patch that fixes this bug.

    @serhiy-storchaka
    Copy link
    Member Author

    Serhiy, what platform are you on?

    32-bit Linux (Ubuntu), gcc 4.6. But it has to happen on any platform
    with a 32-bit integer (for 64-bit use 9223372036854775808).

    214748364*10/10 == 214748364 -- test passed
    214748364*10 + ('8'-'0') == -2147483648 -- oops!

    See also how is this problem solved in _struct.c.

    @mdickinson
    Copy link
    Member

    But it has to happen on any platform
    with a 32-bit integer

    Not necessarily: it's undefined behaviour, so the compiler can do as it wishes.

    Your patch should also address possible overflow of the addition.

    @serhiy-storchaka
    Copy link
    Member Author

    Your patch should also address possible overflow of the addition.

    Here there is no overflow. The patch limits prec of a little stronger
    (instead of 2147483647 to 2147483639 on a 32-bit platform).

    @mdickinson
    Copy link
    Member

    Ah yes, true.

    @mdickinson
    Copy link
    Member

    Any chance of some tests? :-)

    @serhiy-storchaka
    Copy link
    Member Author

    Any chance of some tests? :-)

    Even a test for struct tests only struct.calcsize on this specific case.
    For string formatting has no such function, on most platforms testing
    would be a memory overflow.

    @serhiy-storchaka
    Copy link
    Member Author

    32-bit Linux (Ubuntu), gcc 4.6.

    Sorry, gcc 4.4.

    @mdickinson
    Copy link
    Member

    Still, I think it would be useful to have some tests that exercise the overflow branches. (If those tests had existed before, then this issue would probably already have been found and fixed, since clang could have detected the undefined behaviour resulting from signed overflow.)

    I'll add tests and apply this later.

    @mdickinson mdickinson self-assigned this Apr 30, 2012
    @serhiy-storchaka
    Copy link
    Member Author

    I'll add tests and apply this later.

    Well, look at test_crasher in Lib/test/test_struct.py.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 7, 2012

    New changeset 064c2d0483f8 by Mark Dickinson in branch 'default':
    Issue bpo-14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting. Thanks Serhiy Storchaka for report and original patch.
    http://hg.python.org/cpython/rev/064c2d0483f8

    @serhiy-storchaka
    Copy link
    Member Author

    Mark, I deliberately have not used the exact formula for the overflow. Comparison with the constant is much cheaper than division or multiplication.

    Microbencmark:

    ./python -m timeit -s 'f="%.1234567890s"*100;x=("",)*100' 'f%x'

    Before changeset 064c2d0483f8: 10000 loops, best of 3: 27.1 usec per loop
    Changeset 064c2d0483f8: 10000 loops, best of 3: 25.7 usec per loop
    Original patch: 100000 loops, best of 3: 18.2 usec per loop

    @mdickinson
    Copy link
    Member

    Sure, I realize that, but I prefer not to be sloppy in the overflow check, and to use the same formula that's already used in stringlib. I somehow doubt that this micro-optimization is going to have any noticeable effect in real code.

    @serhiy-storchaka
    Copy link
    Member Author

    I somehow doubt that this micro-optimization is going to have any noticeable effect in real code.

    Agree. I just found this bug, trying to optimize the code.

    @mdickinson
    Copy link
    Member

    Re-opening: this should probably also be fixed in 2.7 and 3.2. See bpo-16096 for discussion.

    @mdickinson mdickinson reopened this Oct 7, 2012
    @mdickinson
    Copy link
    Member

    Here's a patch for 2.7.

    @mdickinson
    Copy link
    Member

    And for 3.2

    @serhiy-storchaka
    Copy link
    Member Author

    Only one comment. test_formatting_huge_precision should use not sys.maxsize, but _testcapi.INT_MAX. Other tests can use _testcapi.PY_SSIZE_T_MAX.

    I think this tests are worth to add for 3.3 and 3.4. Your old test for this bug (064c2d0483f8) actually does not test the bug on all plathforms.

    @mdickinson
    Copy link
    Member

    Thanks for reviewing. I was being lazy with the checks; I'll fix that.

    Agreed that it's worth forward porting the tests to 3.3 and 3.4; I'll do that.

    @vstinner
    Copy link
    Member

    vstinner commented Oct 7, 2012

    For your information, I fixed recently PyUnicode_FromFormatV() to detect overflows on width and precision:

    changeset: 79543:d1369daeb9ec
    user: Victor Stinner <victor.stinner@gmail.com>
    date: Sat Oct 06 23:05:00 2012 +0200
    files: Objects/unicodeobject.c
    description:
    Issue bpo-16147: PyUnicode_FromFormatV() now detects integer overflow when parsing
    width and precision

    @serhiy-storchaka
    Copy link
    Member Author

    Mark, can I help?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 28, 2012

    New changeset 21fb1767e185 by Mark Dickinson in branch '2.7':
    Issue bpo-14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
    http://hg.python.org/cpython/rev/21fb1767e185

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 28, 2012

    New changeset 102df748572d by Mark Dickinson in branch '3.2':
    Issue bpo-14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
    http://hg.python.org/cpython/rev/102df748572d

    New changeset 79ea0c84152a by Mark Dickinson in branch '3.3':
    Issue bpo-14700: merge tests from 3.2.
    http://hg.python.org/cpython/rev/79ea0c84152a

    New changeset 22c8e6d71529 by Mark Dickinson in branch 'default':
    Issue bpo-14700: merge tests from 3.3.
    http://hg.python.org/cpython/rev/22c8e6d71529

    @mdickinson
    Copy link
    Member

    Fixed in 2.7 and 3.2; extra tests ported to 3.3 and default. Reclosing.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants