Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integer overflow in encoding unicode #66708

Closed
pkt mannequin opened this issue Sep 29, 2014 · 17 comments
Closed

integer overflow in encoding unicode #66708

pkt mannequin opened this issue Sep 29, 2014 · 17 comments
Labels
type-security A security issue

Comments

@pkt
Copy link
Mannequin

pkt mannequin commented Sep 29, 2014

BPO 22518
Nosy @birkenfeld, @vstinner, @benjaminp, @serhiy-storchaka
Files
  • poc_encode_latin1.py
  • codecs_error_handlers_overflow_tests.patch: Attempt to test integer overflow
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-12-10.19:03:11.179>
    created_at = <Date 2014-09-29.21:01:25.365>
    labels = ['type-security']
    title = 'integer overflow in encoding unicode'
    updated_at = <Date 2014-12-10.19:03:11.179>
    user = 'https://bugs.python.org/pkt'

    bugs.python.org fields:

    activity = <Date 2014-12-10.19:03:11.179>
    actor = 'benjamin.peterson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-12-10.19:03:11.179>
    closer = 'benjamin.peterson'
    components = []
    creation = <Date 2014-09-29.21:01:25.365>
    creator = 'pkt'
    dependencies = []
    files = ['36754', '37285']
    hgrepos = []
    issue_num = 22518
    keywords = ['patch']
    message_count = 17.0
    messages = ['227837', '227843', '227853', '227908', '227910', '227912', '227915', '227916', '227917', '227923', '228447', '228448', '228455', '228456', '231679', '231681', '231704']
    nosy_count = 7.0
    nosy_names = ['georg.brandl', 'vstinner', 'benjamin.peterson', 'Arfrever', 'python-dev', 'serhiy.storchaka', 'pkt']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue22518'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4', 'Python 3.5']

    @pkt
    Copy link
    Mannequin Author

    pkt mannequin commented Sep 29, 2014

    # static PyObject *
    # unicode_encode_ucs1(PyObject *unicode,
    # const char *errors,
    # unsigned int limit)
    # {
    # ...
    # while (pos < size) {
    # ...
    # case 4: /* xmlcharrefreplace */
    # /* determine replacement size */
    # for (i = collstart, repsize = 0; i < collend; ++i) {
    # Py_UCS4 ch = PyUnicode_READ(kind, data, i);
    # ...
    # else if (ch < 100000)
    # 1 repsize += 2+5+1;
    # ...
    # }
    # 2 requiredsize = respos+repsize+(size-collend);
    # if (requiredsize > ressize) {
    # ...
    # if (_PyBytes_Resize(&res, requiredsize))
    # ...
    # }
    # /* generate replacement */
    # for (i = collstart; i < collend; ++i) {
    # 3 str += sprintf(str, "&#%d;", PyUnicode_READ(kind, data, i));
    # }
    #
    # 1. ch=0xffff<100000, so repsize = (number of unicode chars in string)*8
    # =2^29*2^3=2^32 == 0 (mod 2^32)
    # 2. respos==0, collend==0, so requiredsize=repsize==0, so the destination buffer
    # isn't resized
    # 3. overwrite

    @pkt pkt mannequin added the type-crash A hard crash of the interpreter, possibly with a core dump label Sep 29, 2014
    @serhiy-storchaka
    Copy link
    Member

    Looks very similar to bpo-22470.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 29, 2014

    New changeset b2e68274aa8e by Benjamin Peterson in branch '2.7':
    cleanup overflowing handling in unicode_decode_call_errorhandler and unicode_encode_ucs1 (closes bpo-22518)
    https://hg.python.org/cpython/rev/b2e68274aa8e

    New changeset 3b7e93249700 by Benjamin Peterson in branch '2.7':
    add NEWS note for bpo-22518
    https://hg.python.org/cpython/rev/3b7e93249700

    New changeset 3c67d19c624f by Benjamin Peterson in branch '3.3':
    cleanup overflowing handling in unicode_decode_call_errorhandler and unicode_encode_ucs1 (closes bpo-22518)
    https://hg.python.org/cpython/rev/3c67d19c624f

    New changeset 88332ea4c140 by Benjamin Peterson in branch '3.3':
    NEWS issue for bpo-22518
    https://hg.python.org/cpython/rev/88332ea4c140

    New changeset 7dab27fffff2 by Benjamin Peterson in branch '3.4':
    merge 3.3 (closes bpo-22518)
    https://hg.python.org/cpython/rev/7dab27fffff2

    New changeset f86fde20e9ce by Benjamin Peterson in branch 'default':
    merge 3.4 (closes bpo-22518)
    https://hg.python.org/cpython/rev/f86fde20e9ce

    @python-dev python-dev mannequin closed this as completed Sep 29, 2014
    @vstinner
    Copy link
    Member

    New changeset f86fde20e9ce by Benjamin Peterson in branch 'default':
    merge 3.4 (closes bpo-22518)
    https://hg.python.org/cpython/rev/f86fde20e9ce

    This changeset added ">>>> other". It looks like you commited a conflict.
    •    if (requiredsize\<2\*outsize)
      

    + if (outsize <= PY_SSIZE_T_MAX/2 && requiredsize < 2*outsize)
    requiredsize = 2*outsize;

    I'm not sure that this change is correct. Why not raising an exception on overflow?

    @vstinner
    Copy link
    Member

    It would be nice to add a bigmem test to check that repr('\x00'*(2**30+1)) doesn't crash anymore.

    @vstinner
    Copy link
    Member

    It would be nice to add a bigmem test to check that repr('\x00'*(2**30+1)) doesn't crash anymore.

    Ooops, wrong issue, the test is : ("\uffff" * (2**29)).encode("latin1", errors="xmlcharrefreplace").

    @vstinner vstinner reopened this Sep 30, 2014
    @serhiy-storchaka
    Copy link
    Member

    I'm not sure that this change is correct. Why not raising an exception on
    overflow?

    This is correct. This check prevents overflow.

    @vstinner
    Copy link
    Member

    This is correct. This check prevents overflow.

    Oh, I didn't understand that "requiredsize = 2*outsize;" is only used for performances, to overallocate the buffer. So I agree that it's fine to not overallocate if it would overflow.

    @vstinner vstinner added type-security A security issue and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Sep 30, 2014
    @birkenfeld
    Copy link
    Member

    Benjamin, could you make a patch for 3.2 as well?

    @serhiy-storchaka
    Copy link
    Member

    Ooops, wrong issue, the test is : ("\uffff" * (2**29)).encode("latin1",
    errors="xmlcharrefreplace").

    ("\uffff" * (sys.maxsize//8+1)).encode("latin1", errors="xmlcharrefreplace")

    or

    ("\xff" * (sys.maxsize//6+1)).encode("ascii", errors="xmlcharrefreplace")

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 4, 2014

    New changeset 3f7519f633ed by Serhiy Storchaka in branch '2.7':
    Issue bpo-22518: Fixed integer overflow issues in "backslashreplace" and
    https://hg.python.org/cpython/rev/3f7519f633ed

    New changeset ec9b7fd246b6 by Serhiy Storchaka in branch '3.4':
    Issue bpo-22518: Fixed integer overflow issues in "backslashreplace",
    https://hg.python.org/cpython/rev/ec9b7fd246b6

    New changeset 2df4cc31c36e by Serhiy Storchaka in branch 'default':
    Issue bpo-22518: Fixed integer overflow issues in "backslashreplace",
    https://hg.python.org/cpython/rev/2df4cc31c36e

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 4, 2014

    New changeset d1be1f355f59 by Serhiy Storchaka in branch '2.7':
    Fixed compilation error introduced in 3f7519f633ed (issue bpo-22518).
    https://hg.python.org/cpython/rev/d1be1f355f59

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 4, 2014

    New changeset 51317c9786f5 by Serhiy Storchaka in branch '3.3':
    Issue bpo-22518: Fixed integer overflow issues in "backslashreplace",
    https://hg.python.org/cpython/rev/51317c9786f5

    @serhiy-storchaka
    Copy link
    Member

    Sorry for noise, these changes are related to bpo-22470.

    @serhiy-storchaka
    Copy link
    Member

    Do you want to add a bigmem test or close this issue Benjamin?

    @benjaminp
    Copy link
    Contributor

    I wouldn't object if you had a patch.

    @serhiy-storchaka
    Copy link
    Member

    Integer overflow errors were fixed in 4 error handlers: surrogatepass, backslashreplace, namereplace, and xmlcharrefreplace. Is is hard to write general robust tests. In worst cases they requires more than sys.maxsize or even sys.maxsize*2 memory and for sure fail with MemoryError. It is possible to write a test for xmlcharrefreplace, but it will not robust, after changes of implementation details it could raise MemoryError instead of OverflowError after consuming all address space.

    So I suggest close this issue without tests. Such tests are useless.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants