This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Memory corruption due to size expansion (overflow) in _json.encode_basestring_ascii on 32 bit Python 2.7.12
Type: security Stage: resolved
Components: Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, python-dev
Priority: normal Keywords:

Created on 2016-09-27 06:54 by benjamin.peterson, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg277494 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2016-09-27 06:54
Guido Vranken reports:

This results in a segmentation fault on 32 bit:

python -c "import _json; print
_json.encode_basestring_ascii(unicode(chr(0x22)) * 0x2AAAAAAB)"


This is a tentative patch:

diff --git a/Modules/_json.c b/Modules/_json.c
index fede6b1..022bbc8 100644
--- a/Modules/_json.c
+++ b/Modules/_json.c
@@ -212,7 +212,16 @@ ascii_escape_unicode(PyObject *pystr)

     /* One char input can be up to 6 chars output, estimate 4 of these */
     output_size = 2 + (MIN_EXPANSION * 4) + input_chars;
-    max_output_size = 2 + (input_chars * MAX_EXPANSION);
+    if (input_chars > (PY_SSIZE_T_MAX / MAX_EXPANSION) ) {
+        PyErr_SetString(PyExc_ValueError, "Size of escaped string
exceeds upper boundary");
+        return NULL;
+    }
+    max_output_size = (input_chars * MAX_EXPANSION);
+    if ( max_output_size > (PY_SSIZE_T_MAX - 2)) {
+        PyErr_SetString(PyExc_ValueError, "Size of escaped string
exceeds upper boundary");
+        return NULL;
+    }
+    max_output_size += 2;
     rval = PyString_FromStringAndSize(NULL, output_size);
     if (rval == NULL) {
         return NULL;


But you still have to take account these things:

            /* This is an upper bound */
            if (new_output_size > max_output_size) {
                new_output_size = max_output_size;
            }

If this code within the if{} is reached then it merely truncates the
amount of memory that is actually required, thereby creating another
opportunity for overwrites?

And this:

            Py_ssize_t new_output_size = output_size * 2;

might overflow, but since output_size is always positive (ie. in the
range [0..PY_SSIZE_T_MAX]), an overflow would result in a negative
value. That would subsequently be caught in _PyString_Resize():

int
_PyString_Resize(PyObject **pv, Py_ssize_t newsize)
{
    register PyObject *v;
    register PyStringObject *sv;
    v = *pv;
    if (!PyString_Check(v) || Py_REFCNT(v) != 1 || newsize < 0 ||
        PyString_CHECK_INTERNED(v)) {
        *pv = 0;
        Py_DECREF(v);
        PyErr_BadInternalCall();
        return -1;
    }

Nonetheless an overflow from positive to negative is undesirable (it's
undefined behaviour).
msg277495 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-27 06:55
New changeset 9375c8834448 by Benjamin Peterson in branch '2.7':
be extremely careful about overflows in encode_basestring_ascii (closes #28284)
https://hg.python.org/cpython/rev/9375c8834448
History
Date User Action Args
2022-04-11 14:58:37adminsetgithub: 72471
2016-09-27 06:55:49python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg277495

resolution: fixed
stage: resolved
2016-09-27 06:54:57benjamin.petersoncreate