This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: integer overflow in computing byte's object representation
Type: behavior Stage: resolved
Components: Versions: Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, christian.heimes, pkt, python-dev, vstinner
Priority: normal Keywords:

Created on 2014-09-29 21:03 by pkt, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
poc_repr_bytes.py pkt, 2014-09-29 21:03
Messages (4)
msg227838 - (view) Author: paul (pkt) Date: 2014-09-29 21:03
# PyBytes_Repr(PyObject *obj, int smartquotes)
# {
#     PyBytesObject* op = (PyBytesObject*) obj;
# 1   Py_ssize_t i, length = Py_SIZE(op);
#     size_t newsize, squotes, dquotes;
#     ...
# 
#     /* Compute size of output string */
#     newsize = 3; /* b'' */
#     s = (unsigned char*)op->ob_sval;
#     for (i = 0; i < length; i++) {
#       ...
#         default:
#             if (s[i] < ' ' || s[i] >= 0x7f)
# 2               newsize += 4; /* \xHH */
#             else
#                 newsize++;
#         }
#     }
#     ...
# 3   if (newsize > (PY_SSIZE_T_MAX - sizeof(PyUnicodeObject) - 1)) {
#       PyErr_SetString(PyExc_OverflowError,
#         "bytes object is too large to make repr");
#       return NULL;
#     }
# 4   v = PyUnicode_New(newsize, 127);
#     ...
#     *p++ = 'b', *p++ = quote;
#     for (i = 0; i < length; i++) {
#         ...
# 5         *p++ = c;
#     }
#     *p++ = quote;
# 6   assert(_PyUnicode_CheckConsistency(v, 1));
#     return v;
# }
# 
# 1. length=2^30+1=1073741825
# 2. newsize=length*4+3=7 (overflow)
# 3. check is inefficient, because newsize=7
# 4. allocated buffer is too small
# 5. buffer overwrite
# 6. this assert will likely fail, since there is a good chance the allocated
#    buffer is just before the huge one, so the huge one will overwrite itself.
msg227856 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-09-29 23:11
New changeset d9cd11eda152 by Benjamin Peterson in branch '2.7':
fix overflow checking in PyString_Repr (closes #22519)
https://hg.python.org/cpython/rev/d9cd11eda152

New changeset f5c662a7f7e6 by Benjamin Peterson in branch '3.3':
fix overflow checking in PyBytes_Repr (closes #22519)
https://hg.python.org/cpython/rev/f5c662a7f7e6

New changeset ed31cdf11ac2 by Benjamin Peterson in branch '3.4':
merge 3.3 (closes #22519)
https://hg.python.org/cpython/rev/ed31cdf11ac2

New changeset 0ddc5fc5f395 by Benjamin Peterson in branch 'default':
merge 3.4 (#22519)
https://hg.python.org/cpython/rev/0ddc5fc5f395
msg227909 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-09-30 13:08
It would be nice to have a "bigmem" test checking that repr(b'\x00'*(2**30+1)) doesn't crash anymore.
msg277326 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-24 19:06
The code doesn't crash any more. It took me more than 5 GB of resident memory and about 90 CPU seconds to reproduce the circumstances of the overflow.
History
Date User Action Args
2022-04-11 14:58:08adminsetgithub: 66709
2016-09-24 19:06:12christian.heimessetstatus: open -> closed

nosy: + christian.heimes
messages: + msg277326

type: security -> behavior
resolution: fixed
2014-09-30 13:41:58vstinnersettype: crash -> security
2014-09-30 13:11:29vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
2014-09-30 13:08:40vstinnersetnosy: + vstinner
messages: + msg227909
2014-09-30 06:37:51Arfreversetversions: + Python 2.7, Python 3.3, Python 3.5
2014-09-29 23:21:12Arfreversetnosy: + Arfrever
2014-09-29 23:11:17python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg227856

resolution: fixed
stage: resolved
2014-09-29 21:03:05pktcreate