
classification
Title: integer overflow in computing unicode's object representation
Type: security
Stage: resolved
Components:
Versions: Python 3.3, Python 3.4, Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, benjamin.peterson, pkt, python-dev, vstinner
Priority: normal
Keywords:

Created on 2014-09-29 21:04 by pkt, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name | Uploaded | Description
poc_repr_unicode.py | pkt, 2014-09-29 21:04 |
Messages (4)
msg227839 - (view) Author: paul (pkt) Date: 2014-09-29 21:04
# unicode_repr(PyObject *unicode)
# {
#     ...
# 1   isize = PyUnicode_GET_LENGTH(unicode);
#     idata = PyUnicode_DATA(unicode);
# 
#     /* Compute length of output, quote characters, and
#        maximum character */
#     osize = 0;
#     ...
#     for (i = 0; i < isize; i++) {
#         Py_UCS4 ch = PyUnicode_READ(ikind, idata, i);
#         switch (ch) {
#         ...
#         default:
#             /* Fast-path ASCII */
#             if (ch < ' ' || ch == 0x7f)
# 2               osize += 4; /* \xHH */ 
#             ...
#         }
#     }
# 
#     ...
# 3   repr = PyUnicode_New(osize, max);
#     ...
#         for (i = 0, o = 1; i < isize; i++) {
#             Py_UCS4 ch = PyUnicode_READ(ikind, idata, i);
#             ...
#                 else {
# 4                   PyUnicode_WRITE(okind, odata, o++, ch);
#                 }
#             }
#         }
#     }
#     /* Closing quote already added at the beginning */
# 5   assert(_PyUnicode_CheckConsistency(repr, 1));
#     return repr;
# }
# 
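# 1. isize = 2^30 + 1, with every character taking the \xHH path (e.g. '\x00')
# 2. osize = isize * 4 = 2^32 + 4, which wraps around to 4 in a 32-bit Py_ssize_t
# 3. the allocated buffer is therefore far too small
# 4. the write loop overflows the heap buffer
# 5. this assert will likely fail, since there is a good chance the small
#    output buffer is allocated just before the huge input string, so the
#    writes run into the huge string and it ends up overwriting itself.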
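
The arithmetic in step 2 can be checked with a small Python model. This is only an illustrative sketch: `wrap_ssize32` is a hypothetical helper that imitates two's-complement wraparound of a signed 32-bit `Py_ssize_t`. In C, signed overflow is undefined behavior, so wraparound is what typically happens in practice rather than something the language guarantees.

```python
def wrap_ssize32(value):
    """Reduce an integer to the range of a signed 32-bit type (hypothetical model)."""
    value &= 0xFFFFFFFF                      # keep the low 32 bits
    if value >= 0x80000000:                  # reinterpret the top bit as the sign
        value -= 0x100000000
    return value

isize = 2**30 + 1                            # input length from step 1
true_osize = isize * 4                       # 4 output characters (\xHH) per input character
osize = wrap_ssize32(true_osize)             # what a 32-bit counter would end up holding

print(true_osize)                            # 4294967300
print(osize)                                 # 4
```

The output buffer is sized from the wrapped value, while the write loop still iterates over all `isize` input characters.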
msg227867 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-09-30 03:03
New changeset 8ba7e5f43952 by Benjamin Peterson in branch '3.3':
prevent overflow in unicode_repr (closes #22520)
https://hg.python.org/cpython/rev/8ba7e5f43952

New changeset 6f54dfa675eb by Benjamin Peterson in branch '3.4':
merge 3.3 (#22520)
https://hg.python.org/cpython/rev/6f54dfa675eb

New changeset 245d9679cd5b by Benjamin Peterson in branch 'default':
merge 3.4 (#22520)
https://hg.python.org/cpython/rev/245d9679cd5b
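
The changesets themselves are not reproduced in this issue. As a rough illustration of how a size accumulation like `osize += 4` can be guarded, the following Python sketch models the common pattern of checking against the maximum before adding. The names `checked_add` and `SSIZE_MAX_32` and the error message are assumptions made for this example, not taken from the actual fix.

```python
SSIZE_MAX_32 = 2**31 - 1   # assumed PY_SSIZE_T_MAX on a 32-bit build

def checked_add(osize, incr, limit=SSIZE_MAX_32):
    """Add incr to osize, refusing to exceed limit (hypothetical helper)."""
    # Compare before adding, so the sum itself can never overflow.
    if osize > limit - incr:
        raise OverflowError("output size would exceed the maximum")
    return osize + incr

# Accumulating 4 output characters per input character stays safe for small
# inputs and raises as soon as the running total would pass the limit.
total = 0
for _ in range(3):
    total = checked_add(total, 4)
print(total)               # 12
```

Rearranging the comparison as `osize > limit - incr` matters in C, because computing `osize + incr` first would already be the overflow the check is meant to prevent.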
msg227911 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-09-30 13:11
It would be nice to add a bigmem test to check that repr('\x00'*(2**30+1)) doesn't crash anymore.
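
A minimal sketch of what such a check could assert, written so that it runs at any size. `check_control_repr` is a hypothetical name. In CPython's test suite a test like this would presumably be wrapped with the `test.support.bigmemtest` decorator and only exercise the overflow when run with `n = 2**30 + 1` on a 32-bit build.

```python
def check_control_repr(n):
    """Verify repr() of n NUL characters (hypothetical helper)."""
    text = '\x00' * n
    result = repr(text)
    # Each NUL becomes the four characters backslash, x, 0, 0,
    # and the result is wrapped in two quote characters.
    assert len(result) == 4 * n + 2
    assert result == "'" + "\\x00" * n + "'"
    return len(result)

# A small size keeps the sketch cheap; the case from this issue needs
# several gigabytes of memory for the input and output strings.
print(check_control_repr(1000))   # 4002
```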
msg232439 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2014-12-10 19:07
As Serhiy has noted on other bugs, the fact that such tests must be restricted to 32-bit builds unfortunately limits their usefulness.
History
Date | User | Action | Args
2022-04-11 14:58:08 | admin | set | github: 66710
2014-12-10 19:07:11 | benjamin.peterson | set | status: open -> closed; nosy: + benjamin.peterson; messages: + msg232439; resolution: fixed
2014-09-30 13:42:08 | vstinner | set | type: crash -> security
2014-09-30 13:11:14 | vstinner | set | status: closed -> open; nosy: + vstinner; messages: + msg227911; resolution: fixed -> (no value)
2014-09-30 03:03:32 | python-dev | set | status: open -> closed; nosy: + python-dev; messages: + msg227867; resolution: fixed; stage: resolved
2014-09-30 03:02:43 | benjamin.peterson | set | versions: + Python 3.3, Python 3.5
2014-09-29 23:21:21 | Arfrever | set | nosy: + Arfrever
2014-09-29 21:04:19 | pkt | create |