
classification
Title: integer overflow in _json.encode_basestring_ascii
Type: crash
Stage: resolved
Components:
Versions: Python 3.3, Python 3.4, Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, python-dev, serhiy.storchaka, tehybel_
Priority: normal
Keywords: patch

Created on 2015-02-01 13:59 by pkt, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded
poc_ascii_escape.py pkt, 2015-02-01 13:59
test_encode_basestring_ascii_overflow.patch serhiy.storchaka, 2015-02-02 13:09
Messages (6)
msg235177 - (view) Author: paul (pkt) Date: 2015-02-01 13:59
# static PyObject *
# ascii_escape_unicode(PyObject *pystr)
# {
#     ...
# 
#     input_chars = PyUnicode_GET_LENGTH(pystr);
#     input = PyUnicode_DATA(pystr);
#     kind = PyUnicode_KIND(pystr);
# 
#     /* Compute the output size */
#     for (i = 0, output_size = 2; i < input_chars; i++) {
#         Py_UCS4 c = PyUnicode_READ(kind, input, i);
#         if (S_CHAR(c))
#             output_size++;
#         else {
#             switch(c) {
#             ...
#             default:
# 1               output_size += c >= 0x10000 ? 12 : 6;
#     ...
# 
# 2   rval = PyUnicode_New(output_size, 127);
# 
# 1. If c is \uFFFF, then output_size += 6. There are no overflow checks on this
#    variable, so a sufficiently long string (2**32/6+1 chars) overflows it.
# 2. The rval buffer is then too small to hold the result.
# 
# Crash:
# ------
#  
# Breakpoint 3, ascii_escape_unicode (pystr='...') at /home/p/Python-3.4.1/Modules/_json.c:198
# 198         rval = PyUnicode_New(output_size, 127);
# (gdb) print output_size
# $9 = 4
# (gdb) c
# Continuing.
#  
# Program received signal SIGSEGV, Segmentation fault.
# 0x4057888f in ascii_escape_unichar (c=65535,
#     output=0x40572358 "...",
#     chars=19624) at /home/p/Python-3.4.1/Modules/_json.c:155
# 155                 output[chars++] = Py_hexdigits[(c >>  8) & 0xf];
# 
# OS info
# -------
# 
# % ./python -V
# Python 3.4.1
#  
# % uname -a
# Linux ubuntu 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 15:31:16 UTC 2013 i686 i686 i386 GNU/Linux
#  
 
from _json import encode_basestring_ascii as enc
s = "\uffff" * ((2**32) // 6 + 1)
enc(s)
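
The arithmetic in the report can be checked without allocating the string. The following is a minimal sketch that models `output_size` as a signed 32-bit counter (an assumption matching the i686 build above, and assuming the overflow wraps in two's complement as it does in practice). `wrap_signed`, `unchecked_output_size`, and `checked_output_size` are hypothetical helper names, not CPython functions, and the guarded variant only illustrates one way to reject oversized input, not necessarily the committed fix.

```python
# Model of the size computation described in the report.
# Assumption: Py_ssize_t is a signed 32-bit integer, as on the i686 build.
SSIZE_BITS = 32
SSIZE_MAX = 2 ** (SSIZE_BITS - 1) - 1


def wrap_signed(value, bits=SSIZE_BITS):
    """Reduce value to a signed two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def unchecked_output_size(n_chars, per_char=6):
    """Size as an unguarded loop would compute it: 2 quotes plus per_char each."""
    return wrap_signed(2 + n_chars * per_char)


def checked_output_size(n_chars, per_char=6):
    """Same computation, but refusing any result above SSIZE_MAX."""
    size = 2 + n_chars * per_char
    if size > SSIZE_MAX:
        raise OverflowError("string is too long to escape")
    return size


n = 2 ** 32 // 6 + 1
print(n)                         # 715827883
print(2 + n * 6)                 # 4294967300
print(unchecked_output_size(n))  # 4
```

The wrapped value of 4 agrees with the `output_size` printed in the gdb session, while the true size needed is a little over 4 GiB.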
msg235213 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-01 23:02
New changeset 8699b3085db3 by Benjamin Peterson in branch '3.3':
fix possible overflow in encode_basestring_ascii (closes #23369)
https://hg.python.org/cpython/rev/8699b3085db3

New changeset 4f47509d7417 by Benjamin Peterson in branch '3.4':
merge 3.3 (#23369)
https://hg.python.org/cpython/rev/4f47509d7417

New changeset 02aeca4974ac by Benjamin Peterson in branch 'default':
merge 3.4 (#23369)
https://hg.python.org/cpython/rev/02aeca4974ac
msg235255 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-02-02 13:09
"\uffff"*((2**32)//6 + 1) is calculated at compile time. This requires a lot of memory and can cause swapping. Maybe this was the cause of the failing tests on some buildbots:

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.x/builds/2623/steps/test/logs/stdio
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.4/builds/749/steps/test/logs/stdio

Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.4.koobs-freebsd9/build/Lib/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/home/buildbot/python/3.4.koobs-freebsd9/build/Lib/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/home/buildbot/python/3.4.koobs-freebsd9/build/Lib/test/__main__.py", line 3, in <module>
    regrtest.main_in_temp_cwd()
  File "/usr/home/buildbot/python/3.4.koobs-freebsd9/build/Lib/test/regrtest.py", line 1564, in main_in_temp_cwd
    main()
  File "/usr/home/buildbot/python/3.4.koobs-freebsd9/build/Lib/test/regrtest.py", line 738, in main
    raise Exception("Child error on {}: {}".format(test, result[1]))
Exception: Child error on test_json: Exit code -9
*** [buildbottest] Error code 1

At least, my computer hung on the first run of this test.

To prevent this string constant from being computed at compile time, you can use a variable. Also, '\x00' can be used instead of '\uffff', since it needs less memory.
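
A small-scale sketch of this suggestion follows. It assumes CPython 3.3 or later (for the compact string representation and for `sys.getsizeof`), uses the pure-Python `json.encoder.py_encode_basestring_ascii` so that it runs even without the C accelerator, and replaces the real repeat count with a small stand-in. `make_input` and `count` are hypothetical names chosen for illustration; this is not the test that was committed.

```python
import sys
from json.encoder import py_encode_basestring_ascii


def make_input(char, count):
    """Build the string at run time; count is a variable, so the
    multiplication cannot be folded into a compile-time constant."""
    return char * count


count = 1000  # small stand-in for (2**32)//6 + 1

narrow = make_input("\x00", count)    # fits a 1-byte-per-character layout
wide = make_input("\uffff", count)    # needs 2 bytes per character

# Both characters escape to a six-character \uXXXX sequence, so either
# one drives the output size computation equally hard.
assert py_encode_basestring_ascii("\x00") == '"\\u0000"'
assert py_encode_basestring_ascii("\uffff") == '"\\uffff"'
assert len(py_encode_basestring_ascii(narrow)) == 2 + 6 * count

# The input built from '\x00' is the smaller object in memory.
print(sys.getsizeof(narrow), sys.getsizeof(wide))
```

Because both inputs produce the same escaped length, switching to '\x00' roughly halves the memory for the input string without changing the size that the encoder has to compute.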
msg235298 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-02 22:47
New changeset 5c730d30ffbc by Benjamin Peterson in branch '3.3':
reduce memory usage of test (closes #23369)
https://hg.python.org/cpython/rev/5c730d30ffbc
msg272156 - (view) Author: tehybel_ (tehybel_) Date: 2016-08-08 09:54
I noticed that this is only fixed for Python 3.3 and 3.4, not for 2.7. Is that intentional? If so, why?
msg272623 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-13 23:48
New changeset 6fa0ebfdc136 by Benjamin Peterson in branch '2.7':
fix possible overflow in encode_basestring_ascii (#23369)
https://hg.python.org/cpython/rev/6fa0ebfdc136
History
Date User Action Args
2022-04-11 14:58:12 admin set github: 67558
2016-08-13 23:48:41 python-dev set messages: + msg272623
2016-08-08 09:54:10 tehybel_ set nosy: + tehybel_, - pkt; messages: + msg272156
2015-02-04 01:28:27 Arfrever set versions: + Python 3.3, Python 3.5
2015-02-02 22:47:38 python-dev set status: open -> closed; messages: + msg235298
2015-02-02 13:09:28 serhiy.storchaka set status: closed -> open; files: + test_encode_basestring_ascii_overflow.patch; nosy: + serhiy.storchaka; messages: + msg235255; keywords: + patch
2015-02-01 23:02:33 python-dev set status: open -> closed; nosy: + python-dev; messages: + msg235213; resolution: fixed; stage: resolved
2015-02-01 21:18:10 Arfrever set nosy: + Arfrever
2015-02-01 13:59:35 pkt create