classification
Title: Decoding a highly-nested object with json (_speedups enabled) causes segfault
Type: security Stage: resolved
Components: Extension Modules Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: alex, barry, benjamin.peterson, ezio.melotti, georg.brandl, ivank, python-dev
Priority: high Keywords: patch

Created on 2011-05-06 12:25 by ivank, last changed 2011-05-07 15:31 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description Edit
issue12017.diff ezio.melotti, 2011-05-06 16:10 Patch to fix the segfault. review
Messages (8)
msg135281 - (view) Author: ivank (ivank) Date: 2011-05-06 12:25
Decoding a highly-nested object with json (_speedups enabled) can cause a segfault due to a stack overflow:

# python -c "import json; json.loads('[' * 100000 + '1' + ']' * 100000)"
zsh: segmentation fault  python -c "import json; json.loads('[' * 100000 + '1' + ']' * 100000)"

# python -c "import json; json.loads('{\"a\":' * 100000 + '1' + '}' * 100000)"
zsh: segmentation fault  python -c "import json; json.loads('{\"a\":' * 100000 + '1' + '}' * 100000)"

simplejson has the same problem:
https://github.com/simplejson/simplejson/pull/11

I've started on a fix (see patch at that URL), but it doesn't support customizing max_depth yet.
msg135284 - (view) Author: ivank (ivank) Date: 2011-05-06 12:32
<ivan`> any opinions here on what the default max_depth should be? is there any safe number?
<ivan`> I'm curious as to how many C stack frames I can use after reaching the Python recursion limit
<ivan`> and what ulimit -s looks like on every platform
<Taggnostr> http://hg.python.org/cpython/file/tip/Lib/test/test_parser.py#l537
<Taggnostr> ivan`, maybe this is useful
msg135285 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2011-05-06 12:33
Why not use Py_EnterRecursiveCall?
msg135315 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-05-06 16:10
Attached patch fixes the issues. The patch includes 6 tests that caused a segfault and that now raise a "RuntimeError: maximum recursion depth exceeded" error, matching the behavior of the Python version.

The recursion happen because scan_once_str/unicode might call _parse_object_str/unicode (for objects/dicts) and _parse_array_str/unicode (for arrays/lists), and these functions might call scan_once_str again for their inner elements.
To fix the problem I added Py_Enter/LeaveRecursiveCall around the calls to _parse_object_str and _parse_array_str in scan_once_str/unicode.

For some reason the message raised with json.loads('{"a":' * 100000 + '1' + '}' * 100000), is a generic "maximum recursion depth exceeded while calling a Python object", but that's probably not too important.
The other messages work fine.
msg135359 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-05-06 18:57
Thanks for the patch.  I'll apply this to 2.6svn for the 2.6.7rc1 release today.  Feel free to apply this to 2.6hg and forward port it to the relevant releases.
msg135361 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-05-06 19:41
Not applicable to 2.6 after all.
msg135471 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-05-07 15:29
New changeset 6fa20e360e5e by Ezio Melotti in branch '2.7':
#12017: Fix segfault in json.loads() while decoding highly-nested objects using the C accelerations.
http://hg.python.org/cpython/rev/6fa20e360e5e

New changeset 61164d09337e by Ezio Melotti in branch '3.1':
#12017: Fix segfault in json.loads() while decoding highly-nested objects using the C accelerations.
http://hg.python.org/cpython/rev/61164d09337e

New changeset db97968379dd by Ezio Melotti in branch '3.2':
#12017: merge with 3.1.
http://hg.python.org/cpython/rev/db97968379dd

New changeset 389620c9e609 by Ezio Melotti in branch 'default':
#12017: merge with 3.2.
http://hg.python.org/cpython/rev/389620c9e609
msg135472 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-05-07 15:31
Fixed, thanks Ivan for the report and Alex for suggesting Py_EnterRecursiveCall!
History
Date User Action Args
2011-05-07 15:31:34ezio.melottisetstatus: open -> closed
resolution: fixed
messages: + msg135472

stage: commit review -> resolved
2011-05-07 15:29:25python-devsetnosy: + python-dev
messages: + msg135471
2011-05-07 07:43:27terry.reedysetversions: - Python 2.6
2011-05-06 19:41:26barrysetpriority: release blocker -> high

messages: + msg135361
2011-05-06 18:57:50barrysetmessages: + msg135359
2011-05-06 16:11:00ezio.melottisetfiles: + issue12017.diff
priority: high -> release blocker


keywords: + patch
nosy: + benjamin.peterson, georg.brandl
messages: + msg135315
stage: needs patch -> commit review
2011-05-06 15:23:54ezio.melottisetassignee: ezio.melotti
2011-05-06 12:59:07pitrousetpriority: normal -> high
nosy: + barry
stage: test needed -> needs patch

versions: + Python 2.6
2011-05-06 12:33:00alexsetnosy: + alex
messages: + msg135285
2011-05-06 12:32:17ivanksetmessages: + msg135284
2011-05-06 12:31:09ezio.melottisetnosy: + ezio.melotti
stage: test needed

components: + Extension Modules, - Library (Lib)
versions: + Python 3.1, Python 3.2, Python 3.3
2011-05-06 12:25:56ivankcreate