classification
Title: Type confusion in json encoding
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Arfrever, pkt, python-dev, rhettinger, ronaldoussoren, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2015-07-22 07:01 by pkt, last changed 2016-02-06 07:10 by pkt. This issue is now closed.

Files
File name Uploaded Description Edit
json_markers.py pkt, 2015-07-22 08:48
json_make_encoder_typecheck.patch serhiy.storchaka, 2015-07-26 05:34 review
eip.py pkt, 2016-01-26 12:24
Messages (18)
msg247093 - (view) Author: paul (pkt) Date: 2015-07-22 07:01
on-35dm-i386-linux-gnu.so`encoder_listencode_list(s=0xb6f90394, acc=0xbfc42c28, seq=0xb6f2361c, indent_level=1) + 655 at _json.c:1800
#     frame #2: 0xb6e4366d _json.cpython-35dm-i386-linux-gnu.so`encoder_listencode_obj(s=0xb6f90394, acc=0xbfc42c28, obj=0xb6f2361c, indent_level=1) + 733 at _json.c:1554
#     frame #3: 0xb6e3fc4f _json.cpython-35dm-i386-linux-gnu.so`encoder_call(self=0xb6f90394, args=0xb7049304, kwds=0x00000000) + 319 at _json.c:1386
#     frame #4: 0x080c5758 python`PyObject_Call(func=0xb6f90394, arg=0xb7049304, kw=0x00000000) + 264 at abstract.c:2149
# 
# This is a type confusion bug. encoder->markers can be initialized to an
# arbitrary object (string in this POC). PyDict_Contains trusts the caller that
# "op" is a dictionary without checking. Some callers can't be trusted :)
msg247095 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-07-22 07:22
I don't understand the issue. Can you elaborate?

What is your code? What is the current result? What is the expected result? What is your platform? What is your Python version? etc.
msg247100 - (view) Author: paul (pkt) Date: 2015-07-22 08:49
Sorry, I uploaded a test case.
msg247111 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2015-07-22 11:51
In encoder_init (the __init__ for _json.Encoder) s->marker is set to an argument of __init__, without any kind of type check, it can therefore be an arbitrary object.

encoder_listencode_obj (and other functions) then use s->markers with the concrete API for dicts (such as PyDict_Contains). 

PyDict_Contains does not perform a type check, but casts its first argument to a PyDictObject and access fields. That causes problems when the marker isn't actually a dict.

I don't know the module good enough to be 100% sure about a fix, but I think it would be best to add a type check to encoder_init. 

BTW. As far as I know _json.make_encoder is a private API and shouldn't be used directly, when you use the public API the argument will always be a dict.
msg247406 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-26 05:34
There is similar issue with key_separator and item_separator in 3.x. They are used with _PyAccu_Accumulate that performs a type check only in assert().

Here is a patch.
msg247408 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-07-26 05:45
Patch LGTM.
msg247410 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-07-26 06:04
New changeset b3d0bf112f70 by Serhiy Storchaka in branch '3.4':
Issue #24683: Fixed crashes in _json functions called with arguments of
https://hg.python.org/cpython/rev/b3d0bf112f70

New changeset ef4d09399b99 by Serhiy Storchaka in branch '3.5':
Issue #24683: Fixed crashes in _json functions called with arguments of
https://hg.python.org/cpython/rev/ef4d09399b99

New changeset 7de4abf4eed2 by Serhiy Storchaka in branch 'default':
Issue #24683: Fixed crashes in _json functions called with arguments of
https://hg.python.org/cpython/rev/7de4abf4eed2
msg247412 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-07-26 06:08
New changeset 0a1266ef1b5d by Serhiy Storchaka in branch '2.7':
Issue #24683: Fixed a crash in _json.make_encoder() called with non-dict 1st argument.
https://hg.python.org/cpython/rev/0a1266ef1b5d
msg247413 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-26 06:10
Thank you for your report paul. Thanks for review Raymond.
msg247454 - (view) Author: paul (pkt) Date: 2015-07-27 07:38
resolution: not a bug

^ because of private API?
msg247455 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-07-27 07:41
> resolution: not a bug

Since Serhiy pushed changes, the issue is fixed and it is a real bug.

Thanks for the report paul.
msg247456 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-27 08:00
This was my fault. I want to set resolution to "fixed" but missed. Victor has corrected this.
msg258950 - (view) Author: paul (pkt) Date: 2016-01-26 12:24
Proof of EIP control.
msg258951 - (view) Author: paul (pkt) Date: 2016-01-26 12:25
GDB dump of running ./python eip.py

_______________________________________________________________________________
     eax:37A317DD ebx:B7A54268  ecx:BFFFE22C  edx:11223344     eflags:00010217
     esi:B7A61060 edi:B7AA6714  esp:BFFFE20C  ebp:B7A317DC     eip:11223344
     cs:0073  ds:007B  es:007B  fs:0000  gs:0033  ss:007B    o d I t s z A P C
[007B:BFFFE20C]---------------------------------------------------------[stack]
BFFFE23C : 10 FA A1 B7  60 10 A6 B7 - 68 42 A5 B7  00 60 A2 B7 ....`...hB...`..
BFFFE22C : 60 17 A6 B7  10 68 2B 08 - 00 60 A2 B7  DC 17 A3 B7 `....h+..`......
BFFFE21C : 2C E2 FF BF  DC 17 A3 B7 - 3C E2 FF BF  00 00 00 00 ,.......<.......
BFFFE20C : AE 07 0D 08  60 10 A6 B7 - 68 42 A5 B7  DD 17 A3 37 ....`...hB.....7
[0073:11223344]---------------------------------------------------------[ code]
=> 0x11223344:  Error while running hook_stop:
Cannot access memory at address 0x11223344
0x11223344 in ?? ()
msg259679 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-05 17:18
I can't reproduce your example paul.

$ ./python eip.py 
Traceback (most recent call last):
  File "eip.py", line 21, in <module>
    e = j.make_encoder(markers, None, enc, 4, "ks", "is", False, True, True)
TypeError: make_encoder() argument 1 must be dict or None, not array.array
msg259681 - (view) Author: paul (pkt) Date: 2016-02-05 17:39
Can you try on 2.7 branch?
msg259684 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-05 17:58
The same result on 2.7 branch:

$ ./python ../cpython/eip.py 
Traceback (most recent call last):
  File "../cpython/eip.py", line 21, in <module>
    e = j.make_encoder(markers, None, enc, 4, "ks", "is", False, True, True)
TypeError: make_encoder() argument 1 must be dict or None, not str
msg259719 - (view) Author: paul (pkt) Date: 2016-02-06 07:10
Sorry, I wasn't clear enough. This POC is a proof that the original bug can be used for EIP control. I just checked and it works as advertised on 2.7 revision: https://hg.python.org/cpython/rev/2d39777f3477 - it's a parent of https://hg.python.org/cpython/rev/0a1266ef1b5d containing the patch for this issue. I added this file, because I submitted a bug on hackerone claiming EIP control.
History
Date User Action Args
2016-02-06 07:10:47pktsetmessages: + msg259719
2016-02-05 17:58:45serhiy.storchakasetmessages: + msg259684
2016-02-05 17:39:15pktsetmessages: + msg259681
2016-02-05 17:18:53serhiy.storchakasetmessages: + msg259679
2016-01-26 12:25:03pktsetmessages: + msg258951
2016-01-26 12:24:08pktsetfiles: + eip.py

messages: + msg258950
2015-08-17 04:34:56Arfreversetnosy: + Arfrever
2015-07-27 08:00:08serhiy.storchakasetmessages: + msg247456
2015-07-27 07:41:00vstinnersetresolution: not a bug -> fixed
messages: + msg247455
2015-07-27 07:38:13pktsetmessages: + msg247454
2015-07-26 06:10:26serhiy.storchakasetstatus: open -> closed
resolution: not a bug
messages: + msg247413

stage: patch review -> resolved
2015-07-26 06:08:55python-devsetmessages: + msg247412
2015-07-26 06:04:42python-devsetnosy: + python-dev
messages: + msg247410
2015-07-26 05:45:53rhettingersetnosy: + rhettinger
messages: + msg247408
2015-07-26 05:34:50serhiy.storchakasetfiles: + json_make_encoder_typecheck.patch
versions: + Python 2.7, Python 3.4, Python 3.6
messages: + msg247406

keywords: + patch
stage: needs patch -> patch review
2015-07-22 11:51:45ronaldoussorensetnosy: + ronaldoussoren
messages: + msg247111
2015-07-22 11:11:34serhiy.storchakasetassignee: serhiy.storchaka

nosy: + serhiy.storchaka
components: + Extension Modules
stage: needs patch
2015-07-22 08:49:03pktsetmessages: + msg247100
2015-07-22 08:48:34pktsetfiles: + json_markers.py
2015-07-22 07:22:11vstinnersetnosy: + vstinner
messages: + msg247095
2015-07-22 07:01:58pktcreate