classification
Title: An exploitable segmentation fault in marshal module
Type: security Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: duplicate
Dependencies: Superseder: Pickle crashes unpickling invalid NEWOBJ_EX opcode
View: 41288
Assigned To: Nosy List: Iman Sharafaldin, christian.heimes, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2020-07-04 11:56 by Iman Sharafaldin, last changed 2020-08-03 17:29 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
Crash.zip Iman Sharafaldin, 2020-07-04 11:56
Messages (23)
msg372990 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-04 11:56
It seems that all versions of Python 3 are vulnerable to de-marshaling the attached file (Python file is included). I've tested on Python 3.10.0a0 (heads/master:b40e434, Jul  4 2020), Python 3.6.11 and Python 3.7.2. This is due to lack of proper validation at Objects/tupleobject.c:413 (heads/master:b40e434).
 
This is the result of GDB's Exploitable plugin (it's exploitable):
Description: Access violation during branch instruction
Short description: BranchAv (4/22)
Hash: e04b830dfb409a8bbf67bff96ff0df44.4d31b48b56e0c02ed51520182d91a457
Exploitability Classification: EXPLOITABLE
Explanation: The target crashed on a branch instruction, which may indicate that the control flow is tainted.
Other tags: AccessViolation (21/22)
msg373072 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-06 07:20
How did you get this file?
msg373102 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-07-06 11:59
According to the Python Security Model, this issue is not a security vulnerability:
(*) https://python-security.readthedocs.io/security.html#python-security-model

The marshal module is not intended to be used to load untrusted code. That's why its documentation contains the red warning:
"The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."
https://docs.python.org/dev/library/marshal.html
msg373103 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-06 12:02
By using our proprietary fuzzer. I'm a cybersecurity researcher.
msg373105 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-06 12:02
What about patching that as a crash?
msg373108 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-07-06 12:06
Python doesn't implement any protection against invalid PYC files to avoid any performance overhead at runtime. Maybe we can close this issue as WONTFIX.
msg373117 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-07-06 13:58
Python's threat model is:
If an attacker can create a malicious PYC file and feed it to a Python process, then they already have full code execution privileges. There is no need to exploit a segfault. Because the marshal module should only be used for PYC files, they can straight out execute any Python code at import time. That's much simpler and works on all operating systems.
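To illustrate the point above: the payload of a PYC file is a marshalled code object, so whoever controls those bytes already controls what gets executed. The following is a minimal sketch; the source string and the `<demo>` filename are made up for the example.

```python
import marshal

# Compile a tiny module body into a code object, as the import system does.
source = "answer = 6 * 7"
code = compile(source, "<demo>", "exec")

# marshal serializes the code object; a PYC file stores bytes like these
# after its header.
blob = marshal.dumps(code)

# Loading and executing the bytes runs whatever code they describe.
restored = marshal.loads(blob)
namespace = {}
exec(restored, namespace)
print(namespace["answer"])
```

Nothing in this round trip checks where `blob` came from, which is why the data must be trusted.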
msg373119 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-06 14:09
I thought it was like pickle. So if we found an exploitable segfault in pickle alone, would you count it as a threat?
msg373122 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-06 14:35
No. Unlike marshal, the pickle format is a Turing-complete language. Just loading pickle data can cause execution of arbitrary code. marshal is more "safe" in this regard: in the worst case, loading it can just crash.

It may be interesting to make marshal deserialization more robust if it does not affect performance. But it would be a new feature, not a bug fix, and not a security fix.
msg373124 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-07-06 14:36
Yes, it's like pickle, but it is not like you think.

The pickle module has a similar security disclaimer, https://docs.python.org/dev/library/pickle.html . We might agree to fix segfaults in unpickler code if the fix is simple and does not cause backwards compatibility or performance regressions. It's more likely that we decide against it because the pickle format is inherently insecure and not designed to handle untrusted data.
msg373126 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-06 14:47
In this particular case, unmarshalling creates a tuple containing a reference to itself, which is used as a key in a dict. Calculating the hash of such a tuple leads to infinite recursion, which overflows the C stack. There is no efficient way to detect such a case, and since cyclic tuples cannot be created by pure Python code, we should not even try to solve this problem. You can get one only by misusing the C API or the ctypes module, or by loading invalid marshal data.
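A small sketch of why pure Python code does not hit this case: a tuple is immutable, so a cycle through it has to pass through a mutable container such as a list, and a list is unhashable. Hashing therefore fails with `TypeError` instead of recursing. The variable names are illustrative only.

```python
# Build a cycle: t -> cell -> t. The tuple cannot refer to itself directly,
# because it does not exist yet when its items are evaluated.
cell = []
t = (cell,)
cell.append(t)

# The cycle exists, but only through the mutable list.
print(t[0][0] is t)

# Hashing the tuple hashes its items; the list item is rejected,
# so the recursion never starts.
try:
    hash(t)
    hashed = True
except TypeError:
    hashed = False
print(hashed)
```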
msg373129 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-06 15:04
It's interesting that you would not count a critical segfault in pickle as a threat, because numerous libraries unpickle untrusted user data (some of them use RestrictedUnpickler to protect themselves, but a segfault would bypass that). For example, the Ray project, with five thousand commits (https://github.com/ray-project/ray/blob/master/rllib/utils/policy_server.py#L31).

Long story short, you advise us not to spend time checking the security of the pickle module either, am I right?

Thanks,
Iman
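For readers unfamiliar with the RestrictedUnpickler approach mentioned above, here is a sketch modelled on the "Restricting Globals" example in the pickle documentation. The allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are names chosen for this example. It only filters which globals a pickle may reference; it does nothing about a crash inside the unpickler itself.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list for this sketch.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but with the global lookup restricted."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data and allow-listed globals load normally.
allowed = restricted_loads(pickle.dumps([1, 2, range(3)]))

# A reference to os.system is rejected at load time.
try:
    restricted_loads(pickle.dumps(os.system))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```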
msg373132 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-07-06 15:26
That line in the Ray project is a potential arbitrary code execution vulnerability. If an attacker is able to inject a custom pickle stream, then they can easily take over the service. Please report the issue to the project. It might be an easy CVE for you.

Python has several functions and modules that are not designed to deal with malicious data. They are documented as insecure. The pickle format was created 25 years ago. It's a useful serialization format but it's inherently insecure.

tl;dr we welcome any and all work to make Python more secure, but we cannot make every part of the interpreter secure. Pickle and marshal are two modules that you should ignore.
msg373139 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-06 16:35
Sure. Thank you.
msg373525 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-11 14:07
Nevertheless, I have an exploitable crash for the pickle module too right now, but as you're not interested, I haven't opened an issue to share it. Thanks anyway.
msg373556 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-07-12 14:38
By design, it is trivial to run arbitrary Python code using pickle. There is no need to exploit a segfault for that.
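A harmless sketch of what "by design" means here: an object's `__reduce__` method can tell pickle to call any importable callable with any arguments at load time. The class name `Demo` is made up, and `len` stands in for the callable an attacker would actually choose.

```python
import pickle


class Demo:
    def __reduce__(self):
        # "To rebuild this object, call len('abc')". The unpickler makes
        # this call while loading; it does not need the Demo class at all.
        return (len, ("abc",))


payload = pickle.dumps(Demo())

# Loading the payload calls len("abc") and returns its result.
result = pickle.loads(payload)
print(result)
```

Replacing `len` with a function that runs a command is all an attacker needs, which is why a segfault adds nothing to the attack.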
msg373557 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-12 14:45
There are many online Python interpreters. We can use this malicious file to escape their sandboxes and take control of their Docker container or system (and abuse them, for example, to conduct a DoS attack), as they fully trust that Python doesn't segfault.
For example, the following code clearly kills the interpreter (and shellcode could be attached), even though they have protection mechanisms for file access and many other things.

-----------
https://www.programiz.com/python-programming/online-compiler/
-----------

import io
import marshal



hex_string = "FBE901000000DA0136E90209000072010000007203000000DA0168A90372010000007205000000DA026161DA026A6A7BDA0278785B020000007201000000DA01353030DA0170E7E10B930189E4414130"
myb = bytes.fromhex(hex_string)
f = io.BytesIO(myb)
print(f)
data = marshal.load(f)
print(data)
print('We have segfault but we cannot see!')
-------------------
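As a side note on "we cannot see": a host that runs snippets in a child process can observe an abnormal exit through the child's return code. The sketch below assumes nothing about how any particular online service works. The helper `run_isolated` is hypothetical, and `os.abort()` stands in for a real crash.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 10.0) -> int:
    """Run a snippet in a fresh interpreter and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        timeout=timeout,
    )
    # On POSIX, a negative value means the child was killed by a signal.
    return proc.returncode


ok = run_isolated("print('fine')")
crashed = run_isolated("import os; os.abort()")
print(ok, crashed)
```

This only detects the crash. It is not a security boundary.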
msg373558 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-07-12 14:52
This bug tracker is not the right place to report issues with third-party web services. I don't see anything wrong with Python according to the Python Threat Model:
https://python-security.readthedocs.io/security.html#python-security-model

That's why pickle starts with a big warning about the lack of security.
msg373564 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-07-12 16:35
Linux containers like Docker are not a security boundary. They are merely a mechanism to package, deliver, and run software. Dan Walsh coined the phrase "Containers Don't Contain" a while ago. It's possible to tighten the security of containers. This starts at "Don't execute arbitrary and potentially malicious code".
msg373567 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-12 19:49
It depends. pickle is not vulnerable to the kind of error reported in this issue. If you find some way to crash Python specific to pickle it will likely be fixed if it is possible without significant performance or memory cost. If it depends on arbitrary code execution, it is not a pickle issue.
msg373569 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-12 20:15
@serhiy.storchaka you name it, you have it. The following code generates a segfault on the Pickle module [it's a crafted datetime object] (Python 3.10.0a0 (heads/master:b40e434, Jul  4 2020), Python 3.6.11 and Python 3.7.2):

import io
import pickle


hex_string = "8004952A000000000000008C086461746574696D65948C086461746574696D65949388430A07B2010100000000000092059452942E"
myb = bytes.fromhex(hex_string)
f = io.BytesIO(myb)
print(f)
data = pickle.load(f)
print(data)
print('We have segfault but we cannot see!')
msg373572 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-12 21:30
Thank you. Indeed, it is a pickle specific crash. Please open a new issue and I'll provide a fix.
msg373574 - (view) Author: Iman Sharafaldin (Iman Sharafaldin) Date: 2020-07-12 21:36
@serhiy.storchaka Thank you. Please find it here https://bugs.python.org/issue41288 .
History
Date User Action Args
2020-08-03 17:29:14 vstinner set superseder: Pickle crashes unpickling invalid NEWOBJ_EX opcode; resolution: not a bug -> duplicate
2020-07-12 21:36:59 Iman Sharafaldin set messages: + msg373574
2020-07-12 21:30:13 serhiy.storchaka set messages: + msg373572
2020-07-12 20:15:31 Iman Sharafaldin set messages: + msg373569
2020-07-12 19:49:13 serhiy.storchaka set messages: + msg373567
2020-07-12 16:35:45 christian.heimes set messages: + msg373564
2020-07-12 14:52:53 vstinner set messages: + msg373558
2020-07-12 14:45:23 Iman Sharafaldin set messages: + msg373557
2020-07-12 14:38:29 vstinner set messages: + msg373556
2020-07-11 14:07:13 Iman Sharafaldin set messages: + msg373525
2020-07-06 16:35:35 Iman Sharafaldin set messages: + msg373139
2020-07-06 15:26:28 christian.heimes set messages: + msg373132
2020-07-06 15:04:05 Iman Sharafaldin set messages: + msg373129
2020-07-06 14:47:34 serhiy.storchaka set status: open -> closed; resolution: not a bug; messages: + msg373126; stage: resolved
2020-07-06 14:36:17 christian.heimes set messages: + msg373124
2020-07-06 14:35:30 serhiy.storchaka set messages: + msg373122
2020-07-06 14:09:10 Iman Sharafaldin set messages: + msg373119
2020-07-06 13:58:32 christian.heimes set nosy: + christian.heimes; messages: + msg373117
2020-07-06 12:06:02 vstinner set messages: + msg373108
2020-07-06 12:02:56 Iman Sharafaldin set messages: + msg373105
2020-07-06 12:02:15 Iman Sharafaldin set messages: + msg373103
2020-07-06 11:59:21 vstinner set messages: + msg373102
2020-07-06 07:20:38 serhiy.storchaka set messages: + msg373072
2020-07-05 18:41:31 pitrou set nosy: + vstinner, serhiy.storchaka
2020-07-04 11:56:29 Iman Sharafaldin create