classification
Title: memory exhaustion in Modules/_pickle.c:1393
Type: security Stage: patch review
Components: FreeBSD Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: alexandre.vassalotti, serhiy.storchaka, shuoz, xtreak
Priority: normal Keywords: patch

Created on 2018-09-13 04:38 by shuoz, last changed 2018-09-15 11:02 by serhiy.storchaka.

Files
File name Uploaded Description Edit
poc shuoz, 2018-09-13 04:38
pk.py shuoz, 2018-09-13 04:53
Pull Requests
URL Status Linked Edit
PR 9261 open benjamin.peterson, 2018-09-14 00:55
Messages (3)
msg325230 - Author: shuoz (shuoz) Date: 2018-09-13 04:38
python version:
   Python 3.8.0a0 (heads/master:4ae8ece, Sep 13 2018, 09:48:16) 
   [GCC 5.4.0 20160609] on linux


I found a bug in Python's pickle.load function. It can cause memory exhaustion, leading to a denial of service (DoS).

./python pk.py poc


cat ./pk.py
import pickle
import sys
filename = sys.argv[1]
with open(filename, 'rb') as f:
    aa = pickle.load(f)
    print(aa)
msg325231 - Author: shuoz (shuoz) Date: 2018-09-13 04:55
[----------------------------------registers-----------------------------------]
RAX: 0x7ff9d401e010 --> 0x0 
RBX: 0x7ffff7f48d00 --> 0x1 
RCX: 0x7ff8ab58c800 --> 0x7ffff7ea5d80 --> 0x2 
RDX: 0x7ffff3ac47d8 --> 0x1 
RSI: 0x25152303 
RDI: 0xfff3a803c00 --> 0x0 
RBP: 0x7473078c 
RSP: 0x7fffffffcf20 --> 0x7ffff3ac47d8 --> 0x1 
RIP: 0x7ffff28a8a64 (<_Unpickler_MemoPut+1668>:	add    r11,0x20)
R8 : 0xfff3a803bff --> 0x0 
R9 : 0xfff3a803c01 --> 0x0 
R10: 0xffffefe91a3 --> 0x0 
R11: 0x128a917f8 --> 0x0 
R12: 0xfff156b1922 --> 0x0 
R13: 0xe8e60f18 --> 0x0 
R14: 0x7ffff7f48d18 --> 0x7ff8ab58c800 --> 0x7ffff7ea5d80 --> 0x2 
R15: 0xfff3a803c02 --> 0x0
EFLAGS: 0x216 (carry PARITY ADJUST zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7ffff28a8a52 <_Unpickler_MemoPut+1650>:	cmp    BYTE PTR [r15+0x7fff8000],0x0
   0x7ffff28a8a5a <_Unpickler_MemoPut+1658>:	jne    0x7ffff28a8ae1 <_Unpickler_MemoPut+1793>
   0x7ffff28a8a60 <_Unpickler_MemoPut+1664>:	add    rsi,0x4
=> 0x7ffff28a8a64 <_Unpickler_MemoPut+1668>:	add    r11,0x20
   0x7ffff28a8a68 <_Unpickler_MemoPut+1672>:	cmp    BYTE PTR [r10+0x7fff8000],0x0
   0x7ffff28a8a70 <_Unpickler_MemoPut+1680>:	mov    QWORD PTR [rax],0x0
   0x7ffff28a8a77 <_Unpickler_MemoPut+1687>:	je     0x7ffff28a896d <_Unpickler_MemoPut+1421>
   0x7ffff28a8a7d <_Unpickler_MemoPut+1693>:	nop    DWORD PTR [rax]
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffcf20 --> 0x7ffff3ac47d8 --> 0x1 
0008| 0x7fffffffcf28 --> 0xffffefe91a3 --> 0x0 
0016| 0x7fffffffcf30 --> 0x7ffff7f48da8 --> 0x20 (' ')
0024| 0x7fffffffcf38 --> 0x7ffff7f48d00 --> 0x1 
0032| 0x7fffffffcf40 --> 0xffffffffa00 --> 0x0 
0040| 0x7fffffffcf48 --> 0x0 
0048| 0x7fffffffcf50 --> 0x7ffff7f48da0 --> 0x28 ('(')
0056| 0x7fffffffcf58 --> 0x7ffff7f48da8 --> 0x20 (' ')
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x00007ffff28a8a64	1392	    for (i = self->memo_size; i < new_size; i++)
gdb-peda$ p new_size
$5 = 0xe8e60f18
gdb-peda$ p self->memo_size
$6 = 0x20
gdb-peda$ p i


.....
for (i = self->memo_size; i < new_size; i++)
        self->memo[i] = NULL;
.....
msg325430 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-15 11:02
>>> import pickletools
>>> pickletools.dis(b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x03age\x94K\x17\x8c\x03jobr\x8c\x07student\x94u.')
    0: \x80 PROTO      4
    2: \x95 FRAME      29
   11: }    EMPTY_DICT
   12: \x94 MEMOIZE    (as 0)
   13: (    MARK
   14: \x8c     SHORT_BINUNICODE 'age'
   19: \x94     MEMOIZE    (as 1)
   20: K        BININT1    23
   22: \x8c     SHORT_BINUNICODE 'job'
   27: r        LONG_BINPUT 1953695628
   32: u        SETITEMS   (MARK at 13)
   33: d    DICT       no MARK exists on stack
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/pickletools.py", line 2457, in dis
    raise ValueError(errormsg)
ValueError: no MARK exists on stack

Ignore the error about the unbalanced MARK. The problem opcode is LONG_BINPUT with the excessively large argument 1953695628. The C implementation of pickle tries to resize the memo list to twice this index, and here an integer overflow occurred.
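As a rough check of these numbers, the sketch below decodes the LONG_BINPUT argument from the same bytes and doubles it, which reproduces the new_size value printed in the gdb session (0xe8e60f18). The 8-byte pointer size is an assumption (a 64-bit build), and the variable names are illustrative rather than taken from _pickle.c.

```python
import struct

# Pickle bytes copied from the pickletools.dis() call in this message.
payload = (b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x03age'
           b'\x94K\x17\x8c\x03jobr\x8c\x07student\x94u.')

offset = 27  # position of the LONG_BINPUT opcode reported by pickletools.dis
assert payload[offset:offset + 1] == b'r'

# LONG_BINPUT takes a 4-byte little-endian unsigned index.
(index,) = struct.unpack('<I', payload[offset + 1:offset + 5])

new_size = index * 2      # resize target: twice the index
POINTER_SIZE = 8          # assumption: 64-bit build, 8 bytes per memo slot
requested_bytes = new_size * POINTER_SIZE

print(index)              # 1953695628
print(hex(new_size))      # 0xe8e60f18
print(requested_bytes)    # 31259130048
```

Under that assumption the memo array alone would need roughly 29 GiB, and the loop reported at line 1392 would then write NULL into every new slot.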

This is unlikely to occur in the real world. A pickle needs to have more than 2**30-1 ≈ 10**9 memoized items to encounter this bug, which means its size on disk and in memory would be tens or hundreds of gigabytes. Pickle is not the best format for serializing that much data.
History
Date User Action Args
2018-09-15 11:02:11 serhiy.storchaka set messages: + msg325430
2018-09-14 04:04:40 koobs set nosy: - koobs
2018-09-14 00:55:24 benjamin.peterson set keywords: + patch
stage: patch review
pull_requests: + pull_request8718
2018-09-13 09:47:19 xtreak set nosy: + xtreak
2018-09-13 05:46:03 serhiy.storchaka set nosy: + alexandre.vassalotti, serhiy.storchaka
2018-09-13 04:55:08 shuoz set messages: + msg325231
2018-09-13 04:53:32 shuoz set files: + pk.py
2018-09-13 04:38:47 shuoz create