Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CVE-2018-20406] memory exhaustion in Modules/_pickle.c:1393 #78837

Closed
httpsgithubcomxcainiao mannequin opened this issue Sep 13, 2018 · 19 comments
Closed

[CVE-2018-20406] memory exhaustion in Modules/_pickle.c:1393 #78837

httpsgithubcomxcainiao mannequin opened this issue Sep 13, 2018 · 19 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-security A security issue

Comments

@httpsgithubcomxcainiao
Copy link
Mannequin

httpsgithubcomxcainiao mannequin commented Sep 13, 2018

BPO 34656
Nosy @vstinner, @larryhastings, @ned-deily
PRs
  • bpo-34656: Avoid relying on signed overflow in _pickle memos. #9261
  • [3.7] closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261) #9465
  • [3.6] closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261) #9466
  • [3.5] bpo-34656: Avoid relying on signed overflow in _pickle memos (GH-9261) #11869
  • [3.4] bpo-34656: Avoid relying on signed overflow in _pickle memos (GH-9261) #11870
  • Files
  • poc
  • pk.py
  • CVE-2018-20406-pickle_LONG_BINPUT.patch: Patch updated for Python 3.4.*
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-02-26.00:41:36.072>
    created_at = <Date 2018-09-13.04:38:46.962>
    labels = ['type-security', '3.8', '3.7', 'library']
    title = '[CVE-2018-20406] memory exhaustion in Modules/_pickle.c:1393'
    updated_at = <Date 2019-05-10.18:14:59.140>
    user = 'https://github.com/httpsgithubcomxcainiao'

    bugs.python.org fields:

    activity = <Date 2019-05-10.18:14:59.140>
    actor = 'ned.deily'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-02-26.00:41:36.072>
    closer = 'larry'
    components = ['Library (Lib)']
    creation = <Date 2018-09-13.04:38:46.962>
    creator = 'shuoz'
    dependencies = []
    files = ['47801', '47802', '48073']
    hgrepos = []
    issue_num = 34656
    keywords = ['patch']
    message_count = 18.0
    messages = ['325230', '325231', '325430', '325937', '325938', '325939', '333292', '333294', '334108', '334179', '334183', '334208', '334209', '334267', '336381', '336508', '336568', '336589']
    nosy_count = 4.0
    nosy_names = ['vstinner', 'larry', 'ned.deily', 'dfmz77669']
    pr_nums = ['9261', '9465', '9466', '11869', '11870']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue34656'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6', 'Python 3.7', 'Python 3.8']

    @httpsgithubcomxcainiao
    Copy link
    Mannequin Author

    httpsgithubcomxcainiao mannequin commented Sep 13, 2018

    python version:
    Python 3.8.0a0 (heads/master:4ae8ece, Sep 13 2018, 09:48:16)
    [GCC 5.4.0 20160609] on linux

    I found a bug in python pickle.load func. Can cause memory exhaustion DDOS.

    ./python pk.py poc

    cat ./pk.py
    import pickle
    import sys
    filename = sys.argv[1]
    with open(filename, 'rb') as f:
    aa = pickle.load(f)
    print(aa)

    @httpsgithubcomxcainiao httpsgithubcomxcainiao mannequin added 3.8 only security fixes type-security A security issue labels Sep 13, 2018
    @httpsgithubcomxcainiao
    Copy link
    Mannequin Author

    httpsgithubcomxcainiao mannequin commented Sep 13, 2018

    [----------------------------------registers-----------------------------------]
    RAX: 0x7ff9d401e010 --> 0x0
    RBX: 0x7ffff7f48d00 --> 0x1
    RCX: 0x7ff8ab58c800 --> 0x7ffff7ea5d80 --> 0x2
    RDX: 0x7ffff3ac47d8 --> 0x1
    RSI: 0x25152303
    RDI: 0xfff3a803c00 --> 0x0
    RBP: 0x7473078c
    RSP: 0x7fffffffcf20 --> 0x7ffff3ac47d8 --> 0x1
    RIP: 0x7ffff28a8a64 (<_Unpickler_MemoPut+1668>: add r11,0x20)
    R8 : 0xfff3a803bff --> 0x0
    R9 : 0xfff3a803c01 --> 0x0
    R10: 0xffffefe91a3 --> 0x0
    R11: 0x128a917f8 --> 0x0
    R12: 0xfff156b1922 --> 0x0
    R13: 0xe8e60f18 --> 0x0
    R14: 0x7ffff7f48d18 --> 0x7ff8ab58c800 --> 0x7ffff7ea5d80 --> 0x2
    R15: 0xfff3a803c02 --> 0x0
    EFLAGS: 0x216 (carry PARITY ADJUST zero sign trap INTERRUPT direction overflow)
    [-------------------------------------code-------------------------------------]
    0x7ffff28a8a52 <_Unpickler_MemoPut+1650>: cmp BYTE PTR [r15+0x7fff8000],0x0
    0x7ffff28a8a5a <_Unpickler_MemoPut+1658>: jne 0x7ffff28a8ae1 <_Unpickler_MemoPut+1793>
    0x7ffff28a8a60 <_Unpickler_MemoPut+1664>: add rsi,0x4
    => 0x7ffff28a8a64 <_Unpickler_MemoPut+1668>: add r11,0x20
    0x7ffff28a8a68 <_Unpickler_MemoPut+1672>: cmp BYTE PTR [r10+0x7fff8000],0x0
    0x7ffff28a8a70 <_Unpickler_MemoPut+1680>: mov QWORD PTR [rax],0x0
    0x7ffff28a8a77 <_Unpickler_MemoPut+1687>: je 0x7ffff28a896d <_Unpickler_MemoPut+1421>
    0x7ffff28a8a7d <_Unpickler_MemoPut+1693>: nop DWORD PTR [rax]
    [------------------------------------stack-------------------------------------]
    0000| 0x7fffffffcf20 --> 0x7ffff3ac47d8 --> 0x1
    0008| 0x7fffffffcf28 --> 0xffffefe91a3 --> 0x0
    0016| 0x7fffffffcf30 --> 0x7ffff7f48da8 --> 0x20 (' ')
    0024| 0x7fffffffcf38 --> 0x7ffff7f48d00 --> 0x1
    0032| 0x7fffffffcf40 --> 0xffffffffa00 --> 0x0
    0040| 0x7fffffffcf48 --> 0x0
    0048| 0x7fffffffcf50 --> 0x7ffff7f48da0 --> 0x28 ('(')
    0056| 0x7fffffffcf58 --> 0x7ffff7f48da8 --> 0x20 (' ')
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    0x00007ffff28a8a64 1392 for (i = self->memo_size; i < new_size; i++)
    gdb-peda$ p new_size
    $5 = 0xe8e60f18
    gdb-peda$ p self->memo_size
    $6 = 0x20
    gdb-peda$ p i

    .....
    for (i = self->memo_size; i < new_size; i++)
    self->memo[i] = NULL;
    .....

    @serhiy-storchaka
    Copy link
    Member

    >>> import pickletools
    >>> pickletools.dis(b'\x80\x04\x95\x1d\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x03age\x94K\x17\x8c\x03jobr\x8c\x07student\x94u.')
        0: \x80 PROTO      4
        2: \x95 FRAME      29
       11: }    EMPTY_DICT
       12: \x94 MEMOIZE    (as 0)
       13: (    MARK
       14: \x8c     SHORT_BINUNICODE 'age'
       19: \x94     MEMOIZE    (as 1)
       20: K        BININT1    23
       22: \x8c     SHORT_BINUNICODE 'job'
       27: r        LONG_BINPUT 1953695628
       32: u        SETITEMS   (MARK at 13)
       33: d    DICT       no MARK exists on stack
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/pickletools.py", line 2457, in dis
        raise ValueError(errormsg)
    ValueError: no MARK exists on stack

    Ignore the error of unbalanced MARK. The problem code is LONG_BINPUT with the excessive large argument 1953695628. The C implementation of pickle tries to resize the the memo list to the size twice larger than this index. And here an integer overflow occurred.

    This unlikely occurred in real world. The pickle needs to have more than 2**30-1 ≈ 10**9 memoized items for encountering this bug. It means that its size on disk and in memory should be tens or hundreds of gigabytes. Pickle is not the best format for serializing such amount of data.

    @benjaminp
    Copy link
    Contributor

    New changeset a4ae828 by Benjamin Peterson in branch 'master':
    closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261)
    a4ae828

    @miss-islington
    Copy link
    Contributor

    New changeset ef4306b by Miss Islington (bot) in branch '3.7':
    closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261)
    ef4306b

    @miss-islington
    Copy link
    Contributor

    New changeset 71a9c65 by Miss Islington (bot) in branch '3.6':
    closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261)
    71a9c65

    @hroncok
    Copy link
    Mannequin

    hroncok mannequin commented Jan 9, 2019

    Should this go to 3.4 and 3.5 as well, since it is a security thing?

    http://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20406.html

    @serhiy-storchaka
    Copy link
    Member

    I am not sure this issue should be classified as a security issue. It can cause DDOS, because pickle should not be used with untrusted data. If it is used, the program has more severe security issues than just DDOS.

    The crash could be triggered by accident, but this is very unlikely. I doubts that this happened even once in real world. Libraries used for handling a large amount of data (like NumPy) use more efficient pickle representation, and can provide even more efficient alternate serialization methods. Note that integers and floats are not memoized, this increases the complexity and size of data that could be affected by this bug.

    But I think that this fix needs a news entry. Do you mind to add it Benjamin?

    @mcepl
    Copy link
    Mannequin

    mcepl mannequin commented Jan 20, 2019

    Does it even make sense to make a security patch for 2.7 for this one?

    @vstinner vstinner changed the title memory exhaustion in Modules/_pickle.c:1393 [CVE-2018-20406] memory exhaustion in Modules/_pickle.c:1393 Jan 21, 2019
    @vstinner
    Copy link
    Member

    Python 2.7 is not affected:

    • Python 2.7 has no C accelerator _pickle (Modules/_pickle.c)
    • Python 2.7 doesn't support protocol 4 (attached proof of concept)

    I reopen the issue because the issue should be fixed in 3.4 and 3.5 as well, since it has been marked as a vulnerability (it got a CVE number).

    @vstinner vstinner added the 3.7 (EOL) end of life label Jan 21, 2019
    @vstinner vstinner reopened this Jan 21, 2019
    @mcepl
    Copy link
    Mannequin

    mcepl mannequin commented Jan 21, 2019

    And Modules/cPickle.c is that drastically different?

    @vstinner
    Copy link
    Member

    And Modules/cPickle.c is that drastically different?

    Stupid me. I was surprised that Python 2.7 had no C accelerator. I was looking for Modules/pickle.c on my case sensitive Linux filesystem...

    @vstinner
    Copy link
    Member

    New changeset a4ae828 by Benjamin Peterson in branch 'master':
    closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261)
    a4ae828

    It seems like this patch changes the implementation of the internal "memo" object which is a custom C type in Python 3.

    In Python 2 cPickle, the memo is a regular dictionary and so I'm not sure that Python 2 is affected by this vulnerability.

    Can someone please confirm?

    @mcepl
    Copy link
    Mannequin

    mcepl mannequin commented Jan 23, 2019

    Python 3.4 doesn't allow C99 constructs, so I had to update the patch to reorder iterator declarations. Just if any future colleague Python Linux distro maintainer needs it.

    @dfmz77669
    Copy link
    Mannequin

    dfmz77669 mannequin commented Feb 23, 2019

    In python2, Picklertype donot has tp init which has bug in python3 Pickler_Type.
    I think it not effect python2.
    Can you arch more infor?
    thanks

    @dfmz77669 dfmz77669 mannequin added topic-ctypes and removed 3.7 (EOL) end of life 3.8 only security fixes labels Feb 23, 2019
    @vstinner
    Copy link
    Member

    As I wrote in my previous comment, I don't think that Python 2.7 is affected by this issue.

    @vstinner vstinner added stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 only security fixes and removed topic-ctypes labels Feb 25, 2019
    @larryhastings
    Copy link
    Contributor

    New changeset 4b42d57 by larryhastings (Victor Stinner) in branch '3.4':
    [3.4] bpo-34656: Avoid relying on signed overflow in _pickle memos (GH-9261) (bpo-11870)
    4b42d57

    @larryhastings
    Copy link
    Contributor

    New changeset ef33dd6 by larryhastings (Victor Stinner) in branch '3.5':
    closes bpo-34656: Avoid relying on signed overflow in _pickle memos. (GH-9261) (bpo-11869)
    ef33dd6

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @Legoclones
    Copy link
    Contributor

    Legoclones commented Feb 7, 2024

    It's not 100% clear to me but based off this thread, it seems that this issue has been resolved. However, I'm still seeing similar behavior where using _pickle or pickle modules will kill the entire Python process if the index for LONG_BINPUT is excessively large.

    For example, pickle.loads(b'U\x04testr\xff\xff\xff\x1f.') - this sets the string 'test' at memo[0x1fffffff], and will take a noticeable amount of time to complete. pickle.loads(b'U\x04testr\xff\xff\xff\xaf.') sets the string 'test' at memo[0xafffffff], and should hang until the process is killed.

    What's interesting is sometimes it throws a MemoryError, but most of the time it just kills the process after waiting a while. I tested this on Python3.11 and Python3.12.1.

    image

    Here's an example of this inconsistent behavior that I'm talking about.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants