Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[security] DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format #86269

Closed
wessen mannequin opened this issue Oct 21, 2020 · 12 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes performance Performance or resource usage release-blocker stdlib Python modules in the Lib dir

Comments

@wessen
Copy link
Mannequin

wessen mannequin commented Oct 21, 2020

BPO 42103
Nosy @ronaldoussoren, @ned-deily, @ambv, @serhiy-storchaka, @miss-islington
PRs
  • bpo-42103: Improve validation of Plist files. #22882
  • [3.9] bpo-42103: Improve validation of Plist files. (GH-22882) #23115
  • [3.8] bpo-42103: Improve validation of Plist files. (GH-22882) #23116
  • [3.7] bpo-42103: Improve validation of Plist files. (GH-22882) #23117
  • [3.6] bpo-42103: Improve validation of Plist files. (GH-22882) #23118
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-11-10.20:17:26.610>
    created_at = <Date 2020-10-21.00:25:11.057>
    labels = ['3.8', '3.9', '3.10', 'performance', '3.7', 'library', 'release-blocker']
    title = '[security] DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format'
    updated_at = <Date 2020-11-10.20:17:26.609>
    user = 'https://bugs.python.org/wessen'

    bugs.python.org fields:

    activity = <Date 2020-11-10.20:17:26.609>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-11-10.20:17:26.610>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2020-10-21.00:25:11.057>
    creator = 'wessen'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42103
    keywords = ['patch', 'security_issue']
    message_count = 12.0
    messages = ['379175', '379238', '379243', '379255', '379283', '379285', '379286', '380250', '380252', '380265', '380703', '380704']
    nosy_count = 6.0
    nosy_names = ['ronaldoussoren', 'ned.deily', 'lukasz.langa', 'serhiy.storchaka', 'miss-islington', 'wessen']
    pr_nums = ['22882', '23115', '23116', '23117', '23118']
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue42103'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10']

    @wessen
    Copy link
    Mannequin Author

    wessen mannequin commented Oct 21, 2020

    In versions of Python from 3.4-3.10, the Python core plistlib library supports Apple's binary plist format. When given malformed input, the implementation can be forced to create an argument to struct.unpack() which consumes all available CPU and memory until a MemError is thrown as it builds the 'format' argument to unpack().

    This can be seen with the following malformed example binary plist input:

    $ xxd binary_plist_dos.bplist
    00000000: 6270 6c69 7374 3030 d101 0255 614c 6973  bplist00...UaLis
    00000010: 74a5 0304 0506 0000 00df 4251 4351 44a3  t.........BQCQD.
    00000020: 0809 0a10 0110 0210 0308 0b11 1719 1b1d  ................
    00000030: 0000 0101 0000 0000 0000 000b 0000 0000  ................
    00000040: 0000 0000 0000 0000 0000 0029            ...........)
    
    

    The error is reached in the following lines of plistlib.py:
    (

    return struct.unpack('>' + _BINARY_FORMAT[size] * n, data)
    )

        def _read_ints(self, n, size):
            data = self._fp.read(size * n)
            if size in _BINARY_FORMAT:
                return struct.unpack('>' + _BINARY_FORMAT[size] * n, data)
    

    When the malicious example above is opened by plistlib, it results in 'n' being controlled by the input and it can be forced to be very large. Plistlib attempts to build a string which is as long as 'n', consuming excessive resources.

    Apple's built in utilities for handling plist files detects this same file as malformed and will not process it.

    @wessen wessen mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes performance Performance or resource usage labels Oct 21, 2020
    @ned-deily ned-deily added stdlib Python modules in the Lib dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 21, 2020
    @ned-deily ned-deily changed the title DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format [security] DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format Oct 21, 2020
    @ned-deily ned-deily added stdlib Python modules in the Lib dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 21, 2020
    @ned-deily ned-deily changed the title DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format [security] DoS (MemError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format Oct 21, 2020
    @ronaldoussoren
    Copy link
    Contributor

    Thanks for the report. I can reproduce the issue.

    @ronaldoussoren
    Copy link
    Contributor

    One Apple implementation of binary plist parsing is here: https://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c.

    That seems to work from a buffer (or mmap) of the entire file, making consistency checks somewhat easier, and I don't think they have a hardcoded limit on the number of items in the plist (but: it's getting late and might have missed that).

    An easy fix for this issue is to limit the amount of values read by _read_ints, but that immediately begs the question what that limit should be. It is clear that 1B items in the array is too much (in this case 1B object refs for dictionary keys).

    I don't know what a sane limit would be. The largest plist file on my system is 10MB. Limiting *n* to 20M would be easily enough to parse that file, and doesn't consume too much memory (about 160MB for the read buffer at most).

    Another thing the current code doesn't do is check that the "ref" argument to _read_object is in range. That's less problematic because the code will raise an exception regardless (IndexErrox, instead of the nicer InvalidFileException).

    @serhiy-storchaka
    Copy link
    Member

    There are two issues here.

    The simple one is building a large format string for struct.unpack(). It has simple solution: use f'>{n}{_BINARY_FORMAT[size]}'.

    The hard issue is that read(n) allocates n bytes in memory even if there are not so many bytes in the file. It affects not only plistlib and should be fixed in the file implementation itself. There is an open issue for this.

    @ronaldoussoren
    Copy link
    Contributor

    Serhiy, thanks. Just the change in the format string would fix this particular example.

    I see you're working on a PR with better validation. The current state of the draft looks good to me, but I haven't checked yet if there are other potential problems that can be added to the list of invalid binary plists.

    @serhiy-storchaka
    Copy link
    Member

    PR 22882 fixes problem in _read_ints(), adds validation for string size, and adds many tests for mailformed binary Plists.

    There may be problems with recursive collections. I'll try to solve them too.

    @serhiy-storchaka
    Copy link
    Member

    No, with recursive collections all is good.

    Then I'll just add a NEWS entry and maybe few more tests.

    @serhiy-storchaka
    Copy link
    Member

    New changeset 34637a0 by Serhiy Storchaka in branch 'master':
    bpo-42103: Improve validation of Plist files. (GH-22882)
    34637a0

    @miss-islington
    Copy link
    Contributor

    New changeset e277cb7 by Miss Skeleton (bot) in branch '3.9':
    bpo-42103: Improve validation of Plist files. (GH-22882)
    e277cb7

    @serhiy-storchaka
    Copy link
    Member

    New changeset 547d2bc by Serhiy Storchaka in branch '3.8':
    [3.8] bpo-42103: Improve validation of Plist files. (GH-22882) (GH-23116)
    547d2bc

    @ned-deily
    Copy link
    Member

    New changeset 225e365 by Serhiy Storchaka in branch '3.7':
    [3.7] bpo-42103: Improve validation of Plist files. (GH-22882) (bpo-23117)
    225e365

    @ned-deily
    Copy link
    Member

    New changeset a63234c by Serhiy Storchaka in branch '3.6':
    [3.6] bpo-42103: Improve validation of Plist files. (GH-22882) (GH-23118)
    a63234c

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes performance Performance or resource usage release-blocker stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants