[security] DoS (MemoryError via CPU and RAM exhaustion) when processing malformed Apple Property List files in binary format #86269
Comments
In Python versions 3.4 through 3.10, the core plistlib library supports Apple's binary plist format. When given malformed input, the implementation can be forced to build a 'format' argument to struct.unpack() that consumes all available CPU and memory until a MemoryError is raised. This can be seen with the following malformed example binary plist input:
The error is reached in the following lines of plistlib.py: Line 485 in e9959c7
When the malicious example above is opened by plistlib, 'n' is controlled by the input and can be forced to be very large. plistlib then attempts to build a string as long as 'n', consuming excessive resources. Apple's built-in utilities for handling plist files detect this same file as malformed and will not process it.
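The failure mode can be sketched as follows. This is a simplified, hypothetical reconstruction of the pre-fix _read_ints pattern, not the exact CPython source; the _BINARY_FORMAT table mirrors plistlib's mapping of int widths to struct codes:

```python
import io
import struct

# Struct codes for 1-, 2-, 4- and 8-byte big-endian unsigned ints,
# mirroring plistlib's _BINARY_FORMAT table.
_BINARY_FORMAT = {1: 'B', 2: 'H', 4: 'L', 8: 'Q'}

def read_ints_unpatched(fp, size, n):
    # Pre-fix pattern: the format string is built by string repetition,
    # so its length is proportional to n. With an attacker-controlled n
    # (e.g. 10**9 read straight from the file), building this string
    # alone exhausts CPU and RAM before unpack() is even called.
    data = fp.read(size * n)
    return struct.unpack('>' + _BINARY_FORMAT[size] * n, data)

# Harmless demonstration with a small n:
buf = io.BytesIO(b'\x00\x01\x00\x02\x00\x03')
print(read_ints_unpatched(buf, 2, 3))  # (1, 2, 3)
```

With well-formed input the function behaves normally; the problem is purely that n is trusted before any allocation happens.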
Thanks for the report. I can reproduce the issue.
One Apple implementation of binary plist parsing is here: https://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c. That one works from a buffer (or mmap) of the entire file, which makes consistency checks somewhat easier, and I don't think it has a hardcoded limit on the number of items in the plist (but: it's getting late and I might have missed that).

An easy fix for this issue is to limit the number of values read by _read_ints, but that immediately raises the question of what that limit should be. It is clear that 1B items in the array is too much (in this case, 1B object refs for dictionary keys), but I don't know what a sane limit would be. The largest plist file on my system is 10MB. Limiting *n* to 20M would easily be enough to parse that file, and doesn't consume too much memory (about 160MB for the read buffer at most).

Another thing the current code doesn't do is check that the "ref" argument to _read_object is in range. That's less problematic because the code will raise an exception regardless (IndexError, instead of the nicer InvalidFileException).
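One way to implement a bounds check along these lines can be sketched as follows. This is a hypothetical helper, not the eventual patch; InvalidFileException is stubbed locally to keep the sketch self-contained:

```python
import io
import struct

_BINARY_FORMAT = {1: 'B', 2: 'H', 4: 'L', 8: 'Q'}

class InvalidFileException(ValueError):
    """Stand-in for plistlib.InvalidFileException in this sketch."""

def read_ints_checked(fp, size, n):
    data = fp.read(size * n)
    # If the file holds fewer than size*n bytes, the claimed count is
    # bogus: fail with the nicer exception instead of struct.error.
    # (Note: read() itself may still over-allocate for huge n; that is
    # the separate file-layer issue mentioned below.)
    if len(data) != size * n:
        raise InvalidFileException("truncated int array in binary plist")
    # Repeat-count form keeps the format string tiny regardless of n.
    return struct.unpack(f'>{n}{_BINARY_FORMAT[size]}', data)

# A claimed count of 4 two-byte ints against only 2 bytes of data:
try:
    read_ints_checked(io.BytesIO(b'\x00\x01'), 2, 4)
except InvalidFileException as exc:
    print(exc)
```

Validating the count against the bytes actually present avoids picking an arbitrary magic limit like 20M.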
There are two issues here. The simple one is building a large format string for struct.unpack(); it has a simple solution: use f'>{n}{_BINARY_FORMAT[size]}'. The harder issue is that read(n) allocates n bytes of memory even if the file does not contain that many bytes. That affects more than just plistlib and should be fixed in the file implementation itself; there is an open issue for this.
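The two format-string spellings are equivalent as far as struct is concerned, but only the repeat-count form stays small as n grows; a quick check:

```python
import struct

n = 3
data = struct.pack('>HHH', 1, 2, 3)

# Old spelling: one code per item, so string length grows linearly with n.
fmt_old = '>' + 'H' * n
# Fixed spelling: a repeat count, so length grows only with len(str(n)).
fmt_new = f'>{n}H'

assert struct.unpack(fmt_old, data) == struct.unpack(fmt_new, data) == (1, 2, 3)
print(fmt_old, fmt_new)  # >HHH >3H
```

For n = 10**9 the old spelling is a gigabyte-scale string, while the new one is just '>1000000000H'.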
Serhiy, thanks. Just the change in the format string would fix this particular example. I see you're working on a PR with better validation. The current state of the draft looks good to me, but I haven't yet checked whether there are other potential problems that could be added to the list of invalid binary plists.
PR 22882 fixes the problem in _read_ints(), adds validation for string sizes, and adds many tests for malformed binary plists. There may be problems with recursive collections; I'll try to solve those too.
No, with recursive collections all is good. Then I'll just add a NEWS entry and maybe a few more tests.
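The behavior those tests lock in can be illustrated with a small smoke test, assuming an interpreter that carries the fix: well-formed binary plists still round-trip, while malformed ones are rejected with plistlib.InvalidFileException (a ValueError subclass) instead of exhausting memory:

```python
import plistlib

# Well-formed binary plists still round-trip.
blob = plistlib.dumps({'key': [1, 2, 3]}, fmt=plistlib.FMT_BINARY)
assert plistlib.loads(blob) == {'key': [1, 2, 3]}

# A "bplist00" header with a truncated body (too short to even hold
# the 32-byte trailer) is rejected cleanly.
try:
    plistlib.loads(b'bplist00' + b'\x00' * 8)
    print("not rejected")
except plistlib.InvalidFileException:
    print("rejected as malformed")
```
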