New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block #75807
Comments
I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash. They crash on importing hashlib. Actually on compiling hashlib docstring. Minimal reproducer is: $ ./python -c 'import codecs; codecs.decode(r"\\"*15 + "x"*1906, "unicode-escape")'
assertion "maxchar < 128" failed: file "Objects/unicodeobject.c", line 474, function "_PyUnicode_CheckConsistency"
Abort trap (core dumped) _PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency. |
"_PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency." Cool. My code works as expected to catch bugs :-) |
There several bugs in the memory allocator. Incorrectly detected the case when realloc() resizes a memory block in-place. Wrong address is used for filling the extra memory with DEADBYTE. - if (q == oldq && nbytes < original_nbytes) {
+ if (q == oldq - 2*SST && nbytes < original_nbytes) {
/* shrinking: mark old extra memory dead */
- memset(q + nbytes, DEADBYTE, original_nbytes - nbytes);
+ memset(q + 2*SST + nbytes, DEADBYTE, original_nbytes - nbytes);
} But fixing this exposes other problem. _PyMem_DebugRawRealloc() is called recursively. _PyMem_DebugRawRealloc calls api->alloc.realloc which is _PyMem_DebugRawRealloc. There are two nested debug allocators. The block is nested in other block, both have their own header and footer. |header1|header2|------------------------------|footer2|footer1| _PyMem_DebugRawRealloc fills the extra memory with DEADBYTE. |header|---------------------------..unused..|footer| But in case of nested _PyMem_DebugRawRealloc's, the outer one (which reallocates the inner block), overwrites the footer of the outer block. |header1|header2|--------------------..unused..|footer2|footer1| XXX are DEADBYTEs written by the inner allocator, YYY are DEADBYTEs written by the outer allocator. |
Using nested _PyMem_DebugRawRealloc() looks suspicions to me. This may be a bug. But even without nested _PyMem_DebugRawRealloc() writing to the extra memory after using realloc() looks wrong to me. This can break other non-trivial system allocators which write an information past the allocated block. This can cause a segfault if unused memory pages are returned to OS. After creating the PR I have found that it literally restores the code of 2.7 and 3.3. 3.4 and later contain this bug. The bug looks enough serious to me for fixing it in 3.4 and 3.5. |
Ping. |
PyObject_Malloc() calls PyMem_RawMalloc() for allocations larger than 512 bytes. When debug hooks are enabled, PyObject_Malloc() and PyMem_RawMalloc() both call _PyMem_DebugRawRealloc(). The behaviour that you saw is expected. It was simpler to reuse _PyMem_DebugRawRealloc() PyObject and PyMem_Raw allocator families, rather than duplicating the code. |
"I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash." It's the first time that anyone complains about _PyMem_DebugRawRealloc(). The behaviour seems to be very specific to OpenBSD. Even if the current code is "weird", it works very well and is effecient. I proposed attached PR 4119 which makes _PyMem_DebugRawRealloc() behaves correctly: erase bytes *before* calling realloc(), but keeps a copy if realloc() fails. My PR only changes the behaviour on OpenBSD. It keeps the current behaviour (erase bytes *after* realloc, if realloc succeeded) on other platforms for performance reasons. |
The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD. PR 3844 restores the behavior of 2.7 and 3.3. I propose to merge it first, and develop other enhancements later. |
Please don't. This PR reintroduced a bug that I fixed in bpo-18408: commit c426636
|
My PR 4119 only changes the behaviour on OpenBSD, but I'm not sure about that. Maybe it's simpler to apply this fix on all platforms, as I asked on the PR itself? |
In this case it would be safe to not erase bytes at all. PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions. The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode. Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions. |
"the bug should be fixed in all affected versions" I don't understand why you insist to change Python 3.4 and Python 3.5. IMHO this issue only impacts OpenBSD. "The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD." I'm not aware of these "random bugs, hangs and crashes on other platforms". Did you hear someone complaining about random crashes in Python? We are running the Python suite multiple times per time on a large farm of buildbot workers. I never saw the crashes that you mentionned. IMHO it's very unlikely or impossible that _PyMem_DebugRawRealloc() erases bytes of a memory block that was just unallocated while another thread uses this new memory block for a new allocation. Most, if not all, calls to _PyMem_DebugRawRealloc() are protected by the GIL. If there is a single thread using the memory block, I think that it's perfectly fine to write after it's deallocated. Well, maybe I'm plain wrong, and it's possible that shrinking a memory block makes the unallocator memory block really unaccessible, and that the memcpy() after the realloc() triggers a big crash. But I would like to *see* such crash to really be convinced :-) |
"PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions." Which kind of regression do you expect? Something like a crash? "The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode." Can you please elaborate how the exact memory layout can trigger or not bugs in the code? I'm not saying that you are right or wrong. I just fail to see why the memory address and "holes" would trigger or not bugs. What I can understand is a subtle behaviour change if realloc() returns the same memory block or a new memory block. But we cannot control that. I'm not sure that allocating "copy" before realloc() would impact the behaviour of realloc(). The common case is that a memory block is a few bytes smaller, and so the realloc returns the same memory block, but now becomes able to use the unallocated bytes for a new allocation later, no? "Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions." If we must make a choice, I would prefer to keep the current behaviour: make sure that unallocated bytes are erased. Catching code reading unallocated bytes is the most important feature of debug hooks, no? I dislike the idea of only erasing "some" bytes: this change may prevent to detect some bugs. |
The current code OBVIOUSLY is wrong. Bytes are erased if q == oldq && nbytes < original_nbytes. But q == oldq only if realloc() returns the new address 2*sizeof(size_t) bytes larger than its argument. This is virtually never happen on other platforms because _PyMem_DebugRawRealloc() usually used with blocks larger than 2*sizeof(size_t) bytes and the system realloc() don't shrink the block at left (this is implementation detail). Thus this code is virtually dead on other platforms. It doesn't detect shrinking memory block in-place. After fixing *this* bug, we have encountered with *other* bug, related to overwriting the freed memory. I don't see reasons of keeping an obviously wrong code. When fix the first bug we will need to fix the other bug. |
I don't understand this: If realloc() fails, the original buffer |
Ah sorry, you mean it cannot write the special bytes. |
I have removed the incorrect code in master and 3.6, this unblocks testing debug build on OpenBSD. I don't think it is worth to backport this change to 3.5 and 3.4, since the bug affects only debug build. Here is a patch which perhaps is an alternative to PR 4119, but without disturbing the heap allocator. It erases just few bytes at the begin and at the end of the allocated block (including the header and the trailer), and saves erased bytes in small local buffer. It does this even if the block is not reallocated in-place. Hence calling free() on reallocated block will be detected. Reading from the reallocated block will get erased bytes at the start and the end, this is likely will cause to loud failure too. I have tested in on OpenBSD. But I'm not sure that it is worth to apply this patch. Left it on to you Victor. |
I don't quite follow where the write-after-free is happening, but: C's free() function is *not* protected by the GIL. So if you're running in a multithreaded program where other threads aren't blocked by the GIL, one of these other threads could very easily allocate this freshly-freed memory. And if you wrote to it after the memory was allocated to this other thread, congrats, your program is no longer correct. |
While the commit b484d56 fixes a crash on OpenBSD and makes the code "correct" on any platforms, I still see it as a feature regression. Detecting buffer overflow and using freed memory are feature of debug hooks. https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks So yes, debug-realloc.diff is interesting. |
Yet one issue is left. In debug mode two debug allocators are used, the one is nested in the other. Is it correct? |
For allocations larger than 512 bytes, PyObject_Malloc() calls PyMem_RawMalloc(). When debug hooks are installed, PyObject_Malloc() calls a debug hook which calls PyMem_RawMalloc() which calls another debug hook. |
Serhiy: I added a Py_FatalError() to Python 2.7 if the case that must not happen does happen. Are you ok to apply the same change for Python 3.6 (commit ed4743a), or do you prefer to backport your more complex patch using a small buffer allocated on the stack? |
I prefer to keep the current code as is. |
Ok, let me summarize:
Python 3.6 release manager, Ned Deily, rejected the proposal to backport the complex fix from master to 3.6. Victor (me):
Sorry, when I asked the question, I expected that Python 3.6 still erased bytes before realloc(). But it's not the case. I agree that Py_FatalError() would be a bad idea for Python 3.6. Serhiy: Thank you for fixing Python 3.6 (don't memset() *after* realloc, which crashed on OpenBSD) and "fix" the feature in master (restore erased bytes if realloc fails)! I close the issue. I don't think that Python 2.7 or 3.6 need further changes. |
And doesn't detect "use after free" bug when a memory block is expanded. And |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: