classification
Title: Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: davin, larry, serhiy.storchaka, skrah, vstinner, xdegaye
Priority: normal Keywords: patch

Created on 2017-09-29 06:29 by serhiy.storchaka, last changed 2017-11-24 11:57 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
debug-realloc.diff serhiy.storchaka, 2017-10-31 19:23
Pull Requests
URL Status Linked Edit
PR 3844 merged serhiy.storchaka, 2017-10-01 12:51
PR 3898 closed vstinner, 2017-10-05 16:46
PR 4119 closed vstinner, 2017-10-25 14:15
PR 4191 merged python-dev, 2017-10-31 12:06
PR 4210 merged serhiy.storchaka, 2017-11-01 10:00
PR 4310 merged vstinner, 2017-11-07 10:57
Messages (30)
msg303302 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 06:29
I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash. They crash on importing hashlib. Actually on compiling hashlib docstring. Minimal reproducer is:

$ ./python -c 'import codecs; codecs.decode(r"\\"*15 + "x"*1906, "unicode-escape")'                  
assertion "maxchar < 128" failed: file "Objects/unicodeobject.c", line 474, function "_PyUnicode_CheckConsistency"
Abort trap (core dumped) 

_PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency.
msg303307 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-29 07:58
"_PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency."

Cool. My code works as expected to catch bugs :-)
msg303457 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-01 12:02
There several bugs in the memory allocator.

Incorrectly detected the case when realloc() resizes a memory block in-place. Wrong address is used for filling the extra memory with DEADBYTE.

-    if (q == oldq && nbytes < original_nbytes) {
+    if (q == oldq - 2*SST && nbytes < original_nbytes) {
         /* shrinking:  mark old extra memory dead */
-        memset(q + nbytes, DEADBYTE, original_nbytes - nbytes);
+        memset(q + 2*SST + nbytes, DEADBYTE, original_nbytes - nbytes);
     }

But fixing this exposes other problem. _PyMem_DebugRawRealloc() is called recursively. _PyMem_DebugRawRealloc calls api->alloc.realloc which is _PyMem_DebugRawRealloc. There are two nested debug allocators. The block is nested in other block, both have their own header and footer.

|header1|header2|------------------------------|footer2|footer1|

_PyMem_DebugRawRealloc fills the extra memory with DEADBYTE.

|header|---------------------------..unused..|footer|
|header|---------------------------|footer|XXXXXXXXX|

But in case of nested _PyMem_DebugRawRealloc's, the outer one (which reallocates the inner block), overwrites the footer of the outer block.

|header1|header2|--------------------..unused..|footer2|footer1|
|header1|header2|--------------------..unused|footer1|XXXXXXXXX| after inner realloc
|header1|header2|--------------------|footer2|YYYYYYYYY|XXXXXXX| after outher realloc

XXX are DEADBYTEs written by the inner allocator, YYY are DEADBYTEs written by the outer allocator.
msg303460 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-01 13:02
Using nested _PyMem_DebugRawRealloc() looks suspicions to me. This may be a bug.

But even without nested _PyMem_DebugRawRealloc() writing to the extra memory after using realloc() looks wrong to me. This can break other non-trivial system allocators which write an information past the allocated block. This can cause a segfault if unused memory pages are returned to OS.

After creating the PR I have found that it literally restores the code of 2.7 and 3.3. 3.4 and later contain this bug. The bug looks enough serious to me for fixing it in 3.4 and 3.5.
msg304752 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-22 14:41
Ping.
msg304987 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 13:55
> Using nested _PyMem_DebugRawRealloc() looks suspicions to me. This may be a bug.

PyObject_Malloc() calls PyMem_RawMalloc() for allocations larger than 512 bytes.

When debug hooks are enabled, PyObject_Malloc() and PyMem_RawMalloc() both call _PyMem_DebugRawRealloc(). The behaviour that you saw is expected.

It was simpler to reuse _PyMem_DebugRawRealloc() PyObject and PyMem_Raw allocator families, rather than duplicating the code.
msg304988 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 14:18
"I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash."

It's the first time that anyone complains about _PyMem_DebugRawRealloc(). The behaviour seems to be very specific to OpenBSD.

Even if the current code is "weird", it works very well and is effecient.

I proposed attached PR 4119 which makes _PyMem_DebugRawRealloc() behaves correctly: erase bytes *before* calling realloc(), but keeps a copy if realloc() fails. My PR only changes the behaviour on OpenBSD. It keeps the current behaviour (erase bytes *after* realloc, if realloc succeeded) on other platforms for performance reasons.
msg304991 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-25 14:55
The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD.

PR 3844 restores the behavior of 2.7 and 3.3. I propose to merge it first, and develop other enhancements later.
msg304992 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 15:00
> PR 3844 restores the behavior of 2.7 and 3.3. I propose to merge it first, and develop other enhancements later.

Please don't. This PR reintroduced a bug that I fixed in bpo-18408:

commit c4266360fc70745d49b09f2c29cda91c1a007525
Author: Victor Stinner <victor.stinner@gmail.com>
Date:   Tue Jul 9 00:44:43 2013 +0200

    Issue #18408: Fix _PyMem_DebugRealloc()
    
    Don't mark old extra memory dead before calling realloc(). realloc() can fail
    and realloc() must not touch the original buffer on failure.
    
    So mark old extra memory dead only on success if the new buffer did not move
    (has the same address).
msg304993 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-25 15:02
> The current code in not correct on all platforms.

My PR 4119 only changes the behaviour on OpenBSD, but I'm not sure about that. Maybe it's simpler to apply this fix on all platforms, as I asked on the PR itself?
msg304998 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-25 16:09
In this case it would be safe to not erase bytes at all.

PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions. The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode.

Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions.
msg305041 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-26 10:03
"the bug should be fixed in all affected versions"

I don't understand why you insist to change Python 3.4 and Python 3.5. IMHO this issue only impacts OpenBSD.


"The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD."

I'm not aware of these "random bugs, hangs and crashes on other platforms". Did you hear someone complaining about random crashes in Python?

We are running the Python suite multiple times per time on a large farm of buildbot workers. I  never saw the crashes that you mentionned.

IMHO it's very unlikely or impossible that _PyMem_DebugRawRealloc() erases bytes of a memory block that was just unallocated while another thread uses this new memory block for a new allocation. Most, if not all, calls to _PyMem_DebugRawRealloc() are protected by the GIL. If there is a single thread using the memory block, I think that it's perfectly fine to write after it's deallocated.

Well, maybe I'm plain wrong, and it's possible that shrinking a memory block makes the unallocator memory block really unaccessible, and that the memcpy() after the realloc() triggers a big crash. But I would like to *see* such crash to really be convinced :-)
msg305043 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-26 10:09
"PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions."

Which kind of regression do you expect? Something like a crash?


"The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode."

Can you please elaborate how the exact memory layout can trigger or not bugs in the code?

I'm not saying that you are right or wrong. I just fail to see why the memory address and "holes" would trigger or not bugs.

What I can understand is a subtle behaviour change if realloc() returns the same memory block or a new memory block. But we cannot control that.

I'm not sure that allocating "copy" before realloc() would impact the behaviour of realloc(). The common case is that a memory block is a few bytes smaller, and so the realloc returns the same memory block, but now becomes able to use the unallocated bytes for a new allocation later, no?


"Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions."

If we must make a choice, I would prefer to keep the current behaviour: make sure that unallocated bytes are erased. Catching code reading unallocated bytes is the most important feature of debug hooks, no?

I dislike the idea of only erasing "some" bytes: this change may prevent to detect some bugs.
msg305060 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-26 14:35
The current code OBVIOUSLY is wrong. Bytes are erased if q == oldq && nbytes < original_nbytes. But q == oldq only if realloc() returns the new address 2*sizeof(size_t) bytes larger than its argument. This is virtually never happen on other platforms because _PyMem_DebugRawRealloc() usually used with blocks larger than 2*sizeof(size_t) bytes and the system realloc() don't shrink the block at left (this is implementation detail). Thus this code is virtually dead on other platforms. It doesn't detect shrinking memory block in-place.

After fixing *this* bug, we have encountered with *other* bug, related to overwriting the freed memory.

I don't see reasons of keeping an obviously wrong code. When fix the first bug we will need to fix the other bug.
msg305162 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-28 14:26
> realloc() must not touch the original buffer on failure

I don't understand this: If realloc() fails, the original buffer
is perfectly valid.
msg305163 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-28 14:31
Ah sorry, you mean it cannot write the special bytes.
msg305291 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-31 12:05
New changeset b484d5606ca76f9bbd0f5de7a6ef753400213e94 by Serhiy Storchaka in branch 'master':
bpo-31626: Fixed a bug in debug memory allocator. (#3844)
https://github.com/python/cpython/commit/b484d5606ca76f9bbd0f5de7a6ef753400213e94
msg305296 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-31 13:58
New changeset ece5659565e083baaee4d185ce181a98aaee7f96 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
bpo-31626: Fixed a bug in debug memory allocator. (GH-3844) (#4191)
https://github.com/python/cpython/commit/ece5659565e083baaee4d185ce181a98aaee7f96
msg305332 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-31 19:23
I have removed the incorrect code in master and 3.6, this unblocks testing debug build on OpenBSD. I don't think it is worth to backport this change to 3.5 and 3.4, since the bug affects only debug build.

Here is a patch which perhaps is an alternative to PR 4119, but without disturbing the heap allocator. It erases just few bytes at the begin and at the end of the allocated block (including the header and the trailer), and saves erased bytes in small local buffer. It does this even if the block is not reallocated in-place. Hence calling free() on reallocated block will be detected. Reading from the reallocated block will get erased bytes at the start and the end, this is likely will cause to loud failure too. I have tested in on OpenBSD.

But I'm not sure that it is worth to apply this patch. Left it on to you Victor.
msg305334 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2017-10-31 19:32
> Most, if not all, calls to _PyMem_DebugRawRealloc() are protected by
> the GIL. If there is a single thread using the memory block,
> I think that it's perfectly fine to write after it's deallocated.

I don't quite follow where the write-after-free is happening, but: C's free() function is *not* protected by the GIL.  So if you're running in a multithreaded program where other threads aren't blocked by the GIL, one of these other threads could very easily allocate this freshly-freed memory.  And if you wrote to it after the memory was allocated to this other thread, congrats, your program is no longer correct.
msg305349 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-01 01:56
While the commit b484d5606ca76f9bbd0f5de7a6ef753400213e94 fixes a crash on OpenBSD and makes the code "correct" on any platforms, I still see it as a feature regression. Detecting buffer overflow and using freed memory are feature of debug hooks.

https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks

So yes, debug-realloc.diff is interesting.
msg305736 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-07 10:46
New changeset 3cc4c53a64bdcf21431ad306eca0e568f88735a2 by Serhiy Storchaka in branch 'master':
bpo-31626: Mark ends of the reallocated block in debug build. (#4210)
https://github.com/python/cpython/commit/3cc4c53a64bdcf21431ad306eca0e568f88735a2
msg305737 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-07 10:48
There are large discussions at PR 3844 and PR 4210.
msg305738 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-07 10:57
Yet one issue is left. In debug mode two debug allocators are used, the one is nested in the other. Is it correct?
msg305743 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-07 12:31
For allocations larger than 512 bytes, PyObject_Malloc() calls PyMem_RawMalloc(). When debug hooks are installed, PyObject_Malloc() calls a debug hook which calls PyMem_RawMalloc() which calls another debug hook.
msg306866 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-24 01:13
New changeset ed4743a2f2bb8d88f7f1aa9307794be9a44f1e0f by Victor Stinner in branch '2.7':
bpo-31626: Fix _PyObject_DebugReallocApi() (#4310)
https://github.com/python/cpython/commit/ed4743a2f2bb8d88f7f1aa9307794be9a44f1e0f
msg306867 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-24 01:15
Serhiy: I added a Py_FatalError() to Python 2.7 if the case that must not happen does happen. Are you ok to apply the same change for Python 3.6 (commit ed4743a2f2bb8d88f7f1aa9307794be9a44f1e0f), or do you prefer to backport your more complex patch using a small buffer allocated on the stack?
msg306877 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-24 06:12
I prefer to keep the current code as is.
msg306887 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-24 11:08
Ok, let me summarize:

* 2.7: memset() *before* realloc() if shrinking a memory block, but crash when Py_FatalError() if realloc() fails on shrinking --> cannot corrupt memory, but can crash if realloc() fails on shrinking

* master (3.7): memset() *before* realloc() if shrinking a memory block, but save erased bytes, and restore erased bytes if realloc() fails --> always correct

* 3.6: don't memset() --> correct, but don't detect "use after free" bug when a memory block is skrinked

Python 3.6 release manager, Ned Deily, rejected the proposal to backport the complex fix from master to 3.6.


Victor (me):
> Serhiy: I added a Py_FatalError() to Python 2.7 if the case that must not happen does happen. Are you ok to apply the same change for Python 3.6 (...)

Sorry, when I asked the question, I expected that Python 3.6 still erased bytes before realloc(). But it's not the case. I agree that Py_FatalError() would be a bad idea for Python 3.6.


Serhiy: Thank you for fixing Python 3.6 (don't memset() *after* realloc, which crashed on OpenBSD) and "fix" the feature in master (restore erased bytes if realloc fails)!


I close the issue. I don't think that Python 2.7 or 3.6 need further changes.
msg306891 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-24 11:57
> * 3.6: don't memset() --> correct, but don't detect "use after free" bug
> when a memory block is skrinked

And doesn't detect  "use after free" bug when a memory block is expanded. And 
2.7 doesn't detect this. Only 3.7 detects this.
History
Date User Action Args
2017-11-24 11:57:24serhiy.storchakasetmessages: + msg306891
2017-11-24 11:08:27vstinnersetversions: + Python 2.7
2017-11-24 11:08:20vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg306887

stage: patch review -> resolved
2017-11-24 06:12:39serhiy.storchakasetmessages: + msg306877
2017-11-24 01:15:15vstinnersetmessages: + msg306867
2017-11-24 01:13:28vstinnersetmessages: + msg306866
2017-11-07 12:31:41vstinnersetmessages: + msg305743
2017-11-07 10:57:27serhiy.storchakasetmessages: + msg305738
2017-11-07 10:57:05vstinnersetpull_requests: + pull_request4271
2017-11-07 10:48:42serhiy.storchakasetmessages: + msg305737
2017-11-07 10:46:45serhiy.storchakasetmessages: + msg305736
2017-11-01 10:00:37serhiy.storchakasetpull_requests: + pull_request4179
2017-11-01 01:56:15vstinnersetmessages: + msg305349
2017-10-31 19:34:19serhiy.storchakasetpriority: critical -> normal
2017-10-31 19:32:09larrysetmessages: + msg305334
2017-10-31 19:23:23serhiy.storchakasetfiles: + debug-realloc.diff

messages: + msg305332
versions: - Python 3.4, Python 3.5
2017-10-31 13:58:37serhiy.storchakasetmessages: + msg305296
2017-10-31 12:06:11python-devsetpull_requests: + pull_request4161
2017-10-31 12:05:05serhiy.storchakasetmessages: + msg305291
2017-10-28 14:31:29skrahsetmessages: + msg305163
2017-10-28 14:26:35skrahsetnosy: + skrah
messages: + msg305162
2017-10-28 10:27:51xdegayesetnosy: + xdegaye
2017-10-26 14:35:15serhiy.storchakasetmessages: + msg305060
2017-10-26 10:09:26vstinnersetmessages: + msg305043
2017-10-26 10:03:13vstinnersetmessages: + msg305041
2017-10-25 16:09:22serhiy.storchakasetmessages: + msg304998
2017-10-25 15:02:58vstinnersetmessages: + msg304993
2017-10-25 15:00:21vstinnersetmessages: + msg304992
2017-10-25 14:55:08serhiy.storchakasetmessages: + msg304991
2017-10-25 14:18:51vstinnersetmessages: + msg304988
2017-10-25 14:15:59vstinnersetpull_requests: + pull_request4089
2017-10-25 13:55:11vstinnersetmessages: + msg304987
2017-10-22 14:41:21serhiy.storchakasetmessages: + msg304752
2017-10-05 16:46:52vstinnersetpull_requests: + pull_request3870
2017-10-01 13:02:30serhiy.storchakasetnosy: + larry
title: Crash in _PyUnicode_DecodeUnicodeEscape on OpenBSD -> Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block
messages: + msg303460

versions: + Python 3.4, Python 3.5, Python 3.6
2017-10-01 12:51:55serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request3825
2017-10-01 12:02:27serhiy.storchakasetpriority: normal -> critical
2017-10-01 12:02:14serhiy.storchakasetmessages: + msg303457
2017-09-29 09:45:52serhiy.storchakasetnosy: + davin
2017-09-29 07:58:08vstinnersetmessages: + msg303307
2017-09-29 06:52:45serhiy.storchakasettype: crash
2017-09-29 06:29:10serhiy.storchakacreate