Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block #75807

Closed
serhiy-storchaka opened this issue Sep 29, 2017 · 30 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@serhiy-storchaka
Copy link
Member

BPO 31626
Nosy @vstinner, @larryhastings, @skrah, @xdegaye, @serhiy-storchaka, @applio
PRs
  • bpo-31626: Fixed a bug in debug memory allocator. #3844
  • [WIP] bpo-31626: Rewrite _PyMem_DebugRawRealloc() #3898
  • bpo-31626: Fix _PyMem_DebugRawRealloc() on OpenBSD #4119
  • [3.6] bpo-31626: Fixed a bug in debug memory allocator. (GH-3844) #4191
  • bpo-31626: Mark ends of the reallocated block in debug build. #4210
  • bpo-31626: Fix _PyObject_DebugReallocApi() #4310
  • Files
  • debug-realloc.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-11-24.11:08:20.926>
    created_at = <Date 2017-09-29.06:29:10.376>
    labels = ['interpreter-core', '3.7', 'type-crash']
    title = 'Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block'
    updated_at = <Date 2017-11-24.11:57:24.033>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2017-11-24.11:57:24.033>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-11-24.11:08:20.926>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2017-09-29.06:29:10.376>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['47249']
    hgrepos = []
    issue_num = 31626
    keywords = ['patch']
    message_count = 30.0
    messages = ['303302', '303307', '303457', '303460', '304752', '304987', '304988', '304991', '304992', '304993', '304998', '305041', '305043', '305060', '305162', '305163', '305291', '305296', '305332', '305334', '305349', '305736', '305737', '305738', '305743', '306866', '306867', '306877', '306887', '306891']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'larry', 'skrah', 'xdegaye', 'serhiy.storchaka', 'davin']
    pr_nums = ['3844', '3898', '4119', '4191', '4210', '4310']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue31626'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']

    @serhiy-storchaka
    Copy link
    Member Author

    I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash. They crash on importing hashlib. Actually on compiling hashlib docstring. Minimal reproducer is:

    $ ./python -c 'import codecs; codecs.decode(r"\\"*15 + "x"*1906, "unicode-escape")'                  
    assertion "maxchar < 128" failed: file "Objects/unicodeobject.c", line 474, function "_PyUnicode_CheckConsistency"
    Abort trap (core dumped) 

    _PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency.

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Sep 29, 2017
    @vstinner
    Copy link
    Member

    "_PyUnicodeWriter_Finish() calls resize_compact() for truncating the result, but the latter fills the last 15 bytes with \xdb. Since the string is tagged as ASCII, thus crashing in _PyUnicode_CheckConsistency."

    Cool. My code works as expected to catch bugs :-)

    @serhiy-storchaka
    Copy link
    Member Author

    There several bugs in the memory allocator.

    Incorrectly detected the case when realloc() resizes a memory block in-place. Wrong address is used for filling the extra memory with DEADBYTE.

    -    if (q == oldq && nbytes < original_nbytes) {
    +    if (q == oldq - 2*SST && nbytes < original_nbytes) {
             /* shrinking:  mark old extra memory dead */
    -        memset(q + nbytes, DEADBYTE, original_nbytes - nbytes);
    +        memset(q + 2*SST + nbytes, DEADBYTE, original_nbytes - nbytes);
         }

    But fixing this exposes other problem. _PyMem_DebugRawRealloc() is called recursively. _PyMem_DebugRawRealloc calls api->alloc.realloc which is _PyMem_DebugRawRealloc. There are two nested debug allocators. The block is nested in other block, both have their own header and footer.

    |header1|header2|------------------------------|footer2|footer1|

    _PyMem_DebugRawRealloc fills the extra memory with DEADBYTE.

    |header|---------------------------..unused..|footer|
    |header|---------------------------|footer|XXXXXXXXX|

    But in case of nested _PyMem_DebugRawRealloc's, the outer one (which reallocates the inner block), overwrites the footer of the outer block.

    |header1|header2|--------------------..unused..|footer2|footer1|
    |header1|header2|--------------------..unused|footer1|XXXXXXXXX| after inner realloc
    |header1|header2|--------------------|footer2|YYYYYYYYY|XXXXXXX| after outher realloc

    XXX are DEADBYTEs written by the inner allocator, YYY are DEADBYTEs written by the outer allocator.

    @serhiy-storchaka
    Copy link
    Member Author

    Using nested _PyMem_DebugRawRealloc() looks suspicions to me. This may be a bug.

    But even without nested _PyMem_DebugRawRealloc() writing to the extra memory after using realloc() looks wrong to me. This can break other non-trivial system allocators which write an information past the allocated block. This can cause a segfault if unused memory pages are returned to OS.

    After creating the PR I have found that it literally restores the code of 2.7 and 3.3. 3.4 and later contain this bug. The bug looks enough serious to me for fixing it in 3.4 and 3.5.

    @serhiy-storchaka serhiy-storchaka changed the title Crash in _PyUnicode_DecodeUnicodeEscape on OpenBSD Writing in freed memory in _PyMem_DebugRawRealloc() after shrinking a memory block Oct 1, 2017
    @serhiy-storchaka
    Copy link
    Member Author

    Ping.

    @vstinner
    Copy link
    Member

    Using nested _PyMem_DebugRawRealloc() looks suspicions to me. This may be a bug.

    PyObject_Malloc() calls PyMem_RawMalloc() for allocations larger than 512 bytes.

    When debug hooks are enabled, PyObject_Malloc() and PyMem_RawMalloc() both call _PyMem_DebugRawRealloc(). The behaviour that you saw is expected.

    It was simpler to reuse _PyMem_DebugRawRealloc() PyObject and PyMem_Raw allocator families, rather than duplicating the code.

    @vstinner
    Copy link
    Member

    "I tried to build CPython on 64-bit OpenBSD. It was built successfully, but tests crash."

    It's the first time that anyone complains about _PyMem_DebugRawRealloc(). The behaviour seems to be very specific to OpenBSD.

    Even if the current code is "weird", it works very well and is effecient.

    I proposed attached PR 4119 which makes _PyMem_DebugRawRealloc() behaves correctly: erase bytes *before* calling realloc(), but keeps a copy if realloc() fails. My PR only changes the behaviour on OpenBSD. It keeps the current behaviour (erase bytes *after* realloc, if realloc succeeded) on other platforms for performance reasons.

    @serhiy-storchaka
    Copy link
    Member Author

    The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD.

    PR 3844 restores the behavior of 2.7 and 3.3. I propose to merge it first, and develop other enhancements later.

    @vstinner
    Copy link
    Member

    PR 3844 restores the behavior of 2.7 and 3.3. I propose to merge it first, and develop other enhancements later.

    Please don't. This PR reintroduced a bug that I fixed in bpo-18408:

    commit c426636
    Author: Victor Stinner <victor.stinner@gmail.com>
    Date: Tue Jul 9 00:44:43 2013 +0200

    Issue bpo-18408: Fix _PyMem_DebugRealloc()
    
    Don't mark old extra memory dead before calling realloc(). realloc() can fail
    and realloc() must not touch the original buffer on failure.
    
    So mark old extra memory dead only on success if the new buffer did not move
    (has the same address).
    

    @vstinner
    Copy link
    Member

    The current code in not correct on all platforms.

    My PR 4119 only changes the behaviour on OpenBSD, but I'm not sure about that. Maybe it's simpler to apply this fix on all platforms, as I asked on the PR itself?

    @serhiy-storchaka
    Copy link
    Member Author

    In this case it would be safe to not erase bytes at all.

    PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions. The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode.

    Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions.

    @vstinner
    Copy link
    Member

    "the bug should be fixed in all affected versions"

    I don't understand why you insist to change Python 3.4 and Python 3.5. IMHO this issue only impacts OpenBSD.

    "The current code in not correct on all platforms. We don't know how many of random bugs, hangs and crashes on other platforms are caused by this bug. I'm not surprised that it was caught on OpenBSD since the robustness and security of software is the goal of OpenBSD."

    I'm not aware of these "random bugs, hangs and crashes on other platforms". Did you hear someone complaining about random crashes in Python?

    We are running the Python suite multiple times per time on a large farm of buildbot workers. I never saw the crashes that you mentionned.

    IMHO it's very unlikely or impossible that _PyMem_DebugRawRealloc() erases bytes of a memory block that was just unallocated while another thread uses this new memory block for a new allocation. Most, if not all, calls to _PyMem_DebugRawRealloc() are protected by the GIL. If there is a single thread using the memory block, I think that it's perfectly fine to write after it's deallocated.

    Well, maybe I'm plain wrong, and it's possible that shrinking a memory block makes the unallocator memory block really unaccessible, and that the memcpy() after the realloc() triggers a big crash. But I would like to *see* such crash to really be convinced :-)

    @vstinner
    Copy link
    Member

    "PR 4119 is too complex for a bugfix (especially if backport it to 3.5 and 3.4). It can introduce other regressions."

    Which kind of regression do you expect? Something like a crash?

    "The performance hit is not the only issue. Allocating a temporary buffer can change the structure of "holes" in memory. As result some memory related bugs can be reproducible only in release mode."

    Can you please elaborate how the exact memory layout can trigger or not bugs in the code?

    I'm not saying that you are right or wrong. I just fail to see why the memory address and "holes" would trigger or not bugs.

    What I can understand is a subtle behaviour change if realloc() returns the same memory block or a new memory block. But we cannot control that.

    I'm not sure that allocating "copy" before realloc() would impact the behaviour of realloc(). The common case is that a memory block is a few bytes smaller, and so the realloc returns the same memory block, but now becomes able to use the unallocated bytes for a new allocation later, no?

    "Maybe it is enough to erase only few bytes at the start and end of the freed area. The copy can be saved in local variables, without involving the heap. This solution still will be enough complex, and I think it can be applied only to 3.7. But the bug should be fixed in all affected versions."

    If we must make a choice, I would prefer to keep the current behaviour: make sure that unallocated bytes are erased. Catching code reading unallocated bytes is the most important feature of debug hooks, no?

    I dislike the idea of only erasing "some" bytes: this change may prevent to detect some bugs.

    @serhiy-storchaka
    Copy link
    Member Author

    The current code OBVIOUSLY is wrong. Bytes are erased if q == oldq && nbytes < original_nbytes. But q == oldq only if realloc() returns the new address 2*sizeof(size_t) bytes larger than its argument. This is virtually never happen on other platforms because _PyMem_DebugRawRealloc() usually used with blocks larger than 2*sizeof(size_t) bytes and the system realloc() don't shrink the block at left (this is implementation detail). Thus this code is virtually dead on other platforms. It doesn't detect shrinking memory block in-place.

    After fixing *this* bug, we have encountered with *other* bug, related to overwriting the freed memory.

    I don't see reasons of keeping an obviously wrong code. When fix the first bug we will need to fix the other bug.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 28, 2017

    realloc() must not touch the original buffer on failure

    I don't understand this: If realloc() fails, the original buffer
    is perfectly valid.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 28, 2017

    Ah sorry, you mean it cannot write the special bytes.

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset b484d56 by Serhiy Storchaka in branch 'master':
    bpo-31626: Fixed a bug in debug memory allocator. (bpo-3844)
    b484d56

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset ece5659 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
    bpo-31626: Fixed a bug in debug memory allocator. (GH-3844) (bpo-4191)
    ece5659

    @serhiy-storchaka
    Copy link
    Member Author

    I have removed the incorrect code in master and 3.6, this unblocks testing debug build on OpenBSD. I don't think it is worth to backport this change to 3.5 and 3.4, since the bug affects only debug build.

    Here is a patch which perhaps is an alternative to PR 4119, but without disturbing the heap allocator. It erases just few bytes at the begin and at the end of the allocated block (including the header and the trailer), and saves erased bytes in small local buffer. It does this even if the block is not reallocated in-place. Hence calling free() on reallocated block will be detected. Reading from the reallocated block will get erased bytes at the start and the end, this is likely will cause to loud failure too. I have tested in on OpenBSD.

    But I'm not sure that it is worth to apply this patch. Left it on to you Victor.

    @larryhastings
    Copy link
    Contributor

    Most, if not all, calls to _PyMem_DebugRawRealloc() are protected by
    the GIL. If there is a single thread using the memory block,
    I think that it's perfectly fine to write after it's deallocated.

    I don't quite follow where the write-after-free is happening, but: C's free() function is *not* protected by the GIL. So if you're running in a multithreaded program where other threads aren't blocked by the GIL, one of these other threads could very easily allocate this freshly-freed memory. And if you wrote to it after the memory was allocated to this other thread, congrats, your program is no longer correct.

    @vstinner
    Copy link
    Member

    vstinner commented Nov 1, 2017

    While the commit b484d56 fixes a crash on OpenBSD and makes the code "correct" on any platforms, I still see it as a feature regression. Detecting buffer overflow and using freed memory are feature of debug hooks.

    https://docs.python.org/dev/c-api/memory.html#c.PyMem_SetupDebugHooks

    So yes, debug-realloc.diff is interesting.

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 3cc4c53 by Serhiy Storchaka in branch 'master':
    bpo-31626: Mark ends of the reallocated block in debug build. (bpo-4210)
    3cc4c53

    @serhiy-storchaka
    Copy link
    Member Author

    There are large discussions at PR 3844 and PR 4210.

    @serhiy-storchaka
    Copy link
    Member Author

    Yet one issue is left. In debug mode two debug allocators are used, the one is nested in the other. Is it correct?

    @vstinner
    Copy link
    Member

    vstinner commented Nov 7, 2017

    For allocations larger than 512 bytes, PyObject_Malloc() calls PyMem_RawMalloc(). When debug hooks are installed, PyObject_Malloc() calls a debug hook which calls PyMem_RawMalloc() which calls another debug hook.

    @vstinner
    Copy link
    Member

    New changeset ed4743a by Victor Stinner in branch '2.7':
    bpo-31626: Fix _PyObject_DebugReallocApi() (bpo-4310)
    ed4743a

    @vstinner
    Copy link
    Member

    Serhiy: I added a Py_FatalError() to Python 2.7 if the case that must not happen does happen. Are you ok to apply the same change for Python 3.6 (commit ed4743a), or do you prefer to backport your more complex patch using a small buffer allocated on the stack?

    @serhiy-storchaka
    Copy link
    Member Author

    I prefer to keep the current code as is.

    @vstinner
    Copy link
    Member

    Ok, let me summarize:

    • 2.7: memset() *before* realloc() if shrinking a memory block, but crash when Py_FatalError() if realloc() fails on shrinking --> cannot corrupt memory, but can crash if realloc() fails on shrinking

    • master (3.7): memset() *before* realloc() if shrinking a memory block, but save erased bytes, and restore erased bytes if realloc() fails --> always correct

    • 3.6: don't memset() --> correct, but don't detect "use after free" bug when a memory block is skrinked

    Python 3.6 release manager, Ned Deily, rejected the proposal to backport the complex fix from master to 3.6.

    Victor (me):

    Serhiy: I added a Py_FatalError() to Python 2.7 if the case that must not happen does happen. Are you ok to apply the same change for Python 3.6 (...)

    Sorry, when I asked the question, I expected that Python 3.6 still erased bytes before realloc(). But it's not the case. I agree that Py_FatalError() would be a bad idea for Python 3.6.

    Serhiy: Thank you for fixing Python 3.6 (don't memset() *after* realloc, which crashed on OpenBSD) and "fix" the feature in master (restore erased bytes if realloc fails)!

    I close the issue. I don't think that Python 2.7 or 3.6 need further changes.

    @serhiy-storchaka
    Copy link
    Member Author

    • 3.6: don't memset() --> correct, but don't detect "use after free" bug
      when a memory block is skrinked

    And doesn't detect "use after free" bug when a memory block is expanded. And
    2.7 doesn't detect this. Only 3.7 detects this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants