classification
Title: Debug pymalloc crash when using os.fork() [regression]
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: CendioOssman, colesbury, lukasz.langa, miss-islington, stestagg, vstinner
Priority: normal Keywords: patch

Created on 2020-12-02 14:22 by CendioOssman, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 29564 merged colesbury, 2021-11-15 20:33
PR 29599 merged miss-islington, 2021-11-17 20:51
PR 29600 merged colesbury, 2021-11-17 21:04
Messages (9)
msg382314 - (view) Author: Pierre Ossman (CendioOssman) Date: 2020-12-02 14:22
A Python equivalent of the classic daemon() call started throwing an error in 3.8 when the pymalloc debug hooks are active. If tracing is also active, it segfaults.

This simple example triggers it:


import os

def daemon():
    pid = os.fork()
    if pid != 0:
        os._exit(0)

daemon()


The error given is:

Debug memory block at address p=0xf013d0: API '1'
    0 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0x00 *** OUCH
        at p-6: 0x00 *** OUCH
        at p-5: 0x00 *** OUCH
        at p-4: 0x00 *** OUCH
        at p-3: 0x00 *** OUCH
        at p-2: 0x00 *** OUCH
        at p-1: 0x00 *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0xf013d0 are not all FORBIDDENBYTE (0xfd):
        at tail+0: 0x01 *** OUCH
        at tail+1: 0x00 *** OUCH
        at tail+2: 0x00 *** OUCH
        at tail+3: 0x00 *** OUCH
        at tail+4: 0x00 *** OUCH
        at tail+5: 0x00 *** OUCH
        at tail+6: 0x00 *** OUCH
        at tail+7: 0x00 *** OUCH

Enable tracemalloc to get the memory block allocation traceback

Fatal Python error: bad ID: Allocated using API '1', verified using API 'r'
Python runtime state: finalizing (tstate=0xf023b0)



Tested on Fedora, Ubuntu and RHEL with the same behaviour everywhere. Everything before 3.8 works fine; 3.8 and 3.9 both exhibit the issue.

Since this is a very standard way of starting a daemon, it should affect quite a few users. At the very least, it makes it annoying to use development mode to catch other issues.
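
For anyone trying to reproduce this, here is a minimal sketch (my own, not from the report) that runs the snippet above in a subprocess with the pymalloc debug hooks, and optionally tracemalloc, enabled via the documented PYTHONMALLOC and PYTHONTRACEMALLOC environment variables. The run_repro helper name is made up, and the sketch assumes a POSIX platform with os.fork().

```
import os
import subprocess
import sys
import textwrap

# The reporter's snippet, run in a child interpreter so the environment
# variables below affect only that process.
REPRO = textwrap.dedent("""
    import os

    def daemon():
        pid = os.fork()
        if pid != 0:
            os._exit(0)

    daemon()
""")

def run_repro(with_tracemalloc=False):
    env = dict(os.environ, PYTHONMALLOC="pymalloc_debug")
    if with_tracemalloc:
        env["PYTHONTRACEMALLOC"] = "1"
    proc = subprocess.run(
        [sys.executable, "-c", REPRO],
        env=env, capture_output=True, text=True,
    )
    # The exit code is the parent's (it calls os._exit(0)); the debug-hook
    # report from the forked child ends up on the captured stderr.
    print("exit code:", proc.returncode)
    print(proc.stderr)

if __name__ == "__main__":
    run_repro()                       # debug hooks only
    run_repro(with_tracemalloc=True)  # debug hooks + tracemalloc
```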
msg382612 - (view) Author: Steve Stagg (stestagg) Date: 2020-12-06 22:10
Reproducible on master, but doesn't occur with a debug build.

My configure and run commands:

> ./configure --prefix=$PREFIX --exec-prefix=$EPREFIX --cache-file=../config.cache --without-ensurepip > /dev/null
> PYTHONMALLOC=pymalloc_debug $EPREFIX/bin/python3 ../test.py

The crash comes from the child process; I modified the example to attach lldb to the child:

```
import os
import time
import subprocess

def daemon():
    pid = os.fork()
    if pid != 0:
        subprocess.run(["lldb", '-p', str(pid)])
        time.sleep(10)
        os._exit(0)
    time.sleep(1)

daemon()
```


Attaching to the child gives this backtrace:

(lldb) bt
* thread #1, name = 'python3', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
    frame #0: 0x0000564a5a325f9c python3.10`_PyObject_DebugDumpAddress(p=0x0000564a5aa2c630) at obmalloc.c:2457:12
    frame #1: 0x0000564a5a326571 python3.10`_PyMem_DebugCheckAddress(func="_PyMem_DebugRawFree", api=<unavailable>, p=<unavailable>) at obmalloc.c:2375:9
  * frame #2: 0x0000564a5a326d29 python3.10`_PyMem_DebugRawFree at obmalloc.c:2208:5
    frame #3: 0x0000564a5a326d17 python3.10`_PyMem_DebugRawFree(ctx=0x0000564a5a5d37e0, p=0x0000564a5aa2c630) at obmalloc.c:2197
    frame #4: 0x0000564a5a3d4aa3 python3.10`PyInterpreterState_Delete(interp=0x0000564a5aa4e6b0) at pystate.c:396:9
    frame #5: 0x0000564a5a3d1921 python3.10`Py_FinalizeEx [inlined] finalize_interp_delete(tstate=<unavailable>) at pylifecycle.c:1633:5
    frame #6: 0x0000564a5a3d190b python3.10`Py_FinalizeEx at pylifecycle.c:1782
    frame #7: 0x0000564a5a2dab80 python3.10`Py_BytesMain at main.c:679:9
    frame #8: 0x0000564a5a2dab69 python3.10`Py_BytesMain at main.c:707
    frame #9: 0x0000564a5a2dab4f python3.10`Py_BytesMain(argc=<unavailable>, argv=<unavailable>) at main.c:731
    frame #10: 0x00007ff2e6251152 libc.so.6`__libc_start_main + 242
    frame #11: 0x0000564a5a2d978e python3.10`_start + 46


The call stack is trying to free an interpreter mutex:
    PyThread_free_lock(interp->id_mutex);
msg382613 - (view) Author: Steve Stagg (stestagg) Date: 2020-12-06 22:34
Something seems to be breaking alignment, changing the allocator used, or trashing that memory.

In my case, the address of the mutex is 0x5603a3666630 (marked (*1) below),
and the surrounding memory looks like this:

0x5603a3666610: n\0\0\03\0\0\0\0\0\0\0(*2)\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd1\0\0\0\0\0\0\0
0x5603a3666630: (*1)\x01\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd
0x5603a3666650: \xdd\xdd\xdd\xdd\xdd\xdd\xdd\xddQ\0\0\0\0\0\0\0\x90ds\xa3\x03V\0\0\xf0\xc4\xfbq\a\x7f\0\0

The fun thing is that the nbytes field lines up at (*2), which is full of DEADBYTEs (0xdd), so the check tries to find the tail of the allocation at p + 15987178197214944733 (which happily segfaults).
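
As a quick sanity check on that arithmetic (my own illustration, not part of the message): eight DEADBYTE (0xdd) bytes, read as the size field of the debug header, give exactly the bogus tail offset quoted above. Since all eight bytes are identical, the byte order does not matter here.

```
DEADBYTE = 0xDD

# Eight 0xdd bytes interpreted as a size_t-style integer.
bogus_nbytes = int.from_bytes(bytes([DEADBYTE]) * 8, "big")
print(bogus_nbytes)  # 15987178197214944733
assert bogus_nbytes == 15987178197214944733
```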
msg382614 - (view) Author: Steve Stagg (stestagg) Date: 2020-12-06 23:01
So, I'm not an allocator/init/teardown expert, but it looks like this: when you fork, _PyRuntimeState_ReInitThreads creates a new mutex, explicitly using the default allocator (without the debug hooks active):


#ifdef HAVE_FORK
/* This function is called from PyOS_AfterFork_Child to ensure that
   newly created child processes do not share locks with the parent. */
PyStatus
_PyRuntimeState_ReInitThreads(_PyRuntimeState *runtime)
{
    // This was initially set in _PyRuntimeState_Init().
    runtime->main_thread = PyThread_get_thread_ident();

    /* Force default allocator, since _PyRuntimeState_Fini() must
       use the same allocator than this function. */
    PyMemAllocatorEx old_alloc;
    _PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW, &old_alloc);

    int reinit_interp = _PyThread_at_fork_reinit(&runtime->interpreters.mutex);
    int reinit_main_id = _PyThread_at_fork_reinit(&runtime->interpreters.main->id_mutex);
    int reinit_xidregistry = _PyThread_at_fork_reinit(&runtime->xidregistry.mutex);

    PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &old_alloc);

    if (reinit_interp < 0
        || reinit_main_id < 0
        || reinit_xidregistry < 0)
    {
        return _PyStatus_ERR("Failed to reinitialize runtime locks");

    }
    return _PyStatus_OK();
}
#endif


But the PyInterpreterState_Delete function does not do this:

if (interp->id_mutex != NULL) {
    PyThread_free_lock(interp->id_mutex);
}


This causes it to try to free the lock using the debug allocator, and hence crash.
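
To make the mismatch concrete, here is a toy model in Python (my own sketch, not CPython's actual header layout): a "debug" wrapper that writes FORBIDDENBYTE padding around each block and verifies it on free will reject a block handed out by a plain allocator, which is essentially the failure reported above.

```
FORBIDDENBYTE = 0xFD  # same marker value the pymalloc debug hooks use

def debug_alloc(n):
    # Toy layout: 8-byte size field, 7 pad bytes, payload, 8 tail pad bytes.
    return (n.to_bytes(8, "big")
            + bytes([FORBIDDENBYTE]) * 7
            + bytes(n)
            + bytes([FORBIDDENBYTE]) * 8)

def raw_alloc(n):
    # Plain allocation: payload only, no debug header or tail.
    return bytes(n)

def debug_free(block):
    n = int.from_bytes(block[:8], "big")
    head = block[8:15]
    tail = block[15 + n:15 + n + 8]
    if set(head) != {FORBIDDENBYTE} or set(tail) != {FORBIDDENBYTE}:
        raise RuntimeError("pad bytes are not all FORBIDDENBYTE: "
                           "corrupted memory or a foreign allocation")

debug_free(debug_alloc(16))  # OK: allocated and checked by the same wrapper
debug_free(raw_alloc(16))    # raises: the raw block has no debug padding
```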
msg406492 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-11-17 20:51
New changeset 736684b1bb67369a2e95a9f621752deead44e7ef by Sam Gross in branch 'main':
bpo-42540: reallocation of id_mutex should not force default allocator (GH-29564)
https://github.com/python/cpython/commit/736684b1bb67369a2e95a9f621752deead44e7ef
msg406493 - (view) Author: miss-islington (miss-islington) Date: 2021-11-17 21:16
New changeset 1079b3e3cb3eba7062e174ecc6c0ab20c2d0722e by Miss Islington (bot) in branch '3.10':
bpo-42540: reallocation of id_mutex should not force default allocator (GH-29564)
https://github.com/python/cpython/commit/1079b3e3cb3eba7062e174ecc6c0ab20c2d0722e
msg406503 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-11-17 22:54
New changeset 87787c8774221c81602b31b0e0dc0678ad494e91 by Sam Gross in branch '3.9':
[3.9] bpo-42540: reallocation of id_mutex should not force default allocator (GH-29564) (GH-29600)
https://github.com/python/cpython/commit/87787c8774221c81602b31b0e0dc0678ad494e91
msg406505 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-11-17 23:01
Thanks, Sam! ✨ 🍰 ✨
msg406533 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-11-18 12:42
Do you think that it would be worth it to dump the memory allocation when a Fatal Python error related to a memory error is triggered?
History
Date User Action Args
2022-04-11 14:59:38    admin           set      github: 86706
2021-11-18 12:42:27    vstinner        set      messages: + msg406533
2021-11-17 23:01:52    lukasz.langa    set      status: open -> closed; resolution: fixed; messages: + msg406505; stage: patch review -> resolved
2021-11-17 22:54:13    lukasz.langa    set      nosy: + lukasz.langa; messages: + msg406503
2021-11-17 21:16:10    miss-islington  set      messages: + msg406493
2021-11-17 21:04:54    colesbury       set      pull_requests: + pull_request27843
2021-11-17 20:51:46    vstinner        set      nosy: + vstinner; messages: + msg406492
2021-11-17 20:51:16    miss-islington  set      nosy: + miss-islington; pull_requests: + pull_request27842
2021-11-15 20:33:05    colesbury       set      keywords: + patch; nosy: + colesbury; pull_requests: + pull_request27812; stage: patch review
2020-12-06 23:01:11    stestagg        set      messages: + msg382614
2020-12-06 22:34:49    stestagg        set      messages: + msg382613
2020-12-06 22:11:00    stestagg        set      nosy: + stestagg; messages: + msg382612
2020-12-02 14:22:16    CendioOssman    create