Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not all memory allocated by _Py_Quicken() is released at Python exit #90634

Closed
vstinner opened this issue Jan 23, 2022 · 21 comments
Closed

Not all memory allocated by _Py_Quicken() is released at Python exit #90634

vstinner opened this issue Jan 23, 2022 · 21 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Copy link
Member

BPO 46476
Nosy @gvanrossum, @vstinner, @tiran, @markshannon, @ericsnowcurrently, @corona10, @kumaraditya303
PRs
  • bpo-46476: Fix memory leak in code objects generated by deepfreeze  #30853
  • bpo-46476: Simplify and fix _PyStaticCode_Dealloc #30965
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-01-27.13:13:54.447>
    created_at = <Date 2022-01-23.00:29:31.933>
    labels = ['interpreter-core', '3.11']
    title = 'Not all memory allocated by _Py_Quicken() is released at Python exit'
    updated_at = <Date 2022-01-27.19:32:28.772>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-01-27.19:32:28.772>
    actor = 'Mark.Shannon'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-01-27.13:13:54.447>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2022-01-23.00:29:31.933>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46476
    keywords = ['patch']
    message_count = 21.0
    messages = ['411315', '411321', '411323', '411326', '411328', '411353', '411355', '411356', '411372', '411375', '411380', '411381', '411421', '411445', '411473', '411475', '411481', '411556', '411864', '411866', '411913']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'vstinner', 'christian.heimes', 'Mark.Shannon', 'eric.snow', 'corona10', 'kumaraditya']
    pr_nums = ['30853', '30965']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46476'
    versions = ['Python 3.11']

    @vstinner
    Copy link
    Member Author

    Python leaks around 66 memory blocks at exit. Most of them are allocated by _Py_Quicken() of Python/specialize.c.

    int
    _Py_Quicken(PyCodeObject *code) {
        ...
        SpecializedCacheOrInstruction *quickened = allocate(entry_count, instr_count);
        ...
        code->co_quickened = quickened;
        ...
        return 0;
    }

    The memory is stored in PyCodeObject.co_quickened member. This member *is* cleared by the code object deallocator function, code_dealloc():

    static void
    code_dealloc(PyCodeObject *co)
    {
        ...
        if (co->co_quickened) {
            PyMem_Free(co->co_quickened);
            _Py_QuickenedCount--;
        }
        ...
    }

    I read recently that deepfreeze creates "immortal" code objects (refcount of 999999999). I guess that it's related.

    I used Valgrind to look for memory leaked by Python at exit:

    $ PYTHONMALLOC=malloc valgrind --show-leak-kinds=all --leak-check=full --log-file=valgrind.log --num-callers=50 ./python -c pass

    Extract of valgrind.log:

    ==1266888== 5,528 bytes in 13 blocks are still reachable in loss record 33 of 33
    ==1266888== at 0x484186F: malloc (vg_replace_malloc.c:381)
    ==1266888== by 0x544DBC: _PyMem_RawMalloc (obmalloc.c:100)
    ==1266888== by 0x545B13: PyMem_Malloc (obmalloc.c:618)
    ==1266888== by 0x6AAE6F: allocate (specialize.c:231)
    ==1266888== by 0x6AB3F0: _Py_Quicken (specialize.c:420)
    ==1266888== by 0x622CE7: _Py_IncrementCountAndMaybeQuicken (pycore_code.h:152)
    ==1266888== by 0x626315: _PyEval_EvalFrameDefault (ceval.c:1792)
    ==1266888== by 0x622B13: _PyEval_EvalFrame (pycore_ceval.h:53)
    ==1266888== ...
    ==1266888==
    ==1266888== LEAK SUMMARY:
    ==1266888== definitely lost: 0 bytes in 0 blocks
    ==1266888== indirectly lost: 0 bytes in 0 blocks
    ==1266888== possibly lost: 0 bytes in 0 blocks
    ==1266888== still reachable: 24,240 bytes in 66 blocks
    ==1266888== suppressed: 0 bytes in 0 blocks

    See also bpo-46449: "Deep-freezed modules create inconsistency in sys.gettotalrefcount() (_Py_Reftotal)".

    @vstinner vstinner added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jan 23, 2022
    @vstinner vstinner changed the title Not all memory allocated by _Py_Quicken() is not released at Python exit Not all memory allocated by _Py_Quicken() is released at Python exit Jan 23, 2022
    @vstinner vstinner changed the title Not all memory allocated by _Py_Quicken() is not released at Python exit Not all memory allocated by _Py_Quicken() is released at Python exit Jan 23, 2022
    @vstinner
    Copy link
    Member Author

    Patch to disable _Py_Quicken(), to help me debugging other memory leaks at Python exit:

    diff --git a/Include/internal/pycore_code.h b/Include/internal/pycore_code.h
    index dfc75300315..f9cdefed2a2 100644
    --- a/Include/internal/pycore_code.h
    +++ b/Include/internal/pycore_code.h
    @@ -146,12 +146,14 @@ int _Py_Quicken(PyCodeObject *code);
     static inline int
     _Py_IncrementCountAndMaybeQuicken(PyCodeObject *code)
     {
    +#if 0
         if (code->co_warmup != 0) {
             code->co_warmup++;
             if (code->co_warmup == 0) {
                 return _Py_Quicken(code) ? -1 : 1;
             }
         }
    +#endif
         return 0;
     }

    @gvanrossum
    Copy link
    Member

    If any of the immortal, deep-frozen code objects is ever quickened, I suppose the quickening data is never freed. But when we finalize and reinitialize, the co_quickened flag should remain set, so this would be a one-time leak.

    The question is whether the quickening cache points to any objects that *are* freed. If it does, that could be bad. If it doesn't, then all we lose is a fixed amount of memory (no further leaks if we finalize and initialize the runtime repeatedly).

    However, if my theory holds, why would valgrind consider the memory leaked? (TBH I don't know what valgrind does, so maybe that's not the right question.)

    @vstinner
    Copy link
    Member Author

    Maybe there should be a way to traverse all immortal code objects at Python exit and clear their PyCodeObject.co_quickened member (and maybe some other members).

    Would it be possible to enhance deepfreeze be produce a list of all (immortal) code objects?

    I did something similar for static types with _PyStaticType_Dealloc() in bpo-46417, but it's easy since the list of static type is known in advance and it's short (100 to 200 types).

    --

    I'm working on fixing the very old bug bpo-1635741: Python must release all memory that it called in Py_Finalize(). It matters when Python is embedded in an application. It makes sense to call Py_Initialize()/Py_Finalize() multiple times in such use case. Python should not leak any memory.

    With my PR 30815 fix + my msg411321 workaround, "./python -I -X showrefcount -c pass" now says that Python leaks exactly *zero* memory block: so bpo-1635741 is basically fixed, for the simplest Python command ("pass"). Obviously, I expect that more work is needed for more complex workload, since there are other static types which are still not cleared at Python exit, and more C extension modules which are not ported to the multi-phase initialization API (PEP-489).

    Valgrind is just one way to see the issue.

    @gvanrossum
    Copy link
    Member

    Would it be possible to enhance deepfreeze be produce a list of all (immortal) code objects?

    That should be simple. We could see if Kumar is interested.

    @kumaraditya303
    Copy link
    Contributor

    Would it be possible to enhance deepfreeze be produce a list of all (immortal) code objects?

    It is tricky because the deepfreeze modules are generated by the bootstrap interpreter in Linux/MacOS and the downloaded python from nuget interpreter on Windows so when the bootstrap interpreter is built there will be no list of code objects to begin with so it won't work as intended.
    But I have an idea : If we can #define Py_DEEPFROZEN_MODULES in the final interpreter but not in the bootstrap one, and then in pylifecycle.c an extern function can free up deep-frozen modules memory if Py_DEEPFROZEN_MODULES which will be defined in deepfreeze.c then it should work.

    @kumaraditya303
    Copy link
    Contributor

    Clearing co_quickened is easy, but it would requires changes to the build system to change the build order on Windows and bootstrap interpreter on Windows. I manually edited the Windows project files but it requires generating deepfreeze.c before hand. It cleared around ~60 memory blocks on Windows.

    See branch https://github.com/kumaraditya303/cpython/commits/fix-code

    @kumaraditya303
    Copy link
    Contributor

    bootstrap interpreter on Windows => bootstrap interpreter on Unix

    @vstinner
    Copy link
    Member Author

    See branch https://github.com/kumaraditya303/cpython/commits/fix-code

    Oh nice, I like this new _Py_Deepfreeze_Fini() function :-) I suggest to create a function specific to only clear immortal code objects. In my experience, it's important to control exactly when objects are cleared at Python exit: Py_Finalize() is complex and fragile. See my notes:
    https://pythondev.readthedocs.io/finalization.html

    Be careful, Python must remain usable after Py_Finalize(): it's legit to call Py_Initialize() again and execute new Python code. Example executing the same code 4 times, each time Py_Initialize() and Py_Finalize() are called:

    ./Programs/_testembed test_repeated_init_exec 'print("Hello")'

    My _PyStaticMethod_Dealloc() implementation uses Py_CLEAR() rather than Py_XDECREF() to set structure members to NULL.

    Moreover, there are more things than just co_quickened which should be cleared. I suggest to add a new function to clear an "immortal" code object. For example, I also suggest to call PyObject_ClearWeakRefs(). I guess that co_extra should also be cleared.

    @kumaraditya303
    Copy link
    Contributor

    FYI, I updated the build files and got it working on Windows and clears around ~60 memory blocks. See the latest commit in branch.

    @kumaraditya303
    Copy link
    Contributor

    Be careful, Python must remain usable after Py_Finalize(): it's legit to call Py_Initialize() again and execute new Python code. Example executing the same code 4 times, each time Py_Initialize() and Py_Finalize() are called:

    ./Programs/_testembed test_repeated_init_exec 'print("Hello")'

    With the current design, it isn't possible though because the code objects are modified in place so if co_quickened is freed the VM still tries to execute the copied instructions.

    See

    memcpy(new_instructions, code->co_firstinstr, size);

    @vstinner
    Copy link
    Member Author

    Aha. Maybe for now, the memory of immortal code objects can be deallocated in pymain_free(). It's the last function called before Python exit the process.

    @gvanrossum
    Copy link
    Member

    With the current design, it isn't possible though because the code objects are modified in place so if co_quickened is freed the VM still tries to execute the copied instructions.

    Or the cleanup code could also restore co_firstinstr and other things that are set by quickened (the co_quickened flag and what else?).

    Kumar, I'm not sure I follow your concerns about the bootstrap working differently on Windows than on Unix. Is the problem that on Unix the bootstrap interpreter is linked without deepfreeze.c so there is no definition of the symbol _Py_Deepfreeze_Fini? In that case, you can probably just add a dummy one to _bootstrap_python.c.

    @kumaraditya303
    Copy link
    Contributor

    Kumar, I'm not sure I follow your concerns about the bootstrap working differently on Windows than on Unix. Is the problem that on Unix the bootstrap interpreter is linked without deepfreeze.c so there is no definition of the symbol _Py_Deepfreeze_Fini? In that case, you can probably just add a dummy one to _bootstrap_python.c.

    Yes, that comment was outdated as I didn't knew that _bootstrap_python.c and _freeze_module.c are different executables on Linux. In the latest commit it is fixed and there are 0 memory blocks left on Linux and 131 on Windows. The next thing to be done is solving how to restore the code objects before it was quickened.

    @kumaraditya303
    Copy link
    Contributor

    Aha. Maybe for now, the memory of immortal code objects can be deallocated in pymain_free(). It's the last function called before Python exit the process.

    It seems like a viable option for now to only clear them when python is not embedded and in pymain_free.

    @vstinner
    Copy link
    Member Author

    Releasing memory in pymain_free() is a good start! At least it fix the issue for the "python" command. So people running a memory debugger like Valgrind will no longer see any leak at "python" exit.

    Kumar: Do you want to work on a PR for that?

    @kumaraditya303
    Copy link
    Contributor

    Created PR #30853

    @kumaraditya303
    Copy link
    Contributor

    In the end I was able to restore the codeobjects before they were quickened so it works for both embedded and python command now. See the latest commits and it is also ready for review now.

    @vstinner
    Copy link
    Member Author

    New changeset c7f810b by Kumar Aditya in branch 'main':
    bpo-46476: Fix memory leak in code objects generated by deepfreeze (GH-30853)
    c7f810b

    @vstinner
    Copy link
    Member Author

    Thanks Kumar! Your change fixed my issue:

    $ ./python -I -X showrefcount -c pass
    [-5 refs, 0 blocks]

    Python no longer leaks memory at exit: it "leaks" exactly *zero* memory block.

    The negative reference count is likely caused by bpo-46449.

    @markshannon
    Copy link
    Member

    New changeset 26b0482 by Christian Heimes in branch 'main':
    bpo-46476: Simplify and fix _PyStaticCode_Dealloc (GH-30965)
    26b0482

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants