New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace direct calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc() #62403
Comments
The issue bpo-3329 proposes an API to replace memory allocator functions. But Python calls directly malloc(), realloc() and free() in some functions, so custom allocators would not be used there. Examples of functions calling malloc/realloc/free directly: _PySequence_BytesToCharpArray(), block_new() (of pyarena.c), find_key() (of thread.c), PyInterpreterState_New(), win32_wchdir(), posix_getcwd(), Py_Main(), etc. We have to be careful with the GIL: PyMem_*() functions can only be called when holding the GIL. |
Be aware about external code which allocate memory itself (i.e. expat). |
Well, I know that it will hard to cover 100% of the stdlib. I just want to replace the most obvious calls. Some libraries can be configured to use a custom memory allocators:
We should probably uses these functions to reuse Python allocators (PyMem_Malloc()), or maybe only if PyMem_SetAllocators() has been called? (if Python does not use system allocators, aka malloc) See also bpo-18178 for libffi, it may be related. The _decimal module configures libmpdec to use PyMem_Malloc: #define _Py_DEC_MINALLOC 4 mpd_mallocfunc = PyMem_Malloc;
mpd_reallocfunc = PyMem_Realloc;
mpd_callocfunc = mpd_callocfunc_em;
mpd_free = PyMem_Free;
mpd_setminalloc(_Py_DEC_MINALLOC); |
expat has a XML_Memory_Handling_Suite. You just have to replace XML_ParserCreate() and XML_ParserCreateNS with XML_ParserCreate_MM(). |
I think there's a potential problem here :) |
I didn't understand the motivation to require the GIL held for PyMem_Malloc(). I searched in the source code history and on the Internet (archives of python-dev). In my opinion, the restiction is motivated by a bug: PyMem_Malloc() calls (indirectly) PyObject_Malloc() in debug mode, and PyObject_Malloc() is not thread-safe. I opened a thread on python-dev to discuss this point. |
Keeping the GIL requirement is _very_ useful for PyMem_MALLOC et al. It allows applications to hook in their own monitoring code, accessible from python itself, without having to worry about conflicts with python. Because of this, and to allow ourselves the flexibility to do all sorts of things inside PyMem_Malloc(), at CCP we added a parallel api, PyMem_MALLOC_RAW() etc. We have patched pythoncore in 2.7 in all places there were using malloc directly using the file attached to the defect. Notice how it can patch "malloc" in two different ways, using either regular malloc (in non-sensitive areas) and using the raw malloc (in sensitive areas.) e.g. thread.c contains the following lines in our branch:
#include "Python.h"
/* patch malloc/free with threadsafe python versions */
#define CCPMEM_PATCH_RAW
#include "ccpmem_patch.h" |
I commited my new API to customize memory allocators: New changeset 6661a8154eb3 by Victor Stinner in branch 'default': I added PyMem_RawMalloc(), PyMem_RawRealloc() and PyMem_RawFree() in the same commit. These functions are wrappers to malloc/realloc/free which can be called without the GIL held. Using these new functions instead of malloc/realloc/free is interesting because the internal functions can be replaced with PyMem_SetRawAllocators() and many checks are added in debug mode (ex: check for buffer under- and overflow). |
pymem_debugcheckgil.patch: Patch adding check in debug mode to ensure that PyMem_Malloc() and PyObject_Malloc() are called with the GIL held. |
py_finalize.patch: modify Py_Finalize() to destroy the GIL after the last call to PyMem_Malloc() or PyObject_Malloc(). For example, |
pymem_debugcheckgil.patch: oops, a forgot a change in pgenmain.c. New fixed patch attached. |
See also issue bpo-16742: "PyOS_Readline drops GIL and calls PyOS_StdioReadline, which isn't thread safe". |
malloc_init.patch: Patch for functions called at Python initialization. This patch is not complete: to parse "-X" option (PySys_AddXOption) and "-W" (PySys_AddWarnOption), PyMem_Malloc() and PyObject_Malloc() are still called indirectly. Fixing this issue may need to reorganize completly how Python is initialized, because we need to have basic Python types (ex: Unicode) to be ready to be able to parse the command line. But we cannot initialize too much because the Python initialization also depends on options from the command line... |
malloc_modules.patch: Patch for different Python modules. |
Ok, initial patches are attached. Let describe them a little bit.
All these patches must be reviewed carefully to check if the GIL is held or not, and tested with pymem_debugcheckgil-2.patch (on Windows too! some patched functions os posixmodule.c are only compiled on Windows). |
Note that CPython's main function accesses the C API before calling Py_Initialize(). This is insane, but fixing it is hard (and one of the main goals of PEP-432). I suggest using Py_IsInitialized() to exclude those from your debug checks for now. |
New changeset 213d6d7f5979 by Victor Stinner in branch 'default': New changeset 18bb92b0c458 by Victor Stinner in branch 'default': New changeset 41ef797e6639 by Victor Stinner in branch 'default': |
New changeset fc01f9497da7 by Victor Stinner in branch 'default': |
New changeset 638d43665356 by Victor Stinner in branch 'default': New changeset 9af1905f20af by Victor Stinner in branch 'default': New changeset fb7d346b45fa by Victor Stinner in branch 'default': New changeset 10db0c67fc72 by Victor Stinner in branch 'default': |
New changeset 31a635303e55 by Victor Stinner in branch 'default': |
New changeset 65f2c92ed079 by Victor Stinner in branch 'default': |
Ok, with the last changeset, I consider the issue has fixed. There is no more direct call to malloc() in the Python source code. |
New changeset 1cba6687993e by Victor Stinner in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: