msg191076 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-13 10:52 |
The issue #3329 proposes an API to replace memory allocator functions. But Python calls directly malloc(), realloc() and free() in some functions, so custom allocators would not be used there.
Examples of functions calling malloc/realloc/free directly: _PySequence_BytesToCharpArray(), block_new() (of pyarena.c), find_key() (of thread.c), PyInterpreterState_New(), win32_wchdir(), posix_getcwd(), Py_Main(), etc.
We have to be careful with the GIL: PyMem_*() functions can only be called when holding the GIL.
|
msg191078 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-06-13 11:14 |
Be aware about external code which allocate memory itself (i.e. expat).
|
msg191080 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-13 11:32 |
> Be aware about external code which allocate memory itself (i.e. expat).
Well, I know that it will hard to cover 100% of the stdlib. I just want to replace the most obvious calls.
Some libraries can be configured to use a custom memory allocators:
- zlib: zalloc and zfree, http://www.zlib.net/manual.html#Usage
- bz2
- lzma: LzmaEnc_Create parameter
- OpenSSL: CRYPTO_set_mem_functions
- etc.
We should probably uses these functions to reuse Python allocators (PyMem_Malloc()), or maybe only if PyMem_SetAllocators() has been called? (if Python does not use system allocators, aka malloc)
See also #18178 for libffi, it may be related.
The _decimal module configures libmpdec to use PyMem_Malloc:
#define _Py_DEC_MINALLOC 4
mpd_mallocfunc = PyMem_Malloc;
mpd_reallocfunc = PyMem_Realloc;
mpd_callocfunc = mpd_callocfunc_em;
mpd_free = PyMem_Free;
mpd_setminalloc(_Py_DEC_MINALLOC);
|
msg191082 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2013-06-13 12:15 |
expat has a XML_Memory_Handling_Suite. You just have to replace XML_ParserCreate() and XML_ParserCreateNS with XML_ParserCreate_MM().
|
msg191086 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2013-06-13 17:27 |
> We have to be careful with the GIL: PyMem_*() functions can only be
> called when holding the GIL.
> Some libraries can be configured to use a custom memory allocators:
> [...]
> We should probably uses these functions to reuse Python allocators
> (PyMem_Malloc())
I think there's a potential problem here :)
|
msg191093 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-13 22:07 |
>> We have to be careful with the GIL: PyMem_*() functions can only be
>> called when holding the GIL.
> (...)
> I think there's a potential problem here :)
I didn't understand the motivation to require the GIL held for PyMem_Malloc(). I searched in the source code history and on the Internet (archives of python-dev). In my opinion, the restiction is motivated by a bug: PyMem_Malloc() calls (indirectly) PyObject_Malloc() in debug mode, and PyObject_Malloc() is not thread-safe.
I opened a thread on python-dev to discuss this point.
|
msg191111 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2013-06-14 09:12 |
Keeping the GIL requirement is _very_ useful for PyMem_MALLOC et al. It allows applications to hook in their own monitoring code, accessible from python itself, without having to worry about conflicts with python.
even if it were not for the GIL itself, PyMem_Malloc() may have all sorts of side effects.
Because of this, and to allow ourselves the flexibility to do all sorts of things inside PyMem_Malloc(), at CCP we added a parallel api, PyMem_MALLOC_RAW() etc.
This api is guaranteed to delegate directly to the external allocator (malloc by default, or an embedding application's supplied allocastor)
We have patched pythoncore in 2.7 in all places there were using malloc directly using the file attached to the defect. Notice how it can patch "malloc" in two different ways, using either regular malloc (in non-sensitive areas) and using the raw malloc (in sensitive areas.)
e.g. thread.c contains the following lines in our branch:
#include "Python.h"
/* patch malloc/free with threadsafe python versions */
#define CCPMEM_PATCH_RAW
#include "ccpmem_patch.h"
|
msg191169 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-14 23:05 |
I commited my new API to customize memory allocators:
New changeset 6661a8154eb3 by Victor Stinner in branch 'default':
Issue #3329: Add new APIs to customize memory allocators
http://hg.python.org/cpython/rev/6661a8154eb3
I added PyMem_RawMalloc(), PyMem_RawRealloc() and PyMem_RawFree() in the same commit. These functions are wrappers to malloc/realloc/free which can be called without the GIL held. Using these new functions instead of malloc/realloc/free is interesting because the internal functions can be replaced with PyMem_SetRawAllocators() and many checks are added in debug mode (ex: check for buffer under- and overflow).
|
msg191174 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 00:07 |
pymem_debugcheckgil.patch: Patch adding check in debug mode to ensure that PyMem_Malloc() and PyObject_Malloc() are called with the GIL held.
|
msg191175 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 00:10 |
py_finalize.patch: modify Py_Finalize() to destroy the GIL after the last call to PyMem_Malloc() or PyObject_Malloc(). For example,
PyCFunction_Fini() calls PyObject_GC_Del() which calls PyObject_FREE().
|
msg191176 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 00:13 |
pymem_debugcheckgil.patch: oops, a forgot a change in pgenmain.c. New fixed patch attached.
|
msg191179 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 00:36 |
See also issue #16742: "PyOS_Readline drops GIL and calls PyOS_StdioReadline, which isn't thread safe".
|
msg191180 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 00:44 |
malloc_init.patch: Patch for functions called at Python initialization.
This patch is not complete: to parse "-X" option (PySys_AddXOption) and "-W" (PySys_AddWarnOption), PyMem_Malloc() and PyObject_Malloc() are still called indirectly. Fixing this issue may need to reorganize completly how Python is initialized, because we need to have basic Python types (ex: Unicode) to be ready to be able to parse the command line. But we cannot initialize too much because the Python initialization also depends on options from the command line...
|
msg191182 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 01:19 |
malloc_modules.patch: Patch for different Python modules.
|
msg191183 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-06-15 01:35 |
Ok, initial patches are attached. Let describe them a little bit.
- pymem_debugcheckgil-2.patch: I don't think that this patch can be commited before the "bootstrap" issue is solved (Python doesn't start in debug mode with this patch when -W or -X command line option is used). I wrote it to check that other patches are correct (check if the GIL is held when PyMem_Malloc is called).
- py_finalize.patch: should be safe
- malloc_init.patch: change _Py_char2wchar() API, the result must now be freed by PyMem_RawFree() instead of PyMem_Free(). the change does not hurt in release mode (both functions are just wrapper to free()), but may break in debug mode (because python checks that PyMem_RawFree() is called on a buffer allocated by PyMem_RawMalloc()) in extension modules using _Py_char2wchar(). I would like to _Py_char2wchar() API (not to solve this issue, just because it is very useful for applications embedding Python), so changing its name would avoid a crash (applications would get a compilation or link error instead). Other change of the patch: replace free() and PyMem_Free() with PyMem_RawFree(), should be safe.
- malloc_modules.patch: I replaced many malloc() with PyMem_Malloc(), which is not safe.
All these patches must be reviewed carefully to check if the GIL is held or not, and tested with pymem_debugcheckgil-2.patch (on Windows too! some patched functions os posixmodule.c are only compiled on Windows).
|
msg191195 - (view) |
Author: Alyssa Coghlan (ncoghlan) *  |
Date: 2013-06-15 04:06 |
Note that CPython's main function accesses the C API before calling Py_Initialize(). This is insane, but fixing it is hard (and one of the main goals of PEP 432).
I suggest using Py_IsInitialized() to exclude those from your debug checks for now.
|
msg192559 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 14:26 |
New changeset 213d6d7f5979 by Victor Stinner in branch 'default':
Issue #18203: Fix Py_Finalize(): destroy the GIL after the last call to
http://hg.python.org/cpython/rev/213d6d7f5979
New changeset 18bb92b0c458 by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_RawMalloc() at Python initialization
http://hg.python.org/cpython/rev/18bb92b0c458
New changeset 41ef797e6639 by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_Malloc() in Python modules
http://hg.python.org/cpython/rev/41ef797e6639
|
msg192563 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 14:36 |
New changeset fc01f9497da7 by Victor Stinner in branch 'default':
Issue #18203: Fix decode_ascii_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar()
http://hg.python.org/cpython/rev/fc01f9497da7
|
msg192572 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 15:26 |
New changeset 638d43665356 by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_Malloc() in _ssl for the password
http://hg.python.org/cpython/rev/638d43665356
New changeset 9af1905f20af by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_RawMalloc() to allocate thread locks
http://hg.python.org/cpython/rev/9af1905f20af
New changeset fb7d346b45fa by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_Malloc() to allocate arena objects
http://hg.python.org/cpython/rev/fb7d346b45fa
New changeset 10db0c67fc72 by Victor Stinner in branch 'default':
Issue #18203: Replace malloc() with PyMem_Malloc() in _PySequence_BytesToCharpArray()
http://hg.python.org/cpython/rev/10db0c67fc72
|
msg192591 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 20:58 |
New changeset 31a635303e55 by Victor Stinner in branch 'default':
Issue #18203: Fix _Py_DecodeUTF8_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar()
http://hg.python.org/cpython/rev/31a635303e55
|
msg192596 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-07 21:32 |
New changeset 65f2c92ed079 by Victor Stinner in branch 'default':
Issue #18203: Add _PyMem_RawStrdup() and _PyMem_Strdup()
http://hg.python.org/cpython/rev/65f2c92ed079
|
msg192597 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2013-07-07 21:33 |
Ok, with the last changeset, I consider the issue has fixed. There is no more direct call to malloc() in the Python source code.
|
msg193766 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-07-27 00:39 |
New changeset 1cba6687993e by Victor Stinner in branch 'default':
Issue #18203: Replace PyMem_Malloc() with PyMem_RawMalloc() at Python initialization
http://hg.python.org/cpython/rev/1cba6687993e
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:46 | admin | set | github: 62403 |
2013-07-27 00:39:51 | python-dev | set | messages:
+ msg193766 |
2013-07-07 21:33:47 | vstinner | set | status: open -> closed resolution: fixed messages:
+ msg192597
|
2013-07-07 21:32:57 | python-dev | set | messages:
+ msg192596 |
2013-07-07 20:58:06 | python-dev | set | messages:
+ msg192591 |
2013-07-07 15:26:32 | python-dev | set | messages:
+ msg192572 |
2013-07-07 14:36:07 | python-dev | set | messages:
+ msg192563 |
2013-07-07 14:26:03 | python-dev | set | nosy:
+ python-dev messages:
+ msg192559
|
2013-06-17 04:40:03 | trent | set | nosy:
+ trent
|
2013-06-16 07:32:26 | aliles | set | nosy:
+ aliles
|
2013-06-15 23:28:38 | vstinner | set | title: Replace calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc() -> Replace direct calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc() |
2013-06-15 23:28:04 | vstinner | set | title: Replace calls to malloc() with PyMem_Malloc() -> Replace calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc() |
2013-06-15 04:06:47 | ncoghlan | set | messages:
+ msg191195 |
2013-06-15 01:35:21 | vstinner | set | messages:
+ msg191183 |
2013-06-15 01:19:49 | vstinner | set | files:
+ malloc_modules.patch
messages:
+ msg191182 |
2013-06-15 00:44:07 | vstinner | set | files:
+ malloc_init.patch
messages:
+ msg191180 |
2013-06-15 00:36:28 | vstinner | set | messages:
+ msg191179 |
2013-06-15 00:13:03 | vstinner | set | files:
+ pymem_debugcheckgil-2.patch
messages:
+ msg191176 |
2013-06-15 00:10:11 | vstinner | set | files:
+ py_finalize.patch
messages:
+ msg191175 |
2013-06-15 00:07:37 | vstinner | set | files:
+ pymem_debugcheckgil.patch keywords:
+ patch messages:
+ msg191174
|
2013-06-14 23:05:45 | vstinner | set | messages:
+ msg191169 |
2013-06-14 09:12:30 | kristjan.jonsson | set | files:
+ ccpmem_patch.h
messages:
+ msg191111 |
2013-06-13 22:07:03 | vstinner | set | messages:
+ msg191093 |
2013-06-13 17:27:30 | georg.brandl | set | nosy:
+ georg.brandl messages:
+ msg191086
|
2013-06-13 12:15:18 | christian.heimes | set | nosy:
+ christian.heimes messages:
+ msg191082
|
2013-06-13 11:32:02 | vstinner | set | messages:
+ msg191080 |
2013-06-13 11:14:12 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka
messages:
+ msg191078 stage: needs patch |
2013-06-13 10:52:39 | vstinner | create | |